Content area

Abstract

Bayesian approaches to decision trees (DTs) using Markov Chain Monte Carlo (MCMC) samplers have recently demonstrated state-of-the-art accuracy performance when it comes to training DTs to solve classification problems. Despite the competitive classification accuracy, MCMC requires a potentially long runtime to converge. A widely used approach to reducing an algorithm’s runtime is to employ modern multi-core computer architectures, either with shared memory (SM) or distributed memory (DM), and use parallel computing to accelerate the algorithm. However, the inherent sequential nature of MCMC makes it unsuitable for parallel implementation unless the accuracy is sacrificed. This issue is particularly evident in DM architectures, which normally provide access to larger numbers of cores than SM. Sequential Monte Carlo (SMC) samplers are a parallel alternative to MCMC, which do not trade off accuracy for parallelism. However, the performance of SMC samplers in the context of DTs is underexplored, and the parallelization is complicated by the challenges in parallelizing its bottleneck, namely redistribution, especially on variable-size data types such as DTs. In this work, we study the problem of parallelizing SMC in the context of DTs both on SM and DM. On both memory architectures, we show that the proposed parallelization strategies achieve asymptotically optimal O(log2N) time complexity. Numerical results are presented for a 32-core SM machine and a 256-core DM cluster. For both computer architectures, the experimental results show that our approach has comparable or better accuracy than MCMC but runs up to 51 times faster on SM and 640 times faster on DM. In this paper, we share the GitHub link to the source code.

Details

1009240
Title
A Massively Parallel SMC Sampler for Decision Trees
Author
Drousiotis, Efthyvoulos 1   VIAFID ORCID Logo  ; Varsi, Alessandro 1   VIAFID ORCID Logo  ; Phillips, Alexander M 1   VIAFID ORCID Logo  ; Maskell, Simon 1   VIAFID ORCID Logo  ; Spirakis, Paul G 2   VIAFID ORCID Logo 

 Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool L69 3GJ, UK; [email protected] (A.M.P.); [email protected] (S.M.) 
 Department of Computer Science, University of Liverpool, Liverpool L69 3BX, UK 
Publication title
Algorithms; Basel
Volume
18
Issue
1
First page
14
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
19994893
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
 
 
Online publication date
2025-01-02
Milestone dates
2024-10-03 (Received); 2024-12-10 (Accepted)
Publication history
 
 
   First posting date
02 Jan 2025
ProQuest document ID
3159221655
Document URL
https://www.proquest.com/scholarly-journals/massively-parallel-smc-sampler-decision-trees/docview/3159221655/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-01-24
Database
ProQuest One Academic