© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Background: Parkinson’s disease (PD) is the second most common neurodegenerative disorder after Alzheimer’s disease, affecting millions of people worldwide. PD is characterized by marked motor symptoms accompanied by several non-motor manifestations. The clinical phase of the disease is usually preceded by a long prodromal phase that lacks overt motor symptoms but often involves conditions such as sleep disturbance, constipation, anosmia, and phonatory changes. Speech analysis currently appears to be a promising digital biomarker that may anticipate the onset of clinical PD by as much as 10 years, and it may also serve as a useful prognostic tool for patient follow-up. Voice analysis can therefore be proposed as a non-invasive method for distinguishing PD patients from healthy subjects (HS). Methods: We conducted a cross-sectional study of voice impairment. A dataset of 81 voice samples (41 from healthy individuals and 40 from PD patients) was used to train and evaluate common machine learning (ML) models on various types of features, including long-term features (jitter, shimmer, and cepstral peak prominence (CPP)), short-term features (Mel-frequency cepstral coefficients (MFCCs)), and non-standard measurements (pitch period entropy (PPE) and recurrence period density entropy (RPDE)). The ML algorithms included random forest (RF), K-nearest neighbors (KNN), decision tree (DT), naïve Bayes (NB), support vector machine (SVM), and logistic regression (LR). Cross-validation was applied to ensure the reliability of the performance metrics on the training and test subsets. These metrics (accuracy, recall, and precision) help determine the most effective models for distinguishing PD patients from healthy subjects. Results: Among all the algorithms evaluated, RF performed best, achieving an accuracy of 82.72% and a ROC-AUC score of 89.65%. Although other models could be considered, such as SVM with an accuracy of 75.29% and a ROC-AUC score of 82.63%, RF was by far the best when evaluated across all metrics. KNN and DT performed the worst. Notably, by combining a comprehensive set of long-term, short-term, and non-standard acoustic features, rather than only a subset as in most previous studies, our study achieved higher predictive performance, offering a more robust model for early PD detection. Conclusions: This study highlights the potential of combining advanced acoustic analysis with ML algorithms to develop non-invasive and reliable tools for early PD detection, offering substantial benefits for the healthcare sector.
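The evaluation described in the abstract (cross-validation on 81 samples, scored by accuracy, recall, and precision) can be illustrated with a minimal sketch in plain Python. The abstract does not state the number of folds, whether the folds were stratified, or how scores were aggregated, so the 5-fold stratified split below is an assumption. The function names `stratified_folds` and `binary_metrics` are hypothetical, and the toy labels passed to `binary_metrics` are invented for illustration only, not taken from the study.

```python
def stratified_folds(labels, k):
    """Assign sample indices to k folds while keeping class proportions similar.

    Assumption: the abstract does not specify the splitting scheme; a
    stratified round-robin split is used here purely as an illustration.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    folds = [[] for _ in range(k)]
    position = 0  # keep rotating across classes so fold sizes stay even
    for label in sorted(by_class):
        for idx in by_class[label]:
            folds[position % k].append(idx)
            position += 1
    return folds


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Class sizes from the abstract: 41 healthy subjects (0) and 40 PD patients (1).
labels = [0] * 41 + [1] * 40
folds = stratified_folds(labels, k=5)  # k=5 is an assumption
fold_sizes = [len(fold) for fold in folds]
pd_per_fold = [sum(labels[i] for i in fold) for fold in folds]

# Hypothetical toy predictions, used only to exercise the metric definitions.
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 1]
toy_scores = binary_metrics(toy_true, toy_pred)

print(fold_sizes, pd_per_fold)
print(toy_scores)
```

In a real evaluation, a model would be trained on four folds and scored on the fifth, rotating through all five. As a point of arithmetic only, 67 correct classifications out of 81 gives 67/81 ≈ 82.72%, which matches the accuracy reported for RF; the abstract does not say whether accuracy was pooled over all samples or averaged across folds.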

Details

Title
Prediction of Parkinson Disease Using Long-Term, Short-Term Acoustic Features Based on Machine Learning
Author
Rashidi, Mehdi 1; Arima, Serena 2; Stetco, Andrea Claudio 3; Coppola, Chiara 1; Musarò, Debora 4; Greco, Marco 4; Damato, Marina 4; My, Filomena 5; Lupo, Angela 5; Lorenzo, Marta 5; Danieli, Antonio 4; Maruccio, Giuseppe 6; Argentiero, Alberto 7; Buccoliero, Andrea 8; Donzella, Marcello Dorian 9; Maffia, Michele 4

1 Department of Mathematics and Physics “E. De Giorgi”, University of Salento, Via Lecce—Arnesano, 73100 Lecce, Italy; [email protected] (M.R.); [email protected] (C.C.); [email protected] (G.M.)
2 Department of Human and Social Sciences, University of Salento, 73100 Lecce, Italy; [email protected]
3 Department of Biological and Environmental Science and Technology, University of Salento, Via Lecce—Monteroni, 73100 Lecce, Italy; [email protected]
4 Department of Experimental Medicine, University of Salento, Via Lecce—Monteroni, 73100 Lecce, Italy; [email protected] (D.M.); [email protected] (M.G.); [email protected] (M.D.); [email protected] (A.D.)
5 Division of Neurology, Vito Fazzi Hospital, 73100 Lecce, Italy; [email protected] (F.M.); [email protected] (A.L.); [email protected] (M.L.)
6 Department of Mathematics and Physics “E. De Giorgi”, University of Salento, Via Lecce—Arnesano, 73100 Lecce, Italy; [email protected] (M.R.); [email protected] (C.C.); [email protected] (G.M.), Institute of Nanotechnology, CNR—Nanotec and INFN Sezione di Lecce, Via per Monteroni, 73100 Lecce, Italy
7 Department of Medicine and Surgery, University of Parma, 43121 Parma, Italy; [email protected]
8 Department of Research and Development (R&D), GPI S.p.A., 38123 Trento, [email protected] (M.D.D.), Human Science Department, University of Verona, 37129 Verona, Italy
9 Department of Research and Development (R&D), GPI S.p.A., 38123 Trento, [email protected] (M.D.D.)
First page
739
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
2076-3425
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3233099993