Interpretable Voice-Based Machine Learning Model for Early Detection of Parkinson’s Disease
Contributors
Swapnita Srivastava
Pawan Wing
Prof Dr Divya Midhun
Proceeding
Track
Engineering and Sciences
License
Copyright (c) 2026 Sustainable Global Societies Initiative

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract
Parkinson’s Disease (PD) is a progressive neurodegenerative disorder that severely impairs motor control and speech fluency. Early detection remains a clinical challenge because symptoms emerge subtly. This study presents an explainable machine learning framework for automated PD detection that uses voice-derived features, with Recursive Feature Elimination (RFE) for feature selection. The UCI Parkinson’s dataset, comprising 195 voice samples (147 PD, 48 healthy), was used to train and evaluate multiple classifiers: Random Forest, Gradient Boosting, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Logistic Regression. The models were assessed using accuracy, precision, recall, and F1-score, and hyperparameters were tuned to improve performance. Experimental results show that the proposed RFE-based ensemble framework achieved higher accuracy than the individual classifiers while maintaining interpretability through feature importance visualization. Prominent acoustic biomarkers such as jitter, shimmer, noise-to-harmonics ratio (NHR), and harmonics-to-noise ratio (HNR) emerged as critical predictors of PD, supporting their diagnostic relevance. The proposed explainable approach offers a robust, transparent, and clinically interpretable pathway for early Parkinson’s detection using non-invasive voice data.
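The workflow described in the abstract (RFE for feature selection, a classifier trained on the selected features, and evaluation by accuracy, precision, recall, and F1-score) could be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the authors' implementation: the data are synthetic stand-ins generated to resemble the dataset's size and class imbalance, and the feature count (22), the number of retained features (`N_SELECTED`), the 75/25 split, and the choice of Random Forest as both the RFE estimator and the final classifier are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the voice features (assumption: 195 samples,
# 22 features, roughly 75% positive class). Not real patient data.
X, y = make_classification(
    n_samples=195,
    n_features=22,
    n_informative=8,
    weights=[0.25, 0.75],
    random_state=0,
)

# Stratified split keeps the class imbalance similar in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

N_SELECTED = 10  # hypothetical number of features to keep

pipeline = Pipeline(
    steps=[
        ("scale", StandardScaler()),
        # RFE repeatedly fits the estimator and drops the weakest feature
        # until N_SELECTED features remain.
        ("rfe", RFE(
            estimator=RandomForestClassifier(n_estimators=100, random_state=0),
            n_features_to_select=N_SELECTED,
        )),
        # Final classifier trained only on the features RFE retained.
        ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ]
)
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

# The four metrics named in the abstract.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}

# Boolean mask of retained features, and their importances in the final model.
selected_mask = pipeline.named_steps["rfe"].support_
importances = pipeline.named_steps["clf"].feature_importances_

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print("selected feature indices:", [i for i, keep in enumerate(selected_mask) if keep])
```

Placing RFE inside the pipeline means feature selection is fitted on the training split only, which avoids leaking test information into the selection step. The `importances` array is the kind of quantity a feature importance plot would display; with the real dataset, the indices would map to named acoustic measures such as jitter and shimmer.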