Effective Predictive model for diabetes classification using optimized machine learning on imbalanced dataset


Date Published : 8 May 2026

Contributors

Dr. G. R. Ashisha

Karunya Institute of Technology and Sciences
Author

Sai Kiran

Author

Keywords

SMOTE; LightGBM; Machine Learning; Diabetes; Ensemble Technique

Track

Engineering and Sciences

License

Copyright (c) 2026 Sustainable Global Societies Initiative

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Abstract

Diabetes mellitus is a growing health problem that demands accurate, early detection. Recent research shows that machine learning techniques can assist medical practitioners in predicting the disease. However, real-world medical datasets commonly suffer from class imbalance and missing values, both of which can degrade prediction performance. This paper proposes an optimized machine learning framework that uses a class-balanced dataset generated with the Synthetic Minority Over-sampling Technique (SMOTE). The framework is evaluated on the Diabetes 130-US Hospitals dataset, with data preprocessing applied to improve data quality. Several machine learning algorithms were used, including CatBoost, XGBoost, LightGBM, and stacking. In the experiments, the ensemble-based techniques resulted in better classification performance.
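The class-balancing step named in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote_sketch`, its parameters, and the toy data are all hypothetical, and the abstract does not say which SMOTE implementation or settings were used. The sketch only shows the core idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by SMOTE-style interpolation.

    Hypothetical helper for illustration only; assumes purely numeric features.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate at a random position along the base-neighbour segment.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: 8 majority rows and 3 minority rows, two features each.
majority = [[float(i), float(i % 3)] for i in range(8)]
minority = [[10.0, 1.0], [11.0, 2.0], [12.0, 1.5]]

# Oversample the minority class until both classes have the same size.
new_rows = smote_sketch(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
```

In practice a maintained implementation such as `SMOTE` from the imbalanced-learn package would normally be preferred over a hand-written version, and oversampling is typically applied to the training split only so that synthetic rows do not leak into evaluation.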

How to Cite

Ashisha, G. R., & Oruganti, S. K. (2026). Effective Predictive model for diabetes classification using optimized machine learning on imbalanced dataset. Sustainable Global Societies Initiative, 1(4). https://vectmag.com/sgsi/paper/view/344