Abstract
Diabetes is a chronic health condition that affects millions of people globally. Early and accurate prediction of the disease is critical for prevention and effective management. Machine learning models have emerged as promising tools for this task; however, performance varies across algorithms, so a thorough evaluation is needed to identify the most effective ones. The main objective of this study was to assess several machine learning models using different performance metrics and to identify the most robust and consistent approaches to diabetes prediction. Nine machine learning models were evaluated on the Pima Indian dataset, with class balancing performed via the Synthetic Minority Over-sampling Technique (SMOTE) and performance assessed through cross-validation and on test data. Among the models, Random Forest and AdaBoost produced the most robust and consistent results across key metrics such as AUC-ROC and AUPRC. These findings highlight their potential use in clinical decision support systems for early risk detection and improved patient management. In conclusion, the study emphasizes the importance of using several evaluation metrics to gain a thorough understanding of how machine learning models perform in predicting diabetes.
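To illustrate why reporting more than one metric matters on imbalanced data, the following minimal sketch computes AUC-ROC and AUPRC (approximated here as average precision) for a small made-up sample. The labels, scores, and function names are illustrative assumptions; they are not the study's data, models, or code.

```python
def roc_auc(y_true, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs in which
    the positive is scored higher; tied pairs count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average precision: mean of the precision values measured at the
    rank of each positive. Assumes scores contain no ties."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical imbalanced sample: 2 positives among 8 cases.
y_true = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

print(f"AUC-ROC: {roc_auc(y_true, scores):.3f}")            # 0.917
print(f"AUPRC:   {average_precision(y_true, scores):.3f}")  # 0.833
```

On this toy sample a single negative ranked above a positive lowers AUPRC more than AUC-ROC, which is one reason the two metrics are often read together when the positive class is rare.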
| Original language | American English |
|---|---|
| Pages (from-to) | 1795-1803 |
| Number of pages | 9 |
| Journal | Ingenierie des Systemes d'Information |
| Volume | 30 |
| Issue number | 7 |
| DOI | |
| Status | Indexed - 2025 |
Bibliographic note
Publisher Copyright: © 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license.