A hybrid model with feature selection and hyper parameters for detecting diabetes in PIMA Indian dataset


Açıkyürek C., Çınarer G.

3rd International Conference on Innovative Academic Studies, Konya, Türkiye, 26-28 September 2023, vol.1, no.1, pp.428-435

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number: 1
  • City of Publication: Konya
  • Country of Publication: Türkiye
  • Pages: pp.428-435
  • Affiliated with Yozgat Bozok University: Yes

Abstract

Diabetes is a prevalent global health concern, and timely detection of the disease plays a crucial role in treatment and prevention. Artificial Intelligence (AI) and Machine Learning (ML) algorithms have gained prominence because they can analyze large datasets and thereby aid disease diagnosis and treatment. This study focuses on developing accurate models for the early diagnosis of diabetes. We explored the performance of several ML algorithms, namely K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression (LR), Extra Trees (ET), AdaBoost (AB), and Gradient Boosting (GB), while also employing different preprocessing techniques, hyperparameter tuning, XGBoost feature selection and crossover strategies. Furthermore, we tested a hybrid model using validation scenarios to assess its effectiveness. The results showed that the Logistic Regression algorithm achieved the highest classification accuracy, reaching 77%. This result highlights the potential of ML techniques, particularly Logistic Regression, in early diabetes diagnosis.
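The kind of pipeline the abstract describes (preprocessing, model-based feature selection, hyperparameter tuning, and validation of a Logistic Regression classifier) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (768 rows and 8 features, matching the shape of the PIMA dataset but not its values), scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost as the feature-importance estimator, and the parameter grid, the median threshold, and the 5-fold stratified cross-validation are all illustrative choices that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): same shape as PIMA, not the real records.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, n_redundant=1, random_state=0
)

pipeline = Pipeline([
    # Preprocessing: standardize each feature to zero mean and unit variance.
    ("scale", StandardScaler()),
    # Feature selection: keep features whose boosted-tree importance is at
    # least the median importance (stand-in for XGBoost-based selection).
    ("select", SelectFromModel(
        GradientBoostingClassifier(random_state=0), threshold="median"
    )),
    # Final classifier.
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid over the inverse regularization strength C.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

# Stratified 5-fold cross-validation keeps the class ratio in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
n_selected = int(search.best_estimator_.named_steps["select"].get_support().sum())

print("best parameters:", best_params)
print("mean cross-validated accuracy:", round(best_score, 3))
print("features kept:", n_selected, "of", X.shape[1])
```

Placing the scaler and the selector inside the `Pipeline` means both are refit on the training folds only, so the cross-validated accuracy is not inflated by information leaking from the held-out fold. Scores obtained on the synthetic data say nothing about the 77% accuracy reported in the paper.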