Explainable Ensemble Learning for Robust Severity Stratification of Carpal Tunnel Syndrome from Clinical Data

Sahin, Muhammet; Ulutaş, Hasan; Korkmaz, Murat; Ozbay Karakus, Mucella; Er, Orhan; Ünlüel, Huriye

doi:10.3390/diagnostics16111604

Sahin M. E., Ulutaş H., Korkmaz M., Ozbay Karakus M., Er O., Ünlüel H.

Diagnostics, vol. 16, no. 11, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 16 Issue: 11
Publication Date: 2026
DOI: 10.3390/diagnostics16111604
Journal Name: Diagnostics
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, EMBASE, Directory of Open Access Journals, Academic Search Ultimate (EBSCO), Biomedical Reference Collection: Corporate Edition (EBSCO)
Keywords: carpal tunnel syndrome (CTS), data augmentation, machine learning, SHAP, stacking ensemble, UN SDG 3
Affiliated with Yozgat Bozok University: Yes

Abstract

Background/Objectives: This study aims to develop an explainable and accurate machine learning (ML) framework for the automatic classification of Carpal Tunnel Syndrome (CTS) severity from structured patient data. Methods: An open-source dataset of 1037 samples was used. Following stratified partitioning, 305 samples were held out as the test set. To address the class imbalance in the moderate and severe CTS categories, the remaining training set (n = 732) was augmented to 1216 balanced samples with Adaptive Synthetic Sampling (ADASYN), yielding an 80/20 train/test ratio relative to the final dataset (n = 1521). Feature engineering, including polynomial transformations and clinically relevant interaction terms, was applied to improve predictive capacity. Four ensemble learning models (XGBoost, Random Forest, LightGBM, and CatBoost) were optimized and combined through a stacking approach with LightGBM as the base algorithm. Explainability was provided through SHAP and LIME analyses. Results: The stacking ensemble reached a test accuracy of 91.15%, an F1-score of 91.13%, and an ROC-AUC of 0.9708. It outperformed every individual algorithm while maintaining stable performance across all severity categories. Conclusions: The explainability analysis showed that the classification model relies on clinically relevant predictors, including cross-sectional area (CSA), duration of symptoms, pain level measured on the numeric rating scale (NRS), and palmar bowing (PB).
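
The sample counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `split_summary` and its dictionary keys are hypothetical, and the number of synthetic samples is derived here from the reported totals rather than stated in the abstract. It assumes, as the abstract describes, that the test set was held out before oversampling and that ADASYN was applied to the training portion only.

```python
# Sample counts as reported in the abstract.
TOTAL_ORIGINAL = 1037    # size of the open-source dataset
TEST_HELD_OUT = 305      # stratified hold-out set, not oversampled
TRAIN_AUGMENTED = 1216   # training set size after ADASYN


def split_summary(total, test, train_augmented):
    """Derive the remaining counts and ratios from the reported figures.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    train_original = total - test                  # training samples before augmentation
    synthetic = train_augmented - train_original   # derived, not stated in the abstract
    final_total = train_augmented + test           # dataset size after augmentation
    return {
        "train_original": train_original,
        "synthetic_added": synthetic,
        "final_total": final_total,
        "test_share_of_original": test / total,
        "test_share_of_final": test / final_total,
        "train_share_of_final": train_augmented / final_total,
    }


summary = split_summary(TOTAL_ORIGINAL, TEST_HELD_OUT, TRAIN_AUGMENTED)
for name, value in summary.items():
    # Round ratios for display; counts are printed unchanged.
    print(name, round(value, 4) if isinstance(value, float) else value)
```

Under these figures, the held-out test set is about 29.4% of the original 1037 samples, and the 80/20 ratio holds only relative to the augmented total of 1521.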