Hybrid deep learning model for maize leaf disease classification with explainable AI



Özüpak Y., Alpsalaz F., Aslan E., Uzel H.

New Zealand Journal of Crop and Horticultural Science, vol. 53, no. 5, pp. 2942-2964, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 53 Issue: 5
  • Publication Date: 2025
  • DOI: 10.1080/01140671.2025.2519570
  • Journal Name: New Zealand Journal of Crop and Horticultural Science
  • Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Periodicals Index Online, Agricultural & Environmental Science Database, Aquatic Science & Fisheries Abstracts (ASFA), BIOSIS, CAB Abstracts, Food Science & Technology Abstracts, Geobase, Veterinary Science Database
  • Page Numbers: pp. 2942-2964
  • Keywords: explainable AI, hybrid deep learning, maize leaf disease, MobileNetV2, vision transformer
  • Open Archive Collection: AVESİS Open Access Collection
  • Affiliated with Yozgat Bozok University: Yes

Abstract

This study presents a hybrid deep learning model that integrates MobileNetV2 and Vision Transformer (ViT) with a stacking model to classify maize leaf diseases, addressing the critical need for early detection to improve agricultural productivity and sustainability. Utilising the ‘Corn or Maize Leaf Disease Dataset’ from Kaggle, comprising 4,062 high-resolution images across five classes (Common Rust, Grey Leaf Spot, Healthy, Northern Leaf Blight, Not Maize Leaf), the model achieves an accuracy of 96.73%. Transfer learning from ImageNet, coupled with data augmentation (rotation, flipping, scaling, brightness adjustment), enhances generalisation, while a 20% dropout rate mitigates overfitting. The key advantage of the hybrid model lies in combining MobileNetV2's localised feature extraction with ViT's global context understanding, while the stacking model compensates for the weaknesses of either backbone. Explainable AI techniques, including SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), and Grad-CAM, provide transparent insights into model decisions, fostering trust among agricultural stakeholders. Comparative analysis demonstrates the model's superiority over prior works, with F1-scores ranging from 0.9276 to 1.0000. Despite minor misclassifications due to visual similarities, the model offers a robust, interpretable solution for precision agriculture.
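To make the described architecture concrete, the sketch below shows one possible way to stack ImageNet-pretrained MobileNetV2 and ViT backbones with a lightweight meta-learner for the five maize-leaf classes. It is a minimal illustration based only on the abstract, not the authors' code: the choice of a linear meta-layer over concatenated class probabilities, the ViT-B/16 variant, and the head definitions are assumptions.

```python
# Minimal stacking sketch (illustrative assumptions, not the paper's implementation).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # Common Rust, Grey Leaf Spot, Healthy, Northern Leaf Blight, Not Maize Leaf


def build_base_models():
    """Load ImageNet-pretrained backbones and replace their heads (transfer learning)."""
    mobilenet = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
    mobilenet.classifier = nn.Sequential(
        nn.Dropout(p=0.2),  # 20% dropout, as stated in the abstract
        nn.Linear(mobilenet.last_channel, NUM_CLASSES),
    )

    vit = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)  # assumed ViT variant
    vit.heads = nn.Sequential(
        nn.Dropout(p=0.2),
        nn.Linear(768, NUM_CLASSES),  # 768 = ViT-B/16 embedding size
    )
    return mobilenet, vit


class StackedMaizeClassifier(nn.Module):
    """Meta-learner over the concatenated class probabilities of both base models."""

    def __init__(self, mobilenet, vit, num_classes=NUM_CLASSES):
        super().__init__()
        self.mobilenet = mobilenet
        self.vit = vit
        # Simple linear stacking layer; the paper's meta-model may differ.
        self.meta = nn.Linear(2 * num_classes, num_classes)

    def forward(self, x):
        p_local = torch.softmax(self.mobilenet(x), dim=1)   # localised-feature view
        p_global = torch.softmax(self.vit(x), dim=1)        # global-context view
        return self.meta(torch.cat([p_local, p_global], dim=1))


if __name__ == "__main__":
    mobilenet, vit = build_base_models()
    model = StackedMaizeClassifier(mobilenet, vit)
    dummy = torch.randn(2, 3, 224, 224)   # ViT-B/16 expects 224x224 inputs
    print(model(dummy).shape)             # torch.Size([2, 5])
```

In practice, the base models would first be fine-tuned on the augmented maize-leaf images, after which the stacking layer is trained on their held-out predictions; Grad-CAM, SHAP, or LIME can then be applied to the fine-tuned backbones to visualise which leaf regions drive each classification.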