Explainable Knowledge Distillation via Capsule Vision Transformers for Automated Kidney Disease Categorization


Date Published: 9 January 2026

Contributors

Mr. Sachin Dattatraya Shingade

Lincoln University College, Petaling Jaya, Malaysia
Author

Mr. Midhun Chakkaravarthy

Lincoln University College, Petaling Jaya, Malaysia
Author

Mr. Dimitrios A. Karras

LUCM; NKUA, Athens, Greece; and EUT, Albania
Author

Mr. Sachin S.

Pune Institute of Computer Technology (PICT), Pune, Maharashtra, India
Author

Ms. Komal Mahadeo Masal

Pune Institute of Computer Technology (PICT), Pune, Maharashtra, India
Author

Keywords

CT-Kidney Dataset; CapViT-DKD (Capsule Vision Transformer with Diverse Knowledge Distillation); PCG-CAM (Explainable AI); Teacher-Student Learning Architecture; Medical Image Classification; Self-Supervised Learning.

Proceeding

Track

Engineering, Sciences, Mathematics & Computations

License

Copyright (c) 2026 Sustainable Global Societies Initiative


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Abstract

Kidney disease, the progressive decline in the renal system's ability to filter metabolic waste and excess fluids, poses a significant global health risk. When these physiological impairments persist beyond three months, the condition is classified as Chronic Kidney Disease (CKD). Current diagnostic frameworks often suffer from high computational overhead, suboptimal precision, and architectures too heavy for deployment on lightweight devices. To address these challenges, this study introduces CapViT-DKD, a kidney disease detection and classification framework that combines a Capsule Vision Transformer with self-supervised diverse knowledge distillation. The methodology uses the CT-Kidney dataset and begins with a preprocessing pipeline comprising image resizing, normalization, and strategic data augmentation. The core architecture follows a teacher-student paradigm: a high-capacity Perceptive Capsule Transformer Network (PCapTN) serves as the teacher, transferring complex feature representations to a Lightweight Capsule Transformer Network (LCapTN) student model. This diverse knowledge distillation (DKD) approach significantly boosts the student model's performance while maintaining a small computational footprint. To address the "black box" nature of deep learning, we incorporate Principal Component of Gradient-Class Activation Mapping (PCG-CAM), which provides visual explanations by highlighting the anatomical regions that influence the diagnostic output. Empirical results show that the proposed system achieves an accuracy of 99.75%, precision of 99.50%, recall of 99.15%, and F1-score of 99.10%, validating its efficacy for clinical decision support.
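The teacher-student transfer described above can be illustrated with a minimal sketch of a temperature-softened distillation loss, the standard mechanism by which a compact student matches a high-capacity teacher's output distribution. This is a generic NumPy illustration under stated assumptions, not the paper's PCapTN/LCapTN implementation; the logits below are hypothetical stand-ins for the four-class kidney CT output.

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T yields softer targets."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()                    # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher_soft || student_soft), scaled by T^2 as is conventional."""
    p = softmax(teacher_logits, T)     # soft teacher targets
    q = softmax(student_logits, T)     # student predictions
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

# Hypothetical logits over four kidney classes (e.g. normal/cyst/stone/tumor)
teacher  = [8.0, 2.0, -1.0, -3.0]     # confident teacher output
aligned  = [7.5, 2.2, -0.8, -2.9]     # student tracking the teacher
divergent = [0.0, 5.0, 1.0, 0.0]      # student disagreeing with the teacher

# The loss penalizes the divergent student far more than the aligned one
assert distillation_loss(aligned, teacher) < distillation_loss(divergent, teacher)
```

In training, this term is typically mixed with an ordinary cross-entropy loss on the ground-truth labels; the abstract's "diverse" distillation suggests multiple such transfer signals, whose exact form is specified in the full paper.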



How to Cite

Shingade, S. D., Chakkaravarthy, M., Karras, D. A., S., S., & Masal, K. M. (2026). Explainable Knowledge Distillation via Capsule Vision Transformers for Automated Kidney Disease Categorization. Sustainable Global Societies Initiative, 1(1). https://vectmag.com/sgsi/paper/view/86