A Multimodal Machine and Deep Learning Approach for Explainable Early Detection of Sarcoma
Contributors
Vaishali Rajput
Raja Sarath Kumar Boddu
Keywords
Proceeding
Track
Engineering, Sciences, Mathematics & Computations
License
Copyright (c) 2025 Sustainable Global Societies Initiative

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract
Sarcomas are a rare and heterogeneous group of malignancies arising from mesenchymal tissues; early diagnosis is challenging because of their deep anatomical localization, biological diversity, and the limited availability of large-scale datasets. This study presents a fully revised and expanded AI-enabled multimodal framework for early sarcoma detection and subtype classification. The proposed approach integrates radiological imaging, genomic profiles, and clinical metadata using machine learning and deep learning techniques. A redesigned architecture with modality-specific encoders, attention-based fusion, and explainable AI modules is introduced to improve interpretability and diagnostic confidence. Experiments on the TCIA and TCGA-SARC datasets show that the multimodal model achieves superior performance, with an accuracy of 94.3% and an AUC-ROC of 0.951, outperforming unimodal approaches. These results underscore the importance of multimodal integration for precision diagnostics in rare cancers.
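
To make the fusion step concrete, below is a minimal sketch of how modality-specific encoders and attention-based fusion could be wired together. It is not the authors' implementation: PyTorch, the feature dimensions (512-d imaging features, 1000-d genomic profile, 20-d clinical metadata), the four subtype classes, and the simple softmax attention over modality embeddings are all illustrative assumptions, since the abstract does not specify them.

# Minimal sketch of modality-specific encoders + attention-based fusion.
# All layer sizes, dimensions, and class counts below are assumptions for illustration.
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """Maps one modality's feature vector into a shared embedding space."""
    def __init__(self, in_dim: int, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class AttentionFusionClassifier(nn.Module):
    """Fuses per-modality embeddings with learned attention weights, then classifies."""
    def __init__(self, in_dims, embed_dim: int = 128, num_classes: int = 4):
        super().__init__()
        self.encoders = nn.ModuleList([ModalityEncoder(d, embed_dim) for d in in_dims])
        self.attn = nn.Linear(embed_dim, 1)            # scores each modality embedding
        self.head = nn.Linear(embed_dim, num_classes)  # subtype classifier

    def forward(self, inputs):
        # inputs: list of tensors, one per modality, each of shape (batch, in_dim)
        embeds = torch.stack([enc(x) for enc, x in zip(self.encoders, inputs)], dim=1)
        weights = torch.softmax(self.attn(embeds), dim=1)  # (batch, n_modalities, 1)
        fused = (weights * embeds).sum(dim=1)              # attention-weighted sum
        return self.head(fused), weights.squeeze(-1)       # logits + modality weights

# Hypothetical usage: imaging (512-d), genomic (1000-d), clinical (20-d) inputs
model = AttentionFusionClassifier(in_dims=[512, 1000, 20])
batch = [torch.randn(8, 512), torch.randn(8, 1000), torch.randn(8, 20)]
logits, modality_weights = model(batch)
print(logits.shape, modality_weights.shape)  # torch.Size([8, 4]) torch.Size([8, 3])

Returning the per-modality attention weights alongside the logits also gives a natural hook for the explainability component described in the abstract, since the weights indicate how much each modality contributed to a given prediction.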