Explainable Deep Learning for Multi-Class Brain Tumor Classification: A Comparative Study of Visual Interpretability
Date: January 2026
Repository: Explainable-Brain-MRI
Deep learning has achieved expert-level accuracy on a range of medical image classification tasks. However, the lack of transparency in "black-box" models hinders their adoption in clinical workflows, where understanding the rationale behind a diagnosis is as critical as the diagnosis itself. This report presents a robust pipeline for multi-class brain tumor classification (Glioma, Meningioma, Pituitary, No Tumor) using T1-weighted MRI scans. We integrate Grad-CAM (Gradient-weighted Class Activation Mapping) to generate visual explanations, allowing for the validation of model focus against radiological ground truth. Our approach aims to bridge the gap between high-performance ML and clinical trust, aligning with industry standards for transparent AI in healthcare.
Brain tumors, including Gliomas and Meningiomas, require rapid and accurate triage. Structural MRI, particularly T1-weighted imaging, is the standard modality for assessing anatomical boundaries and classifying tissue. While manual interpretation is time-consuming and subject to inter-observer variability, automated systems must provide more than just a probability score to be clinically useful.
Convolutional Neural Networks (CNNs) are powerful but opaque. In high-stakes domains like neuro-oncology, a false positive driven by image artifacts (e.g., a skull label or scanner noise) can have severe consequences. Explainable AI (XAI) addresses this by visualizing the features driving the model's predictions.
Recent literature highlights the converging trend of high-accuracy CNNs and post-hoc explainability:
- Iftikhar et al. (2025) demonstrated that while CNNs can reach 99% accuracy in tumor detection, the integration of SHAP and Grad-CAM is essential for validating that the model is detecting the tumor and not background noise.
- Gharaibeh et al. (2025) utilized Xception-based networks with Grad-CAM to differentiate between tumor subtypes, emphasizing the need for region-based visual explanations.
- Islam et al. (2025) proposed Grad-CAM++ improvements for better localization of lesion boundaries, effectively "segmenting" the tumor without explicit segmentation labels.
Our work builds on these foundations, providing a streamlined, reproducible framework for applying these techniques to standard T1-weighted MRI datasets.
We utilize a widely recognized public dataset comprising MRI scans categorized into four classes:
- Glioma: Glial cell tumors, often with irregular boundaries.
- Meningioma: Typically well-circumscribed dural-based tumors.
- Pituitary: Tumors in the sellar region.
- No Tumor: Healthy brain tissue.
Preprocessing: Images are resized to 224x224, normalized using ImageNet statistics (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), and augmented with random rotations and flips to improve generalization.
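The normalization arithmetic described above can be sketched as follows. This is a minimal NumPy illustration of the scaling and ImageNet z-scoring step; in practice the full pipeline (resizing, random rotations and flips) would be built with a library such as torchvision, and the `preprocess` function name here is illustrative.

```python
import numpy as np

# ImageNet channel statistics cited in the report (RGB order).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image_uint8: np.ndarray) -> np.ndarray:
    """Normalize an HxWx3 uint8 image to ImageNet statistics.

    Resizing to 224x224 and augmentation are assumed to be done
    elsewhere; this sketch covers only the normalization arithmetic.
    """
    x = image_uint8.astype(np.float32) / 255.0  # [0, 255] -> [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD      # per-channel z-score
    return x

# A mid-gray image (pixel value 128) maps close to zero after normalization.
gray = np.full((224, 224, 3), 128, dtype=np.uint8)
out = preprocess(gray)
```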
We employ ResNet18 with transfer learning. The choice of ResNet18 is deliberate:
- Efficiency: It offers a lightweight architecture suitable for rapid inference.
- Residual Learning: Mitigates the vanishing gradient problem, allowing for effective feature extraction from complex medical images.
- Transfer Learning: Pre-training on ImageNet provides robust low-level feature detectors (edges, textures) which are then fine-tuned for high-level MRI features.
We implement Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize saliency.
Mathematically, for a given class $c$:
- We compute the neuron-importance weights $\alpha_k^c$ by global-average-pooling the gradients of the class score $y^c$ with respect to the feature maps $A^k$: $$ \alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k} $$
- We then take the weighted combination of the forward activation maps, followed by a ReLU to keep only features that have a positive influence on the class of interest: $$ L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\left(\sum_k \alpha_k^c A^k\right) $$
This results in a coarse heatmap of the same size as the convolutional feature maps (e.g., 7x7), which is then upsampled to the input image resolution (224x224) for overlay.
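The two equations above can be demonstrated numerically. In a real PyTorch pipeline the activations $A^k$ and gradients $\partial y^c / \partial A^k$ would be captured with forward/backward hooks on the last convolutional block; this self-contained NumPy sketch takes them as plain arrays, and uses nearest-neighbour upsampling where a real pipeline would interpolate bilinearly.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM heatmap from activations A^k of shape (K, H, W) and the
    gradients dy^c/dA^k of the class score (same shape)."""
    # alpha_k^c: global average pool of the gradients over i, j (Z = H*W).
    alphas = gradients.mean(axis=(1, 2))                            # (K,)
    # Weighted combination of the activation maps, followed by ReLU.
    cam = np.maximum((alphas[:, None, None] * activations).sum(axis=0), 0.0)
    return cam

# Toy example: two 7x7 feature maps, one helping the class, one hurting it.
A = np.zeros((2, 7, 7)); A[0, 3, 3] = 1.0; A[1, 0, 0] = 1.0
dY = np.stack([np.full((7, 7), 1.0), np.full((7, 7), -1.0)])
cam = grad_cam(A, dY)  # activates at (3, 3); the negative map is clipped

# Upsample the coarse 7x7 map to 224x224 (factor 32) for overlay.
cam_full = np.kron(cam, np.ones((32, 32)))
```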
By overlaying Grad-CAM heatmaps on T1-weighted images, we can verify "Clinical Validity":
- Success Case: The heatmap lights up the hyperintense tumor region.
- Failure Case: The heatmap focuses on the skull, eyes, or text annotations on the scan. This verification step is crucial for deploying models in real-world settings where "Right for the wrong reasons" is unacceptable.
For pharmaceutical and medical device companies, this pipeline demonstrates:
- Regulatory Readiness: XAI is increasingly requested by regulatory bodies (FDA, EMA) for AI-based Software as a Medical Device (SaMD).
- Workflow Integration: The ability to present a radiologist with a "second opinion" supported by visual evidence.
Future directions include:
- Quantitative XAI Evaluation: Measuring the Intersection over Union (IoU) between the Grad-CAM heatmap and manual tumor segmentation masks.
- Perturbation Methods: Integrating LIME (Local Interpretable Model-agnostic Explanations) to test model robustness against noise.
- Multimodal Fusion: Combining T1 with T2/FLAIR images for richer input data.
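The quantitative evaluation mentioned in the first item could be sketched as follows. The 0.5 binarization threshold is an assumption (evaluations often sweep it), and `heatmap_iou` is an illustrative helper, not part of the current pipeline.

```python
import numpy as np

def heatmap_iou(cam: np.ndarray, mask: np.ndarray, thresh: float = 0.5) -> float:
    """IoU between a Grad-CAM heatmap (values in [0, 1]) binarized at
    `thresh` and a binary tumor segmentation mask of the same shape."""
    pred = cam >= thresh
    gt = mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both empty: treat as perfect agreement
    return float(np.logical_and(pred, gt).sum()) / float(union)

# Toy 4x4 case: heatmap covers the top-left 2x2, mask the top-left 2x3.
cam = np.zeros((4, 4)); cam[:2, :2] = 1.0
mask = np.zeros((4, 4)); mask[:2, :3] = 1
iou = heatmap_iou(cam, mask)  # intersection = 4 pixels, union = 6 pixels
```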
This project establishes a baseline for Explainable Brain MRI Classification. By prioritizing interpretability alongside accuracy, we provide a framework that is not only valid from a machine learning perspective but also meaningful and trustworthy for clinical stakeholders.