Robust Feature Attribution via Integrated Sensitivity Gradients
Mar 2, 2026
Rukmangadh Sai Myana, Sumit Kumar Jha (Corresponding Author), Yanzhao Wu (Corresponding Author)
Abstract
Robustness to perturbations and sampling noise remains a critical challenge in interpreting machine learning models, particularly in high-stakes applications where unstable explanations undermine trust and safety-critical decisions. We introduce Integrated Sensitivity Gradients (ISG), a unified attribution framework that delivers robust saliency maps by bridging game-theoretic and sensitivity-analysis perspectives. ISG generalizes traditional variance-based sensitivity indices to capture higher-order statistical moments of neural network outputs, including kurtosis. Through integration with the Aumann-Shapley value, ISG produces distribution-aware attributions with enhanced stability under perturbations. Evaluations on ImageNet demonstrate that ISG achieves superior robustness across multiple metrics without sacrificing fidelity, establishing a new foundation for reliable visual interpretation in critical domains.
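The abstract grounds ISG in the Aumann-Shapley value, whose standard instantiation for neural networks is a path integral of gradients from a baseline to the input (as in Integrated Gradients). The ISG method itself is not specified here, so the following is only a minimal NumPy sketch of that underlying path integral, with a toy model and a midpoint-rule discretization; all function names and the toy model are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def path_integral_attribution(grad_f, x, baseline, steps=64):
    """Riemann (midpoint) approximation of the Aumann-Shapley
    path integral: (x - baseline) * mean of grad_f along the
    straight line from baseline to x. Illustrative sketch only."""
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints in (0, 1)
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy differentiable "model": f(x) = x0^2 + 3*x1 (assumed for illustration)
f = lambda x: x[0] ** 2 + 3 * x[1]
grad_f = lambda x: np.array([2 * x[0], 3.0])

x = np.array([2.0, 1.0])
baseline = np.zeros(2)
attr = path_integral_attribution(f if False else grad_f, x, baseline)

# Completeness axiom of Aumann-Shapley attributions:
# the attributions sum to f(x) - f(baseline).
print(attr, attr.sum(), f(x) - f(baseline))
```

For this toy model the attribution is exact (`[4.0, 3.0]`, summing to `f(x) - f(baseline) = 7`), illustrating the completeness property that any Aumann-Shapley-based method, including ISG, inherits by construction.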
Type: Publication
ICLR 2026 Workshop on Principled Design for Trustworthy AI - Interpretability, Robustness, and Safety across Modalities