SOTAVerified

Interpretability Techniques for Deep Learning

Papers

Showing 1–25 of 25 papers

Title | Status | Hype
CausalGym: Benchmarking causal interpretability methods on linguistic tasks | Code | 2
Less is More: Fewer Interpretable Region via Submodular Subset Selection | Code | 2
Time series saliency maps: explaining models across multiple domains | Code | 1
Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability | Code | 1
TraceFL: Interpretability-Driven Debugging in Federated Learning via Neuron Provenance | Code | 1
Making Sense of Dependence: Efficient Black-box Explanations Using Dependence Measure | Code | 1
Learning the Dynamics of Physical Systems from Sparse Observations with Finite Element Networks | Code | 1
A Novel Deep Learning Model for Hotel Demand and Revenue Prediction amid COVID-19 | Code | 1
DISSECT: Disentangled Simultaneous Explanations via Concept Traversals | Code | 1
Exploration of Interpretability Techniques for Deep COVID-19 Classification using Chest X-ray Images | Code | 1
RISE: Randomized Input Sampling for Explanation of Black-box Models | Code | 1
A Unified Approach to Interpreting Model Predictions | Code | 1
Axiomatic Attribution for Deep Networks | Code | 1
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization | Code | 1
"Why Should I Trust You?": Explaining the Predictions of Any Classifier | Code | 1
IBO: Inpainting-Based Occlusion to Enhance Explainable Artificial Intelligence Evaluation in Histopathology | Code | 0
Explainable Deep Learning: A Visual Analytics Approach with Transition Matrices | Code | 0
Improving Interpretability via Regularization of Neural Activation Sensitivity | | 0
A Semi-supervised Deep Transfer Learning Approach for Rolling-Element Bearing Remaining Useful Life Prediction | Code | 0
A deep supervised learning approach for condition-based maintenance of naval propulsion systems | | 0
DeepNNK: Explaining deep models and their generalization using polytope interpolation | Code | 0
An Investigation of Interpretability Techniques for Deep Learning in Predictive Process Analytics | | 0
What Do Compressed Deep Neural Networks Forget? | Code | 0
Contextual Explanation Networks | Code | 0
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DAS | Log odds-ratio (pythia-6.9b) | 9.95 | | Unverified
2 | Linear probe | Log odds-ratio (pythia-6.9b) | 3.42 | | Unverified
3 | Difference-in-means | Log odds-ratio (pythia-6.9b) | 2.91 | | Unverified
4 | k-means | Log odds-ratio (pythia-6.9b) | 1.87 | | Unverified
5 | PCA | Log odds-ratio (pythia-6.9b) | 1.81 | | Unverified
6 | LDA | Log odds-ratio (pythia-6.9b) | 0.27 | | Unverified
7 | Random | Log odds-ratio (pythia-6.9b) | 0.01 | | Unverified
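The log odds-ratio metric above measures how strongly an intervention on a candidate feature direction shifts the model's preference between a base-label token and a counterfactual-label token. A loose sketch of that quantity, assuming you already have the model's probabilities for the two candidate tokens before and after the intervention (the function name and four-probability interface are illustrative, not CausalGym's actual API):

```python
import math

def log_odds_ratio(p_base_src, p_base_tgt, p_int_src, p_int_tgt):
    """Hedged sketch of a CausalGym-style log odds-ratio.

    p_base_src / p_base_tgt: probabilities of the source-label and
        target-label tokens under the unmodified model run.
    p_int_src / p_int_tgt: the same two probabilities after the
        interchange intervention. A large positive value means the
        intervention flipped the odds toward the target label.
    """
    odds_base = p_base_tgt / p_base_src      # odds before intervening
    odds_int = p_int_tgt / p_int_src         # odds after intervening
    return math.log(odds_int / odds_base)
```

Under this reading, a no-op intervention gives 0, and the large gap between DAS (9.95) and Random (0.01) reflects how much more causally relevant the learned direction is.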
# | Model | Metric | Claimed | Verified | Status
1 | RISE | Insertion AUC score | 0.57 | | Unverified
2 | HSIC-Attribution | Insertion AUC score | 0.57 | | Unverified
3 | Kernel SHAP | Insertion AUC score | 0.52 | | Unverified
4 | LIME | Insertion AUC score | 0.52 | | Unverified
5 | Saliency | Insertion AUC score | 0.46 | | Unverified
6 | Grad-CAM | Insertion AUC score | 0.37 | | Unverified
7 | Integrated Gradients | Insertion AUC score | 0.36 | | Unverified
```
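The insertion AUC metric used in this table comes from the RISE paper: pixels are revealed into a blank baseline image in order of decreasing saliency, and the model's score for the target class is integrated over the reveal process; a more faithful saliency map reveals the decisive pixels sooner and earns a higher area. A minimal sketch, assuming `model` maps a 2-D array to a scalar class probability (a hypothetical interface, not any particular library's):

```python
import numpy as np

def insertion_auc(model, image, saliency, steps=100, baseline=None):
    """Sketch of the insertion metric (RISE, Petsiuk et al.).

    Reveals pixels of `image` into `baseline` (zeros by default) in
    order of decreasing saliency, recording the model's score after
    each step, then returns the trapezoidal area under that curve
    over the unit interval.
    """
    if baseline is None:
        baseline = np.zeros_like(image)
    h, w = saliency.shape
    order = np.argsort(saliency.ravel())[::-1]    # most salient pixels first
    current = baseline.copy()
    scores = [model(current)]                     # score with nothing revealed
    per_step = max(1, (h * w) // steps)
    for i in range(0, h * w, per_step):
        ys, xs = np.unravel_index(order[i:i + per_step], (h, w))
        current[ys, xs] = image[ys, xs]
        scores.append(model(current))
    # area under the score curve, with the x-axis normalized to [0, 1]
    return float(np.trapz(scores, dx=1.0 / (len(scores) - 1)))
```

For a probability-valued model the score lies in [0, 1], which matches the scale of the claimed values above (0.36–0.57); real evaluations average this over a dataset and use the classifier's softmax output as `model`.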