The Effect of Model Size on LLM Post-hoc Explainability via LIME

2024-05-08 · Code Available

Henning Heyen, Amy Widdicombe, Noah Y. Siegel, Maria Perez-Ortiz, Philip Treleaven

Code

github.com/henningheyen/scalability-of-llm-posthoc-explanations
Official · In paper · PyTorch · ★ 3

Abstract

Large language models (LLMs) are becoming bigger to boost performance. However, little is known about how explainability is affected by this trend. This work explores LIME explanations for DeBERTaV3 models of four different sizes on natural language inference (NLI) and zero-shot classification (ZSC) tasks. We evaluate the explanations based on their faithfulness to the models' internal decision processes and their plausibility, i.e. their agreement with human explanations. The key finding is that increased model size does not correlate with plausibility despite improved model performance, suggesting a misalignment between the LIME explanations and the models' internal processes as model size increases. Our results further suggest limitations regarding faithfulness metrics in NLI contexts.
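To make the two evaluation axes concrete, here is a minimal Python sketch. The abstract does not say which metrics the authors used, so the choices below are assumptions: token-level F1 against a human rationale stands in for plausibility, and comprehensiveness (the drop in predicted probability after deleting the top-attributed tokens) stands in for faithfulness. All function names, the toy classifier, and the attribution weights are hypothetical and not taken from the paper or its repository.

```python
from typing import Callable, Dict, Sequence, Set


def top_k_tokens(attributions: Dict[int, float], k: int) -> Set[int]:
    """Return positions of the k tokens with the largest absolute attribution."""
    ranked = sorted(attributions, key=lambda i: abs(attributions[i]), reverse=True)
    return set(ranked[:k])


def plausibility_f1(predicted: Set[int], human: Set[int]) -> float:
    """Token-level F1 between explanation tokens and a human rationale (assumed metric)."""
    overlap = len(predicted & human)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(human)
    return 2 * precision * recall / (precision + recall)


def comprehensiveness(
    predict_proba: Callable[[Sequence[str]], float],
    tokens: Sequence[str],
    selected: Set[int],
) -> float:
    """Drop in predicted probability after deleting the selected tokens (assumed metric)."""
    reduced = [t for i, t in enumerate(tokens) if i not in selected]
    return predict_proba(tokens) - predict_proba(reduced)


def toy_predict_proba(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in for a classifier: probability grows with negation cues."""
    cues = {"not", "never"}
    hits = sum(1 for t in tokens if t in cues)
    return min(1.0, 0.2 + 0.4 * hits)


tokens = ["the", "dog", "is", "not", "sleeping"]
# Made-up attribution weights, playing the role of a LIME explanation.
lime_weights = {0: 0.01, 1: 0.10, 2: 0.02, 3: 0.70, 4: 0.30}
# Made-up human rationale: positions a human annotator marked as important.
human_rationale = {3, 4}

selected = top_k_tokens(lime_weights, k=2)
f1 = plausibility_f1(selected, human_rationale)
comp = comprehensiveness(toy_predict_proba, tokens, selected)
print(f"plausibility F1: {f1:.2f}, comprehensiveness: {comp:.2f}")
```

In a real setup the weights would come from a LIME text explainer run against each DeBERTaV3 model, and the two scores would be compared across model sizes.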

Tasks

Natural Language Inference, Zero-Shot Classification, Zero-Shot Learning
