Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viegas, Rory Sayres
Code
- github.com/tensorflow/tcav (official, TensorFlow)
- github.com/google-research/mood-board-search (TensorFlow)
- github.com/maragraziani/iMIMIC-RCVs (TensorFlow)
- github.com/medgift/iMIMIC-RCVs (TensorFlow)
- github.com/jwendyr/tcav (TensorFlow)
- github.com/giovannimaffei/concept_activation_vectors (TensorFlow)
- github.com/soumyadip1995/TCAV
- github.com/fursovia/tcav_nlp (TensorFlow)
- github.com/mbakler/Tcav_pytorch_implementation (PyTorch)
- github.com/pnxenopoulos/cav-keras (Keras)
Abstract
The interpretation of deep learning models is a challenge due to their size, complexity, and often opaque internal state. In addition, many systems, such as image classifiers, operate on low-level features rather than high-level concepts. To address these challenges, we introduce Concept Activation Vectors (CAVs), which provide an interpretation of a neural net's internal state in terms of human-friendly concepts. The key idea is to view the high-dimensional internal state of a neural net as an aid, not an obstacle. We show how to use CAVs as part of a technique, Testing with CAVs (TCAV), that uses directional derivatives to quantify the degree to which a user-defined concept is important to a classification result (for example, how sensitive a prediction of "zebra" is to the presence of stripes). Using the domain of image classification as a testing ground, we describe how CAVs may be used to explore hypotheses and generate insights for a standard image classification network as well as a medical application.
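The abstract's recipe can be sketched in a few lines: learn a CAV that separates a concept's layer activations from random counterexamples, then score how often the class logit's directional derivative along that vector is positive. The sketch below is a hedged toy version with synthetic activations and a made-up nonlinear logit; the paper trains a linear classifier to obtain the CAV, whereas here a difference-of-means direction stands in as a simplification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy activations at some layer (dimension d) for concept examples
# (e.g. "striped" textures) and random counterexamples.
d = 8
concept_acts = rng.normal(loc=1.0, size=(50, d))
random_acts = rng.normal(loc=0.0, size=(50, d))

# CAV: a direction separating concept from random activations.
# (Simplification: the paper uses the normal of a trained linear
# classifier; the difference of means is a rough stand-in.)
cav = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
cav /= np.linalg.norm(cav)

# Hypothetical class logit h(a) = sum(relu(W a)), chosen so its
# gradient varies with the input.
W = rng.normal(size=(d, d))

def grad_logit(a):
    # d/da sum(relu(W a)) = W^T * indicator(W a > 0)
    return W.T @ (W @ a > 0).astype(float)

# TCAV score: fraction of class inputs whose logit increases when the
# activation moves in the concept direction (positive directional
# derivative along the CAV).
class_inputs = rng.normal(size=(100, d))
directional = np.array([grad_logit(a) @ cav for a in class_inputs])
tcav_score = float((directional > 0).mean())
print(tcav_score)
```

A score near 1 would suggest the concept direction consistently increases the class logit; near 0, that it consistently decreases it. The paper additionally compares against CAVs trained on random splits to test statistical significance.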