iPIC-XAI: Improving PIC-XAI for Enhanced Image Captioning Explanation
Modafar Al-Shouha, Gábor Szűcs
Code: github.com/modafarshouha/PIC-XAI (PyTorch)
Abstract
Image captioning is a complex task that has benefited from recent developments in deep learning (DL). However, DL models are fundamentally opaque, and explaining their behaviour is a challenge. In this paper, we present an algorithm for explaining the behaviour of an image captioning model. We improve on PIC-XAI by introducing several components: (1) we use CLIP multimodal similarity for greater efficiency, (2) we treat the query's dependency tag as a cue for larger elements, (3) we provide an algorithm that automatically sets the kernel size of the preprocessing blur, and (4) we use a similarity comparison technique to obtain more relevant answers. Additionally, we provide an improved version of the XIC metric, aiming for more consistent, objective evaluation of XAI methods in the image captioning field.
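To make the idea of similarity-based selection concrete, the following is a minimal sketch of ranking candidates by cosine similarity to a query embedding. It is an illustration under stated assumptions, not the authors' implementation: the vectors are toy values standing in for CLIP text and image embeddings, and the names `cosine_similarity` and `rank_by_similarity` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_embedding, candidate_embeddings):
    """Return (index, score) pairs sorted by descending similarity."""
    scores = [
        (i, cosine_similarity(query_embedding, c))
        for i, c in enumerate(candidate_embeddings)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): stand-ins for a text embedding and
# embeddings of three candidate image regions.
text_embedding = [1.0, 0.0, 1.0]
region_embeddings = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.9],
    [0.5, 0.5, 0.5],
]

ranking = rank_by_similarity(text_embedding, region_embeddings)
best_index = ranking[0][0]
print(best_index)
```

In a real pipeline the embeddings would come from a multimodal encoder such as CLIP, and the highest-ranked candidate would be taken as the most relevant answer.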