ECHo: A Visio-Linguistic Dataset for Event Causality Inference via Human-Centric Reasoning
Yuxi Xie, Guanzhen Li, Min-Yen Kan
Code Available — Be the first to reproduce this paper.
ReproduceCode
- github.com/yuxixie/echoOfficialIn papernone★ 8
Abstract
We introduce ECHo (Event Causality Inference via Human-Centric Reasoning), a diagnostic dataset of event causality inference grounded in visio-linguistic social scenarios. ECHo employs real-world human-centric deductive information building on a television crime drama. ECHo requires the Theory-of-Mind (ToM) ability to understand and reason about social interactions based on multimodal information. Using ECHo, we propose a unified Chain-of-Thought (CoT) framework to assess the reasoning capability of current AI systems. Our ToM-enhanced CoT pipeline accommodates various large foundation models in both zero-shot and few-shot visio-linguistic reasoning. We use this framework to scrutinize recent large foundation models such as InstructGPT and MiniGPT-4 on three diagnostic human-centric tasks. Further analysis demonstrates ECHo as a challenging dataset to expose imperfections and inconsistencies in reasoning. Our data and code are publicly available at https://github.com/YuxiXie/ECHo.