Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models

2024-03-26

Zhenyu Pan, Haozheng Luo, Manling Li, Han Liu

Code

github.com/MAGICS-LAB/Chain-of-Actions
Official, PyTorch, ★ 7

Abstract

We present a Chain-of-Action (CoA) framework for multimodal and retrieval-augmented Question-Answering (QA). Compared to the literature, CoA overcomes two major challenges of current QA applications: (i) unfaithful hallucination that is inconsistent with real-time or domain facts and (ii) weak reasoning performance over compositional information. Our key contribution is a novel reasoning-retrieval mechanism that decomposes a complex question into a reasoning chain via systematic prompting and pre-designed actions. Methodologically, we propose three types of domain-adaptable "Plug-and-Play" actions for retrieving real-time information from heterogeneous sources. We also propose a multi-reference faith score (MRFS) to verify and resolve conflicts in the answers. Empirically, we exploit both public benchmarks and a Web3 case study to demonstrate the capability of CoA over other methods.
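
To make the reasoning-retrieval idea concrete, here is a minimal Python sketch of a chain of sub-questions, each paired with a pluggable retrieval action. It is an illustration only, not the authors' implementation: the names `Step`, `run_chain`, and `toy_lookup` are hypothetical, the rule "prefer retrieved text over the initial guess" is an assumption, and MRFS is not implemented because this page does not define it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical action type: takes a sub-question and returns retrieved
# text, or None when the source has nothing to offer.
Action = Callable[[str], Optional[str]]


@dataclass
class Step:
    sub_question: str  # one link of the reasoning chain
    guess: str         # the model's initial answer (hypothetical field)
    action: str        # name of the action used to check the guess


def run_chain(steps: List[Step], actions: Dict[str, Action]) -> List[str]:
    """Answer each sub-question in order.

    Assumption: retrieved text replaces the initial guess whenever the
    action returns something. The real framework resolves conflicts
    with MRFS, which is not modeled here.
    """
    answers = []
    for step in steps:
        retrieved = actions[step.action](step.sub_question)
        answers.append(retrieved if retrieved is not None else step.guess)
    return answers


# Toy in-memory "source" standing in for a real retrieval backend.
_FACTS = {"capital of France?": "Paris"}


def toy_lookup(question: str) -> Optional[str]:
    return _FACTS.get(question)


chain = [
    Step("capital of France?", guess="Lyon", action="lookup"),
    Step("population of Atlantis?", guess="unknown", action="lookup"),
]
answers = run_chain(chain, {"lookup": toy_lookup})
```

Registering actions by name in a dictionary is one simple way to keep them swappable per domain; the actual plug-in interface is in the official repository.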

Tasks

Hallucination, Information Retrieval, Question Answering, Retrieval

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
FEVER	CoA w/o actions	EM	54.2	—	Unverified
FEVER	DSP	EM	62.2	—	Unverified
FEVER	Self-Ask	EM	64.2	—	Unverified
FEVER	CoA	EM	68.9	—	Unverified
FEVER	Zero-shot	EM	50	—	Unverified
StrategyQA	CoA	EM	79.2	—	Unverified
StrategyQA	SearchChain	EM	77	—	Unverified
StrategyQA	CoA w/o actions	EM	70.6	—	Unverified
StrategyQA	Least-to-Most	EM	65.8	—	Unverified
TruthfulQA	CoA w/o actions	EM	63.3	—	Unverified
TruthfulQA	CoA	EM	67.3	—	Unverified
WebQuestions	Self-Ask	EM	31.1	—	Unverified
WebQuestions	ToT	EM	26.3	—	Unverified
WebQuestions	Zero-shot	EM	43	—	Unverified
WebQuestions	Few-shot	EM	44.7	—	Unverified
WebQuestions	DSP	EM	59.4	—	Unverified
WebQuestions	CoA w/o actions	EM	64.7	—	Unverified
WebQuestions	CoT	EM	42.5	—	Unverified
WebQuestions	CoA	EM	70.7	—	Unverified
WebQuestions	ReAct	EM	38.3	—	Unverified
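
The table reports EM, presumably exact match, as a percentage. This page does not say how answers are normalized before comparison, so the sketch below assumes the common SQuAD-style normalization (lowercasing, removing punctuation and English articles, collapsing whitespace). The function names are illustrative and may not match the paper's evaluation script.

```python
import re
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: List[str]) -> bool:
    """True if the prediction equals any reference after normalization."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


def em_score(predictions: List[str], references: List[List[str]]) -> float:
    """Percentage of predictions that exactly match a reference."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)


score = em_score(
    ["The Eiffel Tower.", "Paris"],
    [["Eiffel Tower"], ["London"]],
)
```

Under these assumptions the first prediction matches and the second does not, so `score` is 50.0.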
