Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated with metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
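As a rough illustration of the EM and F1 metrics mentioned above, here is a minimal sketch in the style of SQuAD-type evaluation. The function names (`normalize_answer`, `exact_match`, `f1_score`) and the normalization steps (lowercasing, stripping punctuation and English articles) are assumptions made for this example, not the official scorer of any particular benchmark; individual datasets may normalize answers differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors common SQuAD-style normalization; other
    benchmarks may use different rules.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM rewards only answers that match after normalization, while F1 gives partial credit for overlapping tokens. When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions.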


Papers


Showing 2501–2510 of 10817 papers

Title	Date	Tasks	Status
ChartCitor: Multi-Agent Framework for Fine-Grained Chart Visual Attribution	Feb 3, 2025	Chart Question Answering, Question Answering	Unverified
Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs	May 22, 2025	Question Answering	Unverified
A Restricted Visual Turing Test for Deep Scene and Event Understanding	Dec 6, 2015	Question Answering, Video Captioning	Unverified
Character Sequence-to-Sequence Model with Global Attention for Universal Morphological Reinflection	Aug 1, 2017	Machine Translation, Question Answering	Unverified
Don't Read Too Much into It: Adaptive Computation for Open-Domain Question Answering	Nov 10, 2020	Open-Domain Question Answering, Question Answering	Unverified
Character Matters: Video Story Understanding with Character-Aware Relations	May 9, 2020	Question Answering	Unverified
ARES: A Reading Comprehension Ensembling Service	Oct 1, 2020	Machine Reading Comprehension, Natural Questions	Unverified
Al-Bayan: An Arabic Question Answering System for the Holy Quran	Oct 1, 2014	Morphological Analysis, Question Answering	Unverified
Characterizing Video Question Answering with Sparsified Inputs	Nov 27, 2023	Question Answering, Video Question Answering	Unverified
Are Sample-Efficient NLP Models More Robust?	Oct 12, 2022	Extractive Question-Answering, image-classification	Unverified


Datasets: SQuAD2.0, SQuAD1.1, HotpotQA, PIQA, BoolQ, COPA, TriviaQA, SQuAD1.1 dev, Natural Questions, OpenBookQA, TruthfulQA, MultiRC

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified