Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
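As a rough illustration of the two metrics named above, the sketch below computes EM and token-level F1 for a single prediction against a single reference answer. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names `normalize`, `exact_match`, and `f1_score` are illustrative and not taken from any particular benchmark's official script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Assumption: if either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Leaderboard figures are usually these per-question scores averaged over the dataset and reported as percentages. When a question has several reference answers, a common convention is to take the maximum score over the references.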


Papers


Showing 2891–2900 of 10817 papers

Title	Date	Tasks	Status
Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding	Oct 2, 2024	Coreference Resolution	Unverified
Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval	Aug 9, 2023	ARC, Language Modelling	Unverified
A Glimpse in ChatGPT Capabilities and its impact for AI research	May 10, 2023	Question Answering, Text Generation	Unverified
Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering	Apr 29, 2021	Question Answering, Video Question Answering	Unverified
Answering Unanswered Questions through Semantic Reformulations in Spoken QA	May 27, 2023	Question Answering, Specificity	Unverified
A criterion for Artificial General Intelligence: hypothetic-deductive reasoning, tested on ChatGPT	Aug 5, 2023	Chatbot, Question Answering	Unverified
Enhancing Multi-Image Question Answering via Submodular Subset Selection	May 15, 2025	Question Answering, Retrieval	Unverified
Bridge the Gap between Language models and Tabular Understanding	Feb 16, 2023	Contrastive Learning, Language Modeling	Unverified
Bridge Damage Cause Estimation Using Multiple Images Based on Visual Question Answering	Feb 18, 2023	Question Answering, Visual Question Answering	Unverified
Answering Science Exam Questions Using Query Reformulation with Background Knowledge	Nov 17, 2018	ARC, Information Retrieval	Unverified



Datasets: SQuAD2.0, SQuAD1.1, HotpotQA, PIQA, BoolQ, COPA, TriviaQA, SQuAD1.1 dev, Natural Questions, OpenBookQA, TruthfulQA, MultiRC

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified