Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
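As a rough illustration of the two metrics named above, the following sketch computes EM and token-level F1 for a single prediction against a single reference answer. It assumes the normalization commonly used in SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names are illustrative and not taken from any particular benchmark's scoring script.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("in Paris France", "Paris"))
```

Benchmarks with several reference answers per question usually take the maximum score over the references and then average over all questions; that aggregation is left out here to keep the sketch short.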

(Image credit: SQuAD)

Papers

Showing 1851–1900 of 10817 papers

Title	Date	Tasks	Status	Hype
Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents	Mar 6, 2022	Community Question Answering, Information Retrieval	Code Available	1
Change Detection Meets Visual Question Answering	Dec 12, 2021	Answer Generation, Change Detection	Code Available	1
Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and Sayings	Sep 15, 2023	Question Answering	Code Available	1
SATORI-R1: Incentivizing Multimodal Reasoning with Spatial Grounding and Verifiable Rewards	May 25, 2025	Image Captioning, Multimodal Reasoning	Code Available	1
Direct Evaluation of Chain-of-Thought in Multi-hop Reasoning with Knowledge Graphs	Feb 17, 2024	Knowledge Graphs, Multi-hop Question Answering	Code Available	1
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies	Jan 6, 2021	Question Answering, StrategyQA	Code Available	1
A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers	May 7, 2021	Evidence Selection, Question Answering	Code Available	1
Differentiable Reasoning on Large Knowledge Bases and Natural Language	Dec 17, 2019	Link Prediction, Question Answering	Code Available	1
Why So Gullible? Enhancing the Robustness of Retrieval-Augmented Models against Counterfactual Noise	May 2, 2023	counterfactual, Few-Shot Learning	Code Available	1
SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations	Apr 27, 2020	Question Answering, Sentence	Code Available	1
DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization	Sep 6, 2021	abstractive question answering, Denoising	Code Available	1
Schema2QA: High-Quality and Low-Cost Q&A Agents for the Structured Web	Jan 16, 2020	Question Answering, Semantic Parsing	Code Available	1
Dialog Inpainting: Turning Documents into Dialogs	May 18, 2022	Conversational Question Answering, Question Answering	Code Available	1
DeVLBert: Learning Deconfounded Visio-Linguistic Representations	Aug 16, 2020	Image Retrieval, Question Answering	Code Available	1
DEXTER: A Benchmark for open-domain Complex Question Answering using LLMs	Jun 24, 2024	Question Answering, Retrieval	Code Available	1
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types	Dec 16, 2024	Question Answering	Code Available	1
DialSim: A Real-Time Simulator for Evaluating Long-Term Multi-Party Dialogue Understanding of Conversational Agents	Jun 19, 2024	Dialogue Understanding, Question Answering	Code Available	1
CharBERT: Character-aware Pre-trained Language Model	Nov 3, 2020	Language Modeling, Language Modelling	Code Available	1
SCREWS: A Modular Framework for Reasoning with Revisions	Sep 20, 2023	Multi-hop Question Answering, Question Answering	Code Available	1
SCROLLS: Standardized CompaRison Over Long Language Sequences	Jan 10, 2022	Decoder, Long-range modeling	Code Available	1
DocNLI: A Large-scale Dataset for Document-level Natural Language Inference	Jun 17, 2021	Natural Language Inference, Question Answering	Code Available	1
Context-Aware Answer Extraction in Question Answering	Nov 5, 2020	Multi-Task Learning, Prediction	Code Available	1
Dynamic Language Binding in Relational Visual Reasoning	Apr 30, 2020	Object, Question Answering	Code Available	1
Engineering flexible machine learning systems by traversing functionally-invariant paths	Apr 30, 2022	Adversarial Robustness, Continual Learning	Code Available	1
Describe Anything Model for Visual Question Answering on Text-rich Images	Jul 16, 2025	Descriptive, Language Modeling	Code Available	1
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning	Nov 24, 2022	cross-modal alignment, Image-text Retrieval	Code Available	1
SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning	Jan 24, 2024	Question Answering, reinforcement-learning	Code Available	1
Dense Passage Retrieval for Open-Domain Question Answering	Apr 10, 2020	Open-Domain Question Answering, Passage Retrieval	Code Available	1
Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering	Apr 15, 2021	Open-Domain Question Answering, Question Answering	Code Available	1
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering	Apr 7, 2025	Chart Question Answering, Chart Understanding	Code Available	1
Self-Chained Image-Language Model for Video Localization and Question Answering	May 11, 2023	Language Modeling, Language Modelling	Code Available	1
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning	Aug 1, 2023	GSM8K, Math	Code Available	1
Dense Hierarchical Retrieval for Open-Domain Question Answering	Oct 28, 2021	Open-Domain Question Answering, Question Answering	Code Available	1
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA	May 13, 2020	Image Captioning, Multi-Label Classification	Code Available	1
Are Vision Language Models Ready for Clinical Diagnosis? A 3D Medical Benchmark for Tumor-centric Visual Question Answering	May 25, 2025	Anatomy, Benchmarking	Code Available	1
Densely Connected Attention Propagation for Reading Comprehension	Nov 10, 2018	All, Open-Domain Question Answering	Code Available	1
Can We Talk Models Into Seeing the World Differently?	Mar 14, 2024	Image Captioning, Image Classification	Code Available	1
Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering	Jul 19, 2020	Adversarial Attack, Data Augmentation	Code Available	1
DELIFT: Data Efficient Language model Instruction Fine Tuning	Nov 7, 2024	Language Modeling, Language Modelling	Code Available	1
ChatGPT: Jack of all trades, master of none	Feb 21, 2023	All, Chatbot	Code Available	1
Context Awareness Gate For Retrieval Augmented Generation	Nov 25, 2024	Open-Domain Question Answering, Question Answering	Code Available	1
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation	Jul 31, 2017	Machine Translation, Question Answering	Code Available	1
Delaying Interaction Layers in Transformer-based Encoders for Efficient Open Domain Question Answering	Oct 16, 2020	Information Retrieval, Management	Code Available	1
DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering	May 2, 2020	Question Answering	Code Available	1
Sequence tagging for biomedical extractive question answering	Apr 15, 2021	Extractive Question-Answering, Question Answering	Code Available	1
Sequence-to-Sequence Knowledge Graph Completion and Question Answering	Mar 19, 2022	Decoder, Graph Embedding	Code Available	1
Sequence-to-Sequence Models for Extracting Information from Registration and Legal Documents	Jan 14, 2022	Open Information Extraction, Question Answering	Code Available	1
ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model	Feb 20, 2025	Mixture-of-Experts, Question Answering	Code Available	1
A Dataset for Statutory Reasoning in Tax Law Entailment and Question Answering	May 11, 2020	Natural Language Understanding, Question Answering	Code Available	1
DegreEmbed: incorporating entity embedding into logic rule learning for knowledge graph reasoning	Dec 18, 2021	Knowledge Graphs, Link Prediction	Code Available	1

Datasets: SQuAD2.0, SQuAD1.1, HotpotQA, PIQA, BoolQ, COPA, TriviaQA, SQuAD1.1 dev, Natural Questions, OpenBookQA, TruthfulQA, MultiRC

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified