Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, and WikiQA, among many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
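As an illustration of the two metrics named above, here is a minimal sketch of EM and token-level F1 for a single prediction and reference pair. It assumes the answer normalization commonly used for SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names are hypothetical, and individual benchmarks may normalize or aggregate differently.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions, reported as a percentage.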


Papers


Showing 7801–7825 of 10817 papers

Title	Date	Tasks	Status
Question Answering based Clinical Text Structuring Using Pre-trained Language Model	Aug 19, 2019	Language Modeling, Language Modelling	Unverified
Question-Answering Based Summarization of Electronic Health Records using Retrieval Augmented Generation	Jan 3, 2024	Diversity, Hallucination	Unverified
IPFormer-VideoLLM: Enhancing Multi-modal Video Understanding for Multi-shot Scenes	Jun 26, 2025	Attribute, Question Answering	Unverified
Cultural Palette: Pluralising Culture Alignment via Multi-agent Palette	Dec 15, 2024	Large Language Model, Question Answering	Unverified
Automated Discovery of Mathematical Definitions in Text with Deep Neural Networks	Nov 9, 2020	Binary Classification, Definition Extraction	Unverified
Question Answering for Complex Electronic Health Records Database using Unified Encoder-Decoder Architecture	Nov 14, 2021	Decoder, Natural Questions	Unverified
iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering	Nov 16, 2020	Common Sense Reasoning, Dense Video Captioning	Unverified
Automated Discovery of Mathematical Definitions in Text	May 1, 2020	Articles, Binary Classification	Unverified
An automatically discovered chain-of-thought prompt generalizes to novel models and datasets	May 4, 2023	Question Answering	Unverified
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models	Feb 28, 2024	Image Description, Question Answering	Unverified
Invited Talk: IBM Cognitive Computing - An NLP Renaissance!	Oct 1, 2014	Machine Translation, Question Answering	Unverified
INVITED TALK 2: Towards Universal Syntactic Processing of Natural Language	Oct 1, 2014	Machine Translation, Question Answering	Unverified
Investigating the use of Paraphrase Generation for Question Reformulation in the FRANK QA system	Jun 6, 2022	Paraphrase Generation, Question Answering	Unverified
Question Answering in the Biomedical Domain	Jul 1, 2019	Question Answering	Unverified
CUB: Benchmarking Context Utilisation Techniques for Language Models	May 22, 2025	Benchmarking, Fact Checking	Unverified
Question-Answering Model for Schizophrenia Symptoms and Their Impact on Daily Life using Mental Health Forums Data	Sep 30, 2023	Question Answering	Unverified
Automated CVE Analysis: Harnessing Machine Learning In Designing Question-Answering Models For Cybersecurity Information Extraction	Dec 21, 2024	Question Answering	Unverified
Investigating the Generative Approach for Question Answering in E-Commerce	May 1, 2022	Answer Generation, Question Answering	Unverified
Question Answering on Linked Data: Challenges and Future Directions	Feb 16, 2016	Question Answering	Unverified
Question Answering on Patient Medical Records with Private Fine-Tuned LLMs	Jan 23, 2025	Question Answering	Unverified
Question Answering on Scholarly Knowledge Graphs	Jun 2, 2020	Articles, Knowledge Base Question Answering	Unverified
Question Answering Over Biological Knowledge Graph via Amazon Alexa	Oct 12, 2022	Articles, Data Integration	Unverified
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning	Mar 27, 2025	Image Generation, Object	Unverified
Investigating the Challenges of Temporal Relation Extraction from Clinical Text	Oct 1, 2018	Named Entity Recognition (NER), Question Answering	Unverified
CTPs: Contextual Temporal Profiles for Time Scoping Facts using State Change Detection	Oct 1, 2014	Change Detection, Question Answering	Unverified


Datasets: SQuAD2.0, SQuAD1.1, HotpotQA, PIQA, BoolQ, COPA, TriviaQA, SQuAD1.1 dev, Natural Questions, OpenBookQA, TruthfulQA, MultiRC

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified