Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
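As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single prediction against a single reference answer. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); other benchmarks may normalize differently, and the function names here are illustrative rather than taken from any official evaluation script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("in Paris France", "Paris"))
```

Leaderboard scores are usually these per-question values averaged over the dataset and reported as percentages; when a question has several reference answers, a common convention is to take the maximum score over the references.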

Papers

Showing 9951–9975 of 10817 papers

Title	Date	Tasks	Status
CMQA: A Dataset of Conditional Question Answering with Multiple-Span Answers	Oct 1, 2022	Question Answering	Code Available
C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References	Mar 16, 2022	Open-Domain Question Answering, Question Answering	Code Available
R^3: Reinforced Reader-Ranker for Open-Domain Question Answering	Aug 31, 2017	Answer Generation, Information Retrieval	Code Available
Evaluating Semantic Parsing against a Simple Web-based Question Answering Model	Jul 14, 2017	Question Answering, Semantic Parsing	Code Available
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models	Jul 9, 2024	coreference-resolution, Coreference Resolution	Code Available
PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models	Feb 19, 2025	Open-Ended Question Answering, Privacy Preserving	Code Available
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering	May 5, 2020	Extractive Question-Answering, Question Answering	Code Available
Listen Then See: Video Alignment with Speaker Attention	Apr 21, 2024	cross-modal alignment, Question Answering	Code Available
CMDBench: A Benchmark for Coarse-to-fine Multimodal Data Discovery in Compound AI Systems	Jun 2, 2024	Question Answering	Code Available
Relation Extraction : A Survey	Dec 14, 2017	Articles, Information Retrieval	Code Available
NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-based Conversational Agent with Supervised Dialog Manager	Aug 1, 2018	Goal-Oriented Dialog, Goal-Oriented Dialogue Systems	Code Available
Evaluating Prompt-based Question Answering for Object Prediction in the Open Research Knowledge Graph	May 22, 2023	General Knowledge, Question Answering	Code Available
NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework	Dec 1, 2022	Information Retrieval, Open-Domain Question Answering	Code Available
NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering	Feb 15, 2025	Chunking, Information Retrieval	Code Available
Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression	May 1, 2024	Language Modeling, Language Modelling	Code Available
Probabilistic Embeddings for Frozen Vision-Language Models: Uncertainty Quantification with Gaussian Process Latent Variable Models	May 8, 2025	Active Learning, cross-modal alignment	Code Available
LittleMu: Deploying an Online Virtual Teaching Assistant via Heterogeneous Sources Integration and Chain of Teach Prompts	Aug 11, 2023	Language Modelling, Question Answering	Code Available
LiveQA: A Question Answering Dataset over Sports Live	Oct 1, 2020	Multiple-choice, Question Answering	Code Available
NLEBench+NorGLM: A Comprehensive Empirical Analysis and Benchmark Dataset for Generative Language Models in Norwegian	Dec 3, 2023	Natural Language Understanding, Question Answering	Code Available
Evaluating Natural Language Understanding Services for Conversational Question Answering Systems	Aug 1, 2017	Chatbot, Conversational Question Answering	Code Available
CluMo: Cluster-based Modality Fusion Prompt for Continual Learning in Visual Question Answering	Aug 21, 2024	Continual Learning, Question Answering	Code Available
AdCare-VLM: Leveraging Large Vision Language Model (LVLM) to Monitor Long-Term Medication Adherence and Care	May 1, 2025	Language Modeling, Language Modelling	Code Available
Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?	Oct 20, 2024	Question Answering, valid	Code Available
Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering	May 22, 2025	Global Facts, Language Modeling	Code Available
Evaluating Large Language Models in Semantic Parsing for Conversational Question Answering over Knowledge Graphs	Jan 3, 2024	Conversational Question Answering, Information Retrieval	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified