Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated with metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
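To make the two metrics concrete, here is a minimal sketch of how EM and token-level F1 are commonly computed for extractive question answering. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); the function names `normalize`, `exact_match`, and `f1_score` are illustrative and not taken from any particular evaluation script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, a common convention is to score the prediction against each reference and keep the maximum, then average over all questions and report the result as a percentage.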


Papers


Showing 6301–6325 of 10817 papers

Title	Date	Tasks	Status
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering	Apr 9, 2024	EgoSchema, Multiple-choice	Unverified
How Susceptible are LLMs to Influence in Prompts?	Aug 17, 2024	Multiple-choice, Question Answering	Unverified
How State-Of-The-Art Models Can Deal With Long-Form Question Answering	Oct 1, 2020	Form, Long Form Question Answering	Unverified
Continual Domain Adaptation for Machine Reading Comprehension	Aug 25, 2020	Continual Learning, Domain Adaptation	Unverified
How Stable is Knowledge Base Knowledge?	Nov 2, 2022	Question Answering	Unverified
How Self-Attention Improves Rare Class Performance in a Question-Answering Dialogue Agent	Jul 1, 2020	Language Modeling, Language Modelling	Unverified
EfficientQA : a RoBERTa Based Phrase-Indexed Question-Answering System	Jan 6, 2021	Extractive Question-Answering, GPU	Unverified
A Multithreaded Conversational Interface for Pedestrian Navigation and Question Answering	Aug 1, 2013	Question Answering, Spoken Dialogue Systems	Unverified
Accounting for Sycophancy in Language Model Uncertainty Estimation	Oct 17, 2024	Language Modeling, Language Modelling	Unverified
Morphological Analysis for Unsegmented Languages using Recurrent Neural Network Language Model	Sep 1, 2015	Language Modeling, Language Modelling	Unverified
Contingency and Comparison Relation Labeling and Structure Prediction in Chinese Sentences	Jul 1, 2012	Opinion Mining, Question Answering	Unverified
Mining Compatible/Incompatible Entities from Question and Answering via Yes/No Answer Classification using Distant Label Expansion	Dec 14, 2016	Community Question Answering, General Classification	Unverified
How Relevant is Selective Memory Population in Lifelong Language Learning?	Oct 3, 2022	Lifelong learning, Question Answering	Unverified
Mining Duplicate Questions of Stack Overflow	Oct 4, 2022	Community Question Answering, Question Answering	Unverified
Mining Fine-grained Opinion Expressions with Shallow Parsing	Sep 1, 2013	Fine-Grained Opinion Analysis, Opinion Mining	Unverified
Mining Implicit Relevance Feedback from User Behavior for Web Question Answering	Jun 13, 2020	Passage Ranking, Question Answering	Unverified
Mining Information from Event Structure Relation Graph for Event Argument Extraction	Jan 16, 2022	Event Argument Extraction, Event Extraction	Unverified
Mining Interpretable AOG Representations from Convolutional Networks via Active Question Answering	Dec 18, 2018	Object, Question Answering	Unverified
A Multi-Task Role-Playing Agent Capable of Imitating Character Linguistic Styles	Nov 4, 2024	Question Answering, Story Generation	Unverified
How Privacy-Savvy Are Large Language Models? A Case Study on Compliance and Privacy Technical Review	Sep 4, 2024	Question Answering, Text Generation	Unverified
A survey on phrase structure learning methods for text classification	Jun 21, 2014	Classification, General Classification	Unverified
Mining Paraphrasal Typed Templates from a Plain Text Corpus	Aug 1, 2016	Question Answering, Text Generation	Unverified
Challenges in Explanation Quality Evaluation	Oct 13, 2022	Question Answering	Unverified
Mining Shape of Expertise: A Novel Approach Based on Convolutional Neural Network	Apr 5, 2020	Community Question Answering, Question Answering	Unverified
How much should you ask? On the question structure in QA systems	Sep 11, 2018	Question Answering, valid	Unverified


Datasets: SQuAD2.0, SQuAD1.1, HotpotQA, PIQA, BoolQ, COPA, TriviaQA, SQuAD1.1 dev, Natural Questions, OpenBookQA, TruthfulQA, MultiRC

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified