Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
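As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single predicted answer against a single reference answer. It assumes a normalization step similar to the one commonly used in SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names `normalize`, `exact_match`, and `f1_score` are illustrative and not taken from any particular library, and real benchmarks usually take the maximum score over several reference answers.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Assumption: two empty answers count as a match, one empty answer as a miss.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM rewards only answers that match the reference after normalization, while F1 gives partial credit for overlapping tokens, which is why leaderboards often report both.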


Papers


Showing 4076–4100 of 10817 papers

Title	Date	Tasks	Status
Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance	Jun 26, 2020	Decision Making, Question Answering	Unverified
Does the "most sinfully decadent cake ever" taste good? Answering Yes/No Questions from Figurative Contexts	Sep 24, 2023	Question Answering	Unverified
Frustratingly Easy Model Ensemble for Abstractive Summarization	Oct 1, 2018	Abstractive Text Summarization, Density Estimation	Unverified
Better Early than Late: Fusing Topics with Word Embeddings for Neural Question Paraphrase Identification	Jul 22, 2020	Community Question Answering, Paraphrase Identification	Unverified
Frustratingly Hard Evidence Retrieval for QA Over Books	Jul 20, 2020	Question Answering, Retrieval	Unverified
Does the Generator Mind its Contexts? An Analysis of Generative Model Faithfulness under Context Transfer	Feb 22, 2024	Generative Question Answering, Hallucination	Unverified
Does Synthetic Data Generation of LLMs Help Clinical Text Mining?	Mar 8, 2023	Code Generation, named-entity-recognition	Unverified
Better Distractions: Transformer-based Distractor Generation and Multiple Choice Question Filtering	Oct 19, 2020	Distractor Generation, Language Modeling	Unverified
Full-Time Supervision based Bidirectional RNN for Factoid Question Answering	Jun 19, 2016	Question Answering	Unverified
CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising	Dec 14, 2021	Cross-Modal Retrieval, Decoder	Unverified
Asking Too Much? The Rhetorical Role of Questions in Political Discourse	Aug 7, 2017	Question Answering	Unverified
Annotate and Identify Modalities, Speech Acts and Finer-Grained Event Types in Chinese Text	Aug 1, 2014	Machine Translation, Question Answering	Unverified
Functorial Language Games for Question Answering	May 19, 2020	Question Answering	Unverified
Does Similarity Matter? The Case of Answer Extraction from Technical Discussion Forums	Dec 1, 2012	Question Answering, Sentence Classification	Unverified
FuRongWang at SemEval-2017 Task 3: Deep Neural Networks for Selecting Relevant Answers in Community Question Answering	Aug 1, 2017	Answer Selection, Community Question Answering	Unverified
Furthest Reasoning with Plan Assessment: Stable Reasoning Path with Retrieval-Augmented Large Language Models	Sep 22, 2023	Multi-hop Question Answering, Question Answering	Unverified
Fuse and Adapt: Investigating the Use of Pre-Trained Self-Supervising Learning Models in Limited Data NLU problems	Dec 2, 2022	Domain Adaptation, Emotion Recognition	Unverified
Best Response Shaping	Apr 5, 2024	Deep Reinforcement Learning, Question Answering	Unverified
Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations	Aug 30, 2023	Explanation Generation, Question Answering	Unverified
Does QA-based intermediate training help fine-tuning language models for text classification?	Dec 30, 2021	Classification, Question Answering	Unverified
Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models?	Jun 20, 2024	Caption Generation, Hallucination	Unverified
BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering	Dec 13, 2023	Medical Visual Question Answering, Question Answering	Unverified
A Comprehensive Survey on Relation Extraction: Recent Advances and New Frontiers	Jun 3, 2023	Information Retrieval, Knowledge Graph Completion	Unverified
HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations	Sep 28, 2024	Dataset Generation, Informativeness	Unverified
Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think!	Oct 13, 2020	Diagnostic, Image-text Classification	Unverified



Datasets: SQuAD2.0, SQuAD1.1, HotpotQA, PIQA, BoolQ, COPA, TriviaQA, SQuAD1.1 dev, Natural Questions, OpenBookQA, TruthfulQA, MultiRC

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified