SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
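
As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single predicted answer against a single reference. It assumes the normalization commonly used for SQuAD-style extractive QA (lowercasing, stripping punctuation and English articles, collapsing whitespace); other benchmarks may normalize differently. The function names `normalize_answer`, `exact_match`, and `f1_score` are illustrative and not taken from any particular library.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; other datasets may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, EM rewards only answers that match after normalization, while F1 gives partial credit for overlapping tokens. When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions.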


Papers

Showing 7801-7825 of 10817 papers

Title | Status | Hype
English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too | | 0
History-Aware Question Answering in a Blocks World Dialogue System | | 0
Generating Semantically Valid Adversarial Questions for TableQA | | 0
An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering | | 0
A Complex KBQA System using Multiple Reasoning Paths | | 0
Comparative Study of Machine Learning Models and BERT on SQuAD | Code | 0
Functorial Language Games for Question Answering | | 0
On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law | | 0
Towards Question Format Independent Numerical Reasoning: A Set of Prerequisite Tasks | | 0
Support-BERT: Predicting Quality of Question-Answer Pairs in MSDN using Deep Bidirectional Transformer | | 0
CS-NLP team at SemEval-2020 Task 4: Evaluation of State-of-the-art NLP Deep Learning Architectures on Commonsense Reasoning Task | | 0
Context-Based Quotation Recommendation | | 0
Visual Relationship Detection using Scene Graphs: A Survey | | 0
An Evaluation of Recent Neural Sequence Tagging Models in Turkish Named Entity Recognition | | 0
Do not let the history haunt you -- Mitigating Compounding Errors in Conversational Question Answering | | 0
Maximizing Information Gain in Partially Observable Environments via Prediction Reward | | 0
How Context Affects Language Models' Factual Predictions | | 0
Character Matters: Video Story Understanding with Character-Aware Relations | | 0
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA | Code | 0
Where is Linked Data in Question Answering over Linked Data? | | 0
CounQER: A System for Discovering and Linking Count Information in Knowledge Bases | Code | 0
A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM | | 0
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering | Code | 0
DoQA -- Accessing Domain-Specific FAQs via Conversational QA | | 0
Visual Question Answering with Prior Class Semantics | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified