SOTAVerified

Question Answering

Question answering can be divided into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
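As a rough illustration of the EM and F1 metrics mentioned above, the following is a minimal sketch of how they are commonly computed for extractive QA. The normalization rules (lowercasing, stripping punctuation and English articles) follow the convention popularized by SQuAD-style evaluation and are an assumption here; individual benchmarks may define them differently. The function names are illustrative only.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; other benchmarks may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM rewards only answers that match the reference after normalization, while F1 gives partial credit for token overlap. When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions.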


Papers

Showing 2176–2200 of 10817 papers

| Title | Status | Hype |
| --- | --- | --- |
| Neural Attentive Bag-of-Entities Model for Text Classification | Code | 1 |
| Interactive Language Learning by Question Answering | Code | 1 |
| VL-BERT: Pre-training of Generic Visual-Linguistic Representations | Code | 1 |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | Code | 1 |
| VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering | Code | 1 |
| ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks | Code | 1 |
| Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering | Code | 1 |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | Code | 1 |
| WinoGrande: An Adversarial Winograd Schema Challenge at Scale | Code | 1 |
| ELI5: Long Form Question Answering | Code | 1 |
| XQA: A Cross-lingual Open-domain Question Answering Dataset | Code | 1 |
| XLNet: Generalized Autoregressive Pretraining for Language Understanding | Code | 1 |
| Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA | Code | 1 |
| Interconnected Question Generation with Coreference Alignment and Conversation Flow Modeling | Code | 1 |
| Latent Retrieval for Weakly Supervised Open Domain Question Answering | Code | 1 |
| Scene Text Visual Question Answering | Code | 1 |
| OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge | Code | 1 |
| BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions | Code | 1 |
| Dynamically Fused Graph Network for Multi-hop Reasoning | Code | 1 |
| Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation | Code | 1 |
| Large Batch Optimization for Deep Learning: Training BERT in 76 minutes | Code | 1 |
| Analyzing Knowledge Graph Embedding Methods from a Multi-Embedding Interaction Perspective | Code | 1 |
| Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases | Code | 1 |
| Lattice CNNs for Matching Based Chinese Question Answering | Code | 1 |
| GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering | Code | 1 |

Benchmark Results

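| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |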