SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
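
As a rough illustration of the two metrics named above, here is a minimal sketch of exact match (EM) and token-level F1 for a single prediction. It assumes the answer normalization commonly used for SQuAD-style evaluation (lowercasing, dropping punctuation and English articles, collapsing whitespace). The function names `normalize`, `exact_match`, and `f1_score` are illustrative choices, not the API of any benchmark's official scorer, and other datasets may normalize differently.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; other benchmarks may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(f1_score("in Paris, France", "Paris"))
```

Per-example scores like these are usually averaged over the evaluation set and reported as percentages. When a question has several reference answers, a common convention is to take the maximum score over the references.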

Papers

Showing 1926-1950 of 10817 papers

Title | Status | Hype
H-Mem: Harnessing synaptic plasticity with Hebbian Memory Networks | Code | 1
Just Ask: Learning to Answer Questions from Millions of Narrated Videos | Code | 1
Point and Ask: Incorporating Pointing into Visual Question Answering | Code | 1
Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction | Code | 1
XTQA: Span-Level Explanations of the Textbook Question Answering | Code | 1
Large Scale Multimodal Classification Using an Ensemble of Transformer Models and Co-Attention | Code | 1
LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering | Code | 1
EasyTransfer -- A Simple and Scalable Deep Transfer Learning Platform for NLP Applications | Code | 1
Learning Associative Inference Using Fast Weight Memory | Code | 1
Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases | Code | 1
NLPGym -- A toolkit for evaluating RL agents on Natural Language Processing Tasks | Code | 1
Utilizing Bidirectional Encoder Representations from Transformers for Answer Selection | Code | 1
VisBERT: Hidden-State Visualizations for Transformers | Code | 1
Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering | Code | 1
Disentangling 3D Prototypical Networks For Few-Shot Concept Learning | Code | 1
Context-Aware Answer Extraction in Question Answering | Code | 1
EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering | Code | 1
CharBERT: Character-aware Pre-trained Language Model | Code | 1
Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps | Code | 1
Learning to Contrast the Counterfactual Samples for Robust Visual Question Answering | Code | 1
The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification | Code | 1
ConceptBert: Concept-Aware Representation for Visual Question Answering | Code | 1
Question Answering with Long Multiple-Span Answers | Code | 1
CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering | Code | 1
Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 |  | Unverified
2 | FPNet (ensemble) | EM | 90.87 |  | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 |  | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 |  | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 |  | Unverified
6 | FPNet (ensemble) | EM | 90.6 |  | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 |  | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 |  | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 |  | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 |  | Unverified