SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
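As a rough illustration of the EM and F1 metrics mentioned above, the following is a minimal sketch for a single prediction and a single reference answer. It assumes SQuAD-style answer normalization (lowercasing, removing punctuation and English articles, collapsing whitespace); the function names `normalize_answer`, `exact_match`, and `f1_score` are illustrative and are not taken from this page or from any particular benchmark's official scorer.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly used for
    SQuAD-style evaluation; other benchmarks may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(f1_score("Eiffel Tower in Paris", "the Eiffel Tower"))
```

When a question has several reference answers, evaluation scripts typically take the maximum score over the references and then average over all questions; that aggregation step is left out of this sketch.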


Papers

Showing 2151-2200 of 10817 papers

Title | Status | Hype
Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus | Code | 1
Retrospective Reader for Machine Reading Comprehension | Code | 1
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs | Code | 1
Schema2QA: High-Quality and Low-Cost Q&A Agents for the Structured Web | Code | 1
Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features | Code | 1
In Defense of Grid Features for Visual Question Answering | Code | 1
Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks | Code | 1
T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack | Code | 1
Differentiable Reasoning on Large Knowledge Bases and Natural Language | Code | 1
PIQA: Reasoning about Physical Commonsense in Natural Language | Code | 1
Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering | Code | 1
Inductive Relation Prediction by Subgraph Reasoning | Code | 1
Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering | Code | 1
Multi-domain Dialogue State Tracking as Dynamic Knowledge Graph Enhanced Question Answering | Code | 1
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering | Code | 1
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | Code | 1
MLQA: Evaluating Cross-lingual Extractive Question Answering | Code | 1
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Code | 1
Overcoming Data Limitation in Medical Visual Question Answering | Code | 1
Reducing Transformer Depth on Demand with Structured Dropout | Code | 1
UNITER: UNiversal Image-TExt Representation Learning | Code | 1
Exploring Scholarly Data by Semantic Query on Knowledge Graph Embedding Space | Code | 1
PubMedQA: A Dataset for Biomedical Research Question Answering | Code | 1
How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations | Code | 1
Don't Take the Easy Way Out: Ensemble Based Methods for Avoiding Known Dataset Biases | Code | 1
Neural Attentive Bag-of-Entities Model for Text Classification | Code | 1
Interactive Language Learning by Question Answering | Code | 1
VL-BERT: Pre-training of Generic Visual-Linguistic Representations | Code | 1
LXMERT: Learning Cross-Modality Encoder Representations from Transformers | Code | 1
VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering | Code | 1
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks | Code | 1
Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering | Code | 1
RoBERTa: A Robustly Optimized BERT Pretraining Approach | Code | 1
WinoGrande: An Adversarial Winograd Schema Challenge at Scale | Code | 1
ELI5: Long Form Question Answering | Code | 1
XQA: A Cross-lingual Open-domain Question Answering Dataset | Code | 1
XLNet: Generalized Autoregressive Pretraining for Language Understanding | Code | 1
Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA | Code | 1
Interconnected Question Generation with Coreference Alignment and Conversation Flow Modeling | Code | 1
Latent Retrieval for Weakly Supervised Open Domain Question Answering | Code | 1
Scene Text Visual Question Answering | Code | 1
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge | Code | 1
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions | Code | 1
Dynamically Fused Graph Network for Multi-hop Reasoning | Code | 1
Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation | Code | 1
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes | Code | 1
Analyzing Knowledge Graph Embedding Methods from a Multi-Embedding Interaction Perspective | Code | 1
Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases | Code | 1
Lattice CNNs for Matching Based Chinese Question Answering | Code | 1
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified