SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
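As an illustration of the two metrics named above, here is a minimal sketch of EM and token-level F1 for a single prediction and a single reference answer. It assumes the answer normalization commonly used in SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names `normalize`, `exact_match`, and `f1_score` are illustrative and not taken from any particular benchmark's official script.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Assumption: if either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("in Paris France", "Paris"))
```

When a question has several reference answers, benchmarks of this kind typically take the best score over the references and then average over all questions; that aggregation step is left out of the sketch.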


Papers

Showing 1951–2000 of 10817 papers

Title | Status | Hype
Less is More: Data-Efficient Complex Question Answering over Knowledge Bases | Code | 1
Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning | Code | 1
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | Code | 1
MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering | Code | 1
Measuring Association Between Labels and Free-Text Rationales | Code | 1
Learning Contextualized Knowledge Structures for Commonsense Reasoning | Code | 1
RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering | Code | 1
AQuaMuSe: Automatically Generating Datasets for Query-Based Multi-Document Summarization | Code | 1
Answering Open-Domain Questions of Varying Reasoning Steps from Text | Code | 1
Unsupervised Multi-hop Question Answering by Question Generation | Code | 1
mT5: A massively multilingual pre-trained text-to-text transformer | Code | 1
XOR QA: Cross-lingual Open-Retrieval Question Answering | Code | 1
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies | Code | 1
RECONSIDER: Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering | Code | 1
Exploring Sequence-to-Sequence Models for SPARQL Pattern Composition | Code | 1
Bayesian Attention Modules | Code | 1
Open Question Answering over Tables and Text | Code | 1
Knowledge Graph-based Question Answering with Electronic Health Records | Code | 1
The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification | Code | 1
Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning | Code | 1
Delaying Interaction Layers in Transformer-based Encoders for Efficient Open Domain Question Answering | Code | 1
Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs | Code | 1
Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search | Code | 1
CoRel: Seed-Guided Topical Taxonomy Construction by Concept Learning and Relation Transferring | Code | 1
Contrast and Classify: Training Robust VQA Models | Code | 1
Cross-Modal BERT for Text-Audio Sentiment Analysis | Code | 1
Counterfactual Variable Control for Robust and Interpretable Question Answering | Code | 1
Open-Domain Question Answering Goes Conversational via Question Rewriting | Code | 1
AutoQA: From Databases To QA Semantic Parsers With Only Synthetic Training Data | Code | 1
Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition | Code | 1
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics | Code | 1
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data | Code | 1
Learning to Explain: Datasets and Models for Identifying Valid Reasoning Chains in Multihop Question-Answering | Code | 1
SRLGRN: Semantic Role Labeling Graph Reasoning Network | Code | 1
Cross-Thought for Sentence Encoder Pre-training | Code | 1
Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start | Code | 1
PolicyQA: A Reading Comprehension Dataset for Privacy Policies | Code | 1
UnQovering Stereotyping Biases via Underspecified Questions | Code | 1
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation | Code | 1
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective | Code | 1
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | Code | 1
Autoregressive Entity Retrieval | Code | 1
MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale | Code | 1
Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary | Code | 1
Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking | Code | 1
Sequence-to-Sequence Learning for Indonesian Automatic Question Generator | Code | 1
SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval | Code | 1
Multi-Relational Embedding for Knowledge Graph Representation and Analysis | Code | 1
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval | Code | 1
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers | Code | 1
Page 40 of 217

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified