SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Recent top-performing models include T5 and XLNet.
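As a rough illustration of the EM and F1 metrics mentioned above, the sketch below scores one predicted answer string against one reference answer. It assumes the normalization commonly used in SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names are illustrative and not taken from any particular benchmark's official scorer.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly used for
    SQuAD-style scoring; other benchmarks may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions, typically reported as a percentage.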


Papers

Showing 1976-2000 of 10817 papers

Title | Status | Hype
Cross-Modal BERT for Text-Audio Sentiment Analysis | Code | 1
Counterfactual Variable Control for Robust and Interpretable Question Answering | Code | 1
Open-Domain Question Answering Goes Conversational via Question Rewriting | Code | 1
AutoQA: From Databases To QA Semantic Parsers With Only Synthetic Training Data | Code | 1
Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition | Code | 1
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics | Code | 1
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data | Code | 1
Learning to Explain: Datasets and Models for Identifying Valid Reasoning Chains in Multihop Question-Answering | Code | 1
SRLGRN: Semantic Role Labeling Graph Reasoning Network | Code | 1
Cross-Thought for Sentence Encoder Pre-training | Code | 1
Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start | Code | 1
PolicyQA: A Reading Comprehension Dataset for Privacy Policies | Code | 1
UnQovering Stereotyping Biases via Underspecified Questions | Code | 1
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation | Code | 1
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective | Code | 1
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | Code | 1
Autoregressive Entity Retrieval | Code | 1
MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale | Code | 1
Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary | Code | 1
Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking | Code | 1
Sequence-to-Sequence Learning for Indonesian Automatic Question Generator | Code | 1
SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval | Code | 1
Multi-Relational Embedding for Knowledge Graph Representation and Analysis | Code | 1
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval | Code | 1
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified