SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
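As a rough illustration of the two metrics named above, the following Python sketch computes EM and token-level F1 for a single prediction against a single reference answer. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); other benchmarks may normalize differently, and the function names here are illustrative rather than taken from any particular evaluation script.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; not every benchmark uses it.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several acceptable reference answers, a common convention is to take the maximum score over the references and then average over all questions; leaderboard figures are usually these averages expressed as percentages.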


Papers

Showing 6501–6525 of 10817 papers

Title | Status | Hype
Multi-Scale Progressive Attention Network for Video Question Answering | - | 0
DESCGEN: A Distantly Supervised Dataset for Generating Entity Descriptions | Code | 0
xMoCo: Cross Momentum Contrastive Learning for Open-Domain Question Answering | - | 0
In Factuality: Efficient Integration of Relevant Facts for Visual Question Answering | - | 0
Recognizing Multimodal Entailment | - | 0
COSY: COunterfactual SYntax for Cross-Lingual Understanding | Code | 0
Addressing Semantic Drift in Generative Question Answering with Auxiliary Extraction | - | 0
Continuous Language Generative Flow | Code | 1
BiQuAD: Towards QA based on deeper text understanding | - | 0
LIORI at SemEval-2021 Task 2: Span Prediction and Binary Classification approaches to Word-in-Context Disambiguation | - | 0
UoR at SemEval-2021 Task 4: Using Pre-trained BERT Token Embeddings for Question Answering of Abstract Meaning | - | 0
Attention-based Aspect Reasoning for Knowledge Base Question Answering on Clinical Notes | - | 0
Automatic Claim Review for Climate Science via Explanation Generation | - | 0
An Online Question Answering System based on Sub-graph Searching | - | 0
QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension | - | 0
Greedy Gradient Ensemble for Robust Visual Question Answering | Code | 1
One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval | Code | 1
Thought Flow Nets: From Single Predictions to Trains of Model Thought | - | 0
Hybrid Autoregressive Inference for Scalable Multi-hop Explanation Regeneration | Code | 0
X-GGM: Graph Generative Modeling for Out-of-Distribution Generalization in Visual Question Answering | Code | 0
The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding | - | 0
CogME: A Cognition-Inspired Multi-Dimensional Evaluation Metric for Story Understanding | - | 0
Separating Skills and Concepts for Novel Visual Question Answering | Code | 1
Bridging the Gap between Language Model and Reading Comprehension: Unsupervised MRC via Self-Supervision | - | 0
A Discriminative Semantic Ranker for Question Retrieval | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | - | Unverified
2 | FPNet (ensemble) | EM | 90.87 | - | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | - | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | - | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | - | Unverified
6 | FPNet (ensemble) | EM | 90.6 | - | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | - | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | - | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | - | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | - | Unverified