SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
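As a rough illustration of the two metrics named above, the following minimal sketch computes EM and token-level F1 for a single predicted answer against a single reference. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names `normalize_answer`, `exact_match`, and `f1_score` are chosen here for illustration and are not taken from this page.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Assumption: two empty answers count as a match, one empty answer does not.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(f1_score("in Paris, France", "Paris"))
```

Leaderboards usually report these scores averaged over all questions and scaled to 0-100; when a question has several reference answers, a common convention is to take the maximum score over the references.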


Papers

Showing 7651-7675 of 10817 papers

Title | Status | Hype
A French Corpus for Semantic Similarity | | 0
WorldTree V2: A Corpus of Science-Domain Structured Explanations and Inference Patterns supporting Multi-Hop Inference | | 0
Automatic Spanish Translation of SQuAD Dataset for Multi-lingual Question Answering | | 0
An Empirical Comparison of Question Classification Methods for Question Answering Systems | | 0
Assessing Users' Reputation from Syntactic and Semantic Information in Community Question Answering | | 0
Generating Responses that Reflect Meta Information in User-Generated Question Answer Pairs | | 0
The Margarita Dialogue Corpus: A Data Set for Time-Offset Interactions and Unstructured Dialogue Systems | | 0
MTSI-BERT: A Session-aware Knowledge-based Conversational Agent | Code | 1
Do not let the history haunt you: Mitigating Compounding Errors in Conversational Question Answering | | 0
LifeQA: A Real-life Dataset for Video Question Answering | Code | 1
TED-Q: TED Talks and the Questions they Evoke | | 0
Automated Discovery of Mathematical Definitions in Text | | 0
SiBert: Enhanced Chinese Pre-trained Language Model with Sentence Insertion | Code | 1
Image Position Prediction in Multimodal Documents | | 0
Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque | | 0
FRAQUE: a FRAme-based QUEstion-answering system for the Public Administration domain | | 0
"A Passage to India": Pre-trained Word Embeddings for Indian Languages | | 0
Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset | Code | 1
KLEJ: Comprehensive Benchmark for Polish Language Understanding | Code | 1
Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering | Code | 1
Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering | | 0
TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions | | 0
Visuo-Linguistic Question Answering (VLQA) Challenge | Code | 0
KPQA: A Metric for Generative Question Answering Using Keyphrase Weights | Code | 1
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training | Code | 1
Page 307 of 433

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified