SOTAVerified

Question Answering

Question answering can be divided into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
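As a rough illustration of the two metrics named above, the sketch below computes EM and token-level F1 for a single predicted answer against a single reference. It assumes the answer normalization commonly used in SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names are illustrative, not taken from any particular benchmark's tooling, and other datasets may normalize differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly applied in
    SQuAD-style evaluation; other benchmarks may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower."))
    print(f1_score("in Paris France", "Paris"))
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions; leaderboard figures are usually reported as percentages.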


Papers

Showing 7826-7850 of 10817 papers

| Title | Status | Hype |
| --- | --- | --- |
| ForecastQA: A Question Answering Challenge for Event Forecasting with Temporal Text Data | | 0 |
| Is Multihop QA in DiRe Condition? Measuring and Reducing Disconnected Reasoning | Code | 0 |
| AVA: an Automatic eValuation Approach to Question Answering Systems | | 0 |
| FRAQUE: a FRAme-based QUEstion-answering system for the Public Administration domain | | 0 |
| Chat or Learn: a Data-Driven Robust Question-Answering System | | 0 |
| The Margarita Dialogue Corpus: A Data Set for Time-Offset Interactions and Unstructured Dialogue Systems | | 0 |
| Evaluation of Dataset Selection for Pre-Training and Fine-Tuning Transformer Language Models for Clinical Question Answering | | 0 |
| AIA-BDE: A Corpus of FAQs in Portuguese and their Variations | | 0 |
| NeurQuRI: Neural Question Requirement Inspector for Answerability Prediction in Machine Reading Comprehension | | 0 |
| Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension | | 0 |
| Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering | | 0 |
| "A Passage to India": Pre-trained Word Embeddings for Indian Languages | | 0 |
| ScholarlyRead: A New Dataset for Scientific Article Reading Comprehension | | 0 |
| A Corpus for Visual Question Answering Annotated with Frame Semantic Information | | 0 |
| TED-Q: TED Talks and the Questions they Evoke | | 0 |
| A French Corpus for Semantic Similarity | | 0 |
| Do not let the history haunt you: Mitigating Compounding Errors in Conversational Question Answering | | 0 |
| WorldTree V2: A Corpus of Science-Domain Structured Explanations and Inference Patterns supporting Multi-Hop Inference | | 0 |
| Visuo-Linguistic Question Answering (VLQA) Challenge | Code | 0 |
| TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions | | 0 |
| Automatic Spanish Translation of SQuAD Dataset for Multi-lingual Question Answering | | 0 |
| An Empirical Comparison of Question Classification Methods for Question Answering Systems | | 0 |
| Automated Discovery of Mathematical Definitions in Text | | 0 |
| Image Position Prediction in Multimodal Documents | | 0 |
| Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque | | 0 |

Benchmark Results

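| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |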