SOTAVerified

Question Answering

Question answering can be divided into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
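As a rough illustration of the two metrics named above, here is a minimal sketch of EM and token-level F1 for a single prediction and reference answer. It assumes the normalization commonly used for SQuAD-style evaluation (lowercasing, removing punctuation and English articles, collapsing whitespace); other benchmarks may normalize differently. The function names are illustrative and are not taken from any official evaluation script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse whitespace.

    Assumption: this mirrors SQuAD-style normalization; adjust for other datasets.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("in Paris France", "Paris"))
```

When a question has several reference answers, a common convention is to take the maximum score over the references, then average over all questions and report the result as a percentage.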


Papers

Showing 7801-7850 of 10817 papers

Title | Status | Hype
English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too | | 0
History-Aware Question Answering in a Blocks World Dialogue System | | 0
Generating Semantically Valid Adversarial Questions for TableQA | | 0
An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering | | 0
A Complex KBQA System using Multiple Reasoning Paths | | 0
Comparative Study of Machine Learning Models and BERT on SQuAD | Code | 0
Functorial Language Games for Question Answering | | 0
On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law | | 0
Towards Question Format Independent Numerical Reasoning: A Set of Prerequisite Tasks | | 0
Support-BERT: Predicting Quality of Question-Answer Pairs in MSDN using Deep Bidirectional Transformer | | 0
CS-NLP team at SemEval-2020 Task 4: Evaluation of State-of-the-art NLP Deep Learning Architectures on Commonsense Reasoning Task | | 0
Context-Based Quotation Recommendation | | 0
Visual Relationship Detection using Scene Graphs: A Survey | | 0
An Evaluation of Recent Neural Sequence Tagging Models in Turkish Named Entity Recognition | | 0
Do not let the history haunt you -- Mitigating Compounding Errors in Conversational Question Answering | | 0
Maximizing Information Gain in Partially Observable Environments via Prediction Reward | | 0
How Context Affects Language Models' Factual Predictions | | 0
Character Matters: Video Story Understanding with Character-Aware Relations | | 0
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA | Code | 0
Where is Linked Data in Question Answering over Linked Data? | | 0
CounQER: A System for Discovering and Linking Count Information in Knowledge Bases | Code | 0
A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM | | 0
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering | Code | 0
Visual Question Answering with Prior Class Semantics | | 0
DoQA -- Accessing Domain-Specific FAQs via Conversational QA | | 0
ForecastQA: A Question Answering Challenge for Event Forecasting with Temporal Text Data | | 0
Is Multihop QA in DiRe Condition? Measuring and Reducing Disconnected Reasoning | Code | 0
AVA: an Automatic eValuation Approach to Question Answering Systems | | 0
FRAQUE: a FRAme-based QUEstion-answering system for the Public Administration domain | | 0
Chat or Learn: a Data-Driven Robust Question-Answering System | | 0
The Margarita Dialogue Corpus: A Data Set for Time-Offset Interactions and Unstructured Dialogue Systems | | 0
Evaluation of Dataset Selection for Pre-Training and Fine-Tuning Transformer Language Models for Clinical Question Answering | | 0
AIA-BDE: A Corpus of FAQs in Portuguese and their Variations | | 0
NeurQuRI: Neural Question Requirement Inspector for Answerability Prediction in Machine Reading Comprehension | | 0
Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension | | 0
Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering | | 0
"A Passage to India": Pre-trained Word Embeddings for Indian Languages | | 0
ScholarlyRead: A New Dataset for Scientific Article Reading Comprehension | | 0
A Corpus for Visual Question Answering Annotated with Frame Semantic Information | | 0
TED-Q: TED Talks and the Questions they Evoke | | 0
A French Corpus for Semantic Similarity | | 0
Do not let the history haunt you: Mitigating Compounding Errors in Conversational Question Answering | | 0
WorldTree V2: A Corpus of Science-Domain Structured Explanations and Inference Patterns supporting Multi-Hop Inference | | 0
Visuo-Linguistic Question Answering (VLQA) Challenge | Code | 0
TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions | | 0
Automatic Spanish Translation of SQuAD Dataset for Multi-lingual Question Answering | | 0
An Empirical Comparison of Question Classification Methods for Question Answering Systems | | 0
Automated Discovery of Mathematical Definitions in Text | | 0
Image Position Prediction in Multimodal Documents | | 0
Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque | | 0
Page 157 of 217

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified