SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
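As a rough illustration of the two metrics named above, the sketch below computes EM and token-level F1 for a single prediction against a single reference answer. It assumes the commonly used SQuAD-style normalization (lowercasing, stripping punctuation and the articles "a", "an", "the", collapsing whitespace); the function names are illustrative and not taken from any particular benchmark's tooling, and individual datasets may normalize or aggregate differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; other benchmarks may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several acceptable reference answers, a common convention is to take the maximum score over the references and then average over all questions; that aggregation step is left out of this sketch.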


Papers

Showing 6751-6775 of 10817 papers

| Title | Status | Hype |
|---|---|---|
| Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction | | 0 |
| Sequence-to-Sequence Knowledge Graph Completion and Question Answering | | 0 |
| Cross-Task Generalization via Natural Language Crowdsourcing Instructions | | 0 |
| CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions | | 0 |
| CQARE: Contrastive Question-Answering for Few-shot Relation Extraction with Prompt Tuning | | 0 |
| Co-VQA : Answering by Interactive Sub Question Sequence | | 0 |
| QA4PRF: A Question Answering based Framework for Pseudo Relevance Feedback | | 0 |
| Hyperlink-induced Pre-training for Passage Retrieval of Open-domain Question Answering | | 0 |
| Using Interactive Feedback to Improve the Accuracy and Explainability of Question Answering Systems Post-Deployment | | 0 |
| Context-Paraphrase Enhanced Commonsense Question Answering | | 0 |
| Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data | | 0 |
| Probing Difficulty and Discrimination of Natural Language Questions With Item Response Theory | | 0 |
| XLTime: A Cross-Lingual Knowledge Transfer Framework for Zero-Shot Low-Resource Language Temporal Expression Extraction | | 0 |
| Get Your Model Puzzled: Introducing Crossword-Solving as a New NLP Benchmark | | 0 |
| GenRE: A Generative Model for Relation Extraction | | 0 |
| Ask Me Anything in Your Native Language | | 0 |
| Calculating Question Similarity is Enough: A New Method for KBQA Tasks | | 0 |
| Incorporating Question Answering-Based Signals into Abstractive Summarization via Salient Span Selection | | 0 |
| Question Answering for Complex Electronic Health Records Database using Unified Encoder-Decoder Architecture | | 0 |
| A Chinese Multi-type Complex Questions Answering Dataset over Wikidata | | 0 |
| Graph Relation Transformer: Incorporating pairwise object features into the Transformer architecture | | 0 |
| Prune Once for All: Sparse Pre-Trained Language Models | | 0 |
| Pre-trained Transformer-Based Approach for Arabic Question Answering : A Comparative Study | | 0 |
| Cross-lingual Adaption Model-Agnostic Meta-Learning for Natural Language Understanding | | 0 |
| A Two-Stage Approach towards Generalization in Knowledge Base Question Answering | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |