SOTAVerified

Question Answering

Question answering can be divided into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Top-performing models have included T5 and XLNet.
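As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single predicted answer against a single reference answer. It assumes the answer normalization commonly used in SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names `normalize`, `exact_match`, and `f1_score` are illustrative and not taken from any particular benchmark's official script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Assumption: if either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("Eiffel Tower in Paris", "the Eiffel Tower"))
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions. Leaderboard numbers such as the EM values reported further down are typically that average expressed as a percentage.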


Papers

Showing 2676–2700 of 10817 papers

| Title | Status | Hype |
| --- | --- | --- |
| Can SAR improve RSVQA performance? | | 0 |
| Efficient crowdsourcing of crowd-generated microtasks | | 0 |
| A Procedural Definition of Multi-word Lexical Units | | 0 |
| Can Question Generation Debias Question Answering Models? A Case Study on Question–Context Lexical Overlap | | 0 |
| AiFu at SemEval-2019 Task 10: A Symbolic and Sub-symbolic Integrated System for SAT Math Question Answering | | 0 |
| Efficient Deployment of Conversational Natural Language Interfaces over Databases | | 0 |
| Efficient Few-Shot Continual Learning in Vision-Language Models | | 0 |
| A Probabilistic Model for Joint Learning of Word Embeddings from Texts and Images | | 0 |
| Can Pre-training help VQA with Lexical Variations? | | 0 |
| AIDA: Artificial Intelligent Dialogue Agent | | 0 |
| Can predicate-argument relationships be extracted from UD trees? | | 0 |
| A Probabilistic-Logic based Commonsense Representation Framework for Modelling Inferences with Multiple Antecedents and Varying Likelihoods | | 0 |
| Evaluating the Ebb and Flow: An In-depth Analysis of Question-Answering Trends across Diverse Platforms | | 0 |
| Can Open Domain Question Answering Systems Answer Visual Knowledge Questions? | | 0 |
| A Probabilistic Lexical Model for Ranking Textual Inferences | | 0 |
| A Probabilistic Annotation Model for Crowdsourcing Coreference | | 0 |
| AIA-BDE: A Corpus of FAQs in Portuguese and their Variations | | 0 |
| Actively Seeking and Learning from Live Data | | 0 |
| Efficient Bilinear Attention-based Fusion for Medical Visual Question Answering | | 0 |
| Can Multimodal LLMs do Visual Temporal Understanding and Reasoning? The answer is No! | | 0 |
| A Pretraining Numerical Reasoning Model for Ordinal Constrained Question Answering on Knowledge Base | | 0 |
| Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis | | 0 |
| Can LLMs assist with Ambiguity? A Quantitative Evaluation of various Large Language Models on Word Sense Disambiguation | | 0 |
| A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? | | 0 |
| Can Large Language Models Unveil the Mysteries? An Exploration of Their Ability to Unlock Information in Complex Scenarios | | 0 |

Benchmark Results

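| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |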