SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated with metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
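As a rough illustration of the two metrics named above, here is a minimal Python sketch of exact match and token-level F1 for a single prediction and reference answer. It assumes the normalization commonly used for SQuAD-style evaluation (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names are illustrative, and individual benchmarks may normalize or aggregate differently.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions. That aggregation step is not shown here.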


Papers

Showing 6251–6275 of 10817 papers

Title | Status | Hype
How to Evaluate Opinionated Keyphrase Extraction? | | 0
Metaethical Perspectives on 'Benchmarking' AI Ethics | | 0
Continually Self-Improving Language Models for Bariatric Surgery Question--Answering | | 0
Metaheuristic Approaches to Lexical Substitution and Simplification | | 0
A Few-Shot Learning Focused Survey on Recent Named Entity Recognition and Relation Classification Methods | | 0
MetaICL: Learning to Learn In Context | | 0
EfficientEQA: An Efficient Approach for Open Vocabulary Embodied Question Answering | | 0
A corpus of general and specific sentences from news | | 0
Modeling Exemplification in Long-form Question Answering via Retrieval | | 0
Metamorphic Relation Based Adversarial Attacks on Differentiable Neural Computer | | 0
Efficient Global Learning of Entailment Graphs | | 0
Meta-prompting Optimized Retrieval-augmented Generation | | 0
Modeling Multi-hop Question Answering as Single Sequence Prediction | | 0
MetaQA: Combining Expert Agents for Multi-Skill Question Answering | | 0
MetaReflection: Learning Instructions for Language Agents using Past Reflections | | 0
MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification | | 0
A Survey on Recent Advances in Sequence Labeling from Deep Learning Models | | 0
How to Design Sample and Computationally Efficient VQA Models | | 0
Method of Tibetan Person Knowledge Extraction | | 0
Methods Combination and ML-based Re-ranking of Multiple Hypothesis for Question-Answering Systems | | 0
Modeling Context in Answer Sentence Selection Systems on a Latency Budget | | 0
How to Build an AI Tutor That Can Adapt to Any Course Using Knowledge Graph-Enhanced Retrieval-Augmented Generation (KG-RAG) | | 0
Continual Learning for Temporal-Sensitive Question Answering | | 0
Modeling Coreference Relations in Visual Dialog | | 0
How Susceptible are LLMs to Influence in Prompts? | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified