SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
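As a rough illustration of the two metrics named above, the following is a minimal sketch of SQuAD-style exact match and token-level F1 for a single prediction and a single reference answer. The function names (`normalize_answer`, `exact_match`, `f1_score`) and the normalization steps (lowercasing, stripping punctuation and English articles) are assumptions chosen for illustration; individual benchmarks may normalize differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower.", "eiffel tower"))
    print(f1_score("in Paris, France", "Paris"))
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions; that aggregation step is left out of this sketch.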


Papers

Showing 10051-10075 of 10817 papers

| Title | Status | Hype |
| --- | --- | --- |
| SERC: Syntactic and Semantic Sequence based Event Relation Classification | | 0 |
| Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures | | 0 |
| SetKE: Knowledge Editing for Knowledge Elements Overlap | | 0 |
| Set-LLM: A Permutation-Invariant LLM | | 0 |
| SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services | | 0 |
| SG-FSM: A Self-Guiding Zero-Shot Prompting Paradigm for Multi-Hop Question Answering Based on Finite State Machine | | 0 |
| Shai: A large language model for asset management | | 0 |
| SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments | | 0 |
| Shallow Convolutional Neural Network for Implicit Discourse Relation Recognition | | 0 |
| Shared Imagination: LLMs Hallucinate Alike | | 0 |
| Sheffield MultiMT: Using Object Posterior Predictions for Multimodal Machine Translation | | 0 |
| SHEF-Multimodal: Grounding Machine Translation on Images | | 0 |
| Shifting the Baseline: Single Modality Performance on Visual Navigation & QA | | 0 |
| SHIKEBLCU at SemEval-2020 Task 2: An External Knowledge-enhanced Matrix for Multilingual and Cross-Lingual Lexical Entailment | | 0 |
| Shiraz: A Proposed List Wise Approach to Answer Validation | | 0 |
| SHMamba: Structured Hyperbolic State Space Model for Audio-Visual Question Answering | | 0 |
| Show Why the Answer is Correct! Towards Explainable AI using Compositional Temporal Attention | | 0 |
| Siamese Networks for Semantic Pattern Similarity | | 0 |
| GPT-4o as the Gold Standard: A Scalable and General Purpose Approach to Filter Language Model Pretraining Data | | 0 |
| Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions | | 0 |
| SILC: Improving Vision Language Pretraining with Self-Distillation | | 0 |
| Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making | | 0 |
| Silver Retriever: Advancing Neural Passage Retrieval for Polish Question Answering | | 0 |
| SimBow at SemEval-2017 Task 3: Soft-Cosine Semantic Similarity between Questions for Community Question Answering | | 0 |
| SimDoc: Topic Sequence Alignment based Document Similarity Framework | | 0 |
Page 403 of 433

Benchmark Results

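| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |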