SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
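As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single prediction against a single reference answer. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names `normalize`, `exact_match`, and `token_f1` are hypothetical and do not come from any particular benchmark's official scorer.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly used for SQuAD-style
    scoring; other benchmarks may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions to get the dataset-level figure.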


Papers

Showing 10051-10100 of 10817 papers

| Title | Status | Hype |
|---|---|---|
| Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures | | 0 |
| SetKE: Knowledge Editing for Knowledge Elements Overlap | | 0 |
| Set-LLM: A Permutation-Invariant LLM | | 0 |
| SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services | | 0 |
| SG-FSM: A Self-Guiding Zero-Shot Prompting Paradigm for Multi-Hop Question Answering Based on Finite State Machine | | 0 |
| Shai: A large language model for asset management | | 0 |
| SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments | | 0 |
| Shallow Convolutional Neural Network for Implicit Discourse Relation Recognition | | 0 |
| Shared Imagination: LLMs Hallucinate Alike | | 0 |
| Sheffield MultiMT: Using Object Posterior Predictions for Multimodal Machine Translation | | 0 |
| SHEF-Multimodal: Grounding Machine Translation on Images | | 0 |
| Shifting the Baseline: Single Modality Performance on Visual Navigation & QA | | 0 |
| SHIKEBLCU at SemEval-2020 Task 2: An External Knowledge-enhanced Matrix for Multilingual and Cross-Lingual Lexical Entailment | | 0 |
| Shiraz: A Proposed List Wise Approach to Answer Validation | | 0 |
| SHMamba: Structured Hyperbolic State Space Model for Audio-Visual Question Answering | | 0 |
| Show Why the Answer is Correct! Towards Explainable AI using Compositional Temporal Attention | | 0 |
| Siamese Networks for Semantic Pattern Similarity | | 0 |
| GPT-4o as the Gold Standard: A Scalable and General Purpose Approach to Filter Language Model Pretraining Data | | 0 |
| Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions | | 0 |
| SILC: Improving Vision Language Pretraining with Self-Distillation | | 0 |
| Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making | | 0 |
| Silver Retriever: Advancing Neural Passage Retrieval for Polish Question Answering | | 0 |
| SimBow at SemEval-2017 Task 3: Soft-Cosine Semantic Similarity between Questions for Community Question Answering | | 0 |
| SimDoc: Topic Sequence Alignment based Document Similarity Framework | | 0 |
| Similarity-Based Reconstruction Loss for Meaning Representation | | 0 |
| Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product Attribute Extraction | | 0 |
| Simple and Effective Semi-Supervised Question Answering | | 0 |
| Simple and Effective Unsupervised Redundancy Elimination to Compress Dense Vectors for Passage Retrieval | | 0 |
| Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps | | 0 |
| Simple Large-scale Relation Extraction from Unstructured Text | | 0 |
| SimpleLLM4AD: An End-to-End Vision-Language Model with Graph Visual Question Answering for Autonomous Driving | | 0 |
| Simple or Complex? Classifying Questions by Answering Complexity | | 0 |
| Simple Question Answering by Attentive Convolutional Neural Network | | 0 |
| Simple Question Answering with Subgraph Ranking and Joint-Scoring | | 0 |
| SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models | | 0 |
| Simple yet Effective Bridge Reasoning for Open-Domain Multi-Hop Question Answering | | 0 |
| Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases | | 0 |
| Simplifying Sparse Expert Recommendation by Revisiting Graph Diffusion | | 0 |
| SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset | | 0 |
| SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains | | 0 |
| SimsterQ: A Similarity based Clustering Approach to Opinion Question Answering | | 0 |
| Simulating Bandit Learning from User Feedback for Extractive Question Answering | | 0 |
| SimVQA: Exploring Simulated Environments for Visual Question Answering | | 0 |
| Single LLM, Multiple Roles: A Unified Retrieval-Augmented Generation Framework Using Role-Specific Token Optimization | | 0 |
| Single-Modal Entropy based Active Learning for Visual Question Answering | | 0 |
| Single Training Dimension Selection for Word Embedding with PCA | | 0 |
| SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback | | 0 |
| SIRIUS-LTG: An Entity Linking Approach to Fact Extraction and Verification | | 0 |
| SITE: towards Spatial Intelligence Thorough Evaluation | | 0 |
| SkillQG: Learning to Generate Question for Reading Comprehension Assessment | | 0 |
Page 202 of 217

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |