SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
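As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single predicted answer against a single reference. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); other benchmarks may normalize differently, and the function names here are illustrative rather than taken from any official evaluation script.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly used for SQuAD-style
    evaluation; it is not tied to any specific benchmark's official script.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

A dataset-level score is usually the mean of these per-question values, taking the maximum over references when a question has several acceptable answers.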


Papers

Showing 10151-10200 of 10817 papers

Title | Status | Hype
ChartCards: A Chart-Metadata Generation Framework for Multi-Task Chart Understanding | Code | 0
Progressive Multi-granular Alignments for Grounded Reasoning in Large Vision-Language Models | Code | 0
Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models | Code | 0
Attention Instruction: Amplifying Attention in the Middle via Prompting | Code | 0
Character-Level Question Answering with Attention | Code | 0
Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models | Code | 0
ActBERT: Learning Global-Local Video-Text Representations | Code | 0
OmniDRCA: Parallel Speech-Text Foundation Model via Dual-Resolution Speech Representations and Contrastive Alignment | Code | 0
Efficient and Robust Question Answering from Minimal Context over Documents | Code | 0
MAFiD: Moving Average Equipped Fusion-in-Decoder for Question Answering over Tabular and Textual Data | Code | 0
Characterising Topic Familiarity and Query Specificity Using Eye-Tracking Data | Code | 0
Character Identification on Multiparty Conversation: Identifying Mentions of Characters in TV Shows | Code | 0
OmniNet: A unified architecture for multi-modal multi-task learning | Code | 0
Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification | Code | 0
Efficient and Interpretable Information Retrieval for Product Question Answering with Heterogeneous Data | Code | 0
RankAlign: A Ranking View of the Generator-Validator Gap in Large Language Models | Code | 0
Challenges in Generalization in Open Domain Question Answering | Code | 0
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | Code | 0
Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning | Code | 0
Dual-Encoders for Extreme Multi-Label Classification | Code | 0
Make Text Unlearnable: Exploiting Effective Patterns to Protect Personal Data | Code | 0
Prometheus Chatbot: Knowledge Graph Collaborative Large Language Model for Computer Components Recommendation | Code | 0
Effective Few-Shot Named Entity Linking by Meta-Learning | Code | 0
Promoting Generalized Cross-lingual Question Answering in Few-resource Scenarios via Self-knowledge Distillation | Code | 0
Effective Approaches to Batch Parallelization for Dynamic Neural Network Architectures | Code | 0
A mathematical model for universal semantics | Code | 0
On Bits and Bandits: Quantifying the Regret-Information Trade-off | Code | 0
CERET: Cost-Effective Extrinsic Refinement for Text Generation | Code | 0
EEE-QA: Exploring Effective and Efficient Question-Answer Representations | Code | 0
Mamba Fusion: Learning Actions Through Questioning | Code | 0
On Curriculum Learning for Commonsense Reasoning | Code | 0
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks | Code | 0
Answer Retrieval in Legal Community Question Answering | Code | 0
Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models | Code | 0
MANGO: A Benchmark for Evaluating Mapping and Navigation Abilities of Large Language Models | Code | 0
Decomposed Prompting to Answer Questions on a Course Discussion Board | Code | 0
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | Code | 0
EaSe: A Diagnostic Tool for VQA based on Answer Diversity | Code | 0
CAVE: Correcting Attribute Values in E-commerce Profiles | Code | 0
EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs | Code | 0
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models | Code | 0
A Mathematical Framework, a Taxonomy of Modeling Paradigms, and a Suite of Learning Techniques for Neural-Symbolic Systems | Code | 0
Causal Question Answering with Reinforcement Learning | Code | 0
DyREx: Dynamic Query Representation for Extractive Question Answering | Code | 0
Dynamic Task and Weight Prioritization Curriculum Learning for Multimodal Imagery | Code | 0
Dynamic Memory Networks for Visual and Textual Question Answering | Code | 0
Mapping distributional to model-theoretic semantic spaces: a baseline | Code | 0
CausalQA: A Benchmark for Causal Question Answering | Code | 0
Prompt-based Zero-shot Relation Extraction with Semantic Knowledge Augmentation | Code | 0
Causal Graphs Meet Thoughts: Enhancing Complex Reasoning in Graph-Augmented LLMs | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | - | Unverified
2 | FPNet (ensemble) | EM | 90.87 | - | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | - | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | - | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | - | Unverified
6 | FPNet (ensemble) | EM | 90.6 | - | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | - | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | - | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | - | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | - | Unverified