SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
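To make the EM and F1 metrics concrete, here is a minimal sketch of how they are commonly computed for extractive question answering. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) and token-level overlap for F1. The function names are illustrative, not taken from any particular library, and individual benchmarks may normalize differently.

```python
import re
import string
from collections import Counter

# Illustrative helpers (names are assumptions, not a specific library's API).


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower."))
    print(f1_score("in Paris France", "Paris"))
```

When a question has several acceptable reference answers, a common convention is to take the maximum score over the references and then average over all questions. Leaderboard values such as EM are usually reported as that average multiplied by 100.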


Papers

Showing 801-850 of 10817 papers

Title | Status | Hype
CC-Riddle: A Question Answering Dataset of Chinese Character Riddles | Code | 1
Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation | Code | 1
CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training | Code | 1
FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding | Code | 1
Cerbero-7B: A Leap Forward in Language-Specific LLMs Through Enhanced Chat Corpus Generation and Evaluation | Code | 1
CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations | Code | 1
CBench: Towards Better Evaluation of Question Answering Over Knowledge Graphs | Code | 1
Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences | Code | 1
A Gradually Soft Multi-Task and Data-Augmented Approach to Medical Question Understanding | Code | 1
Answer is All You Need: Instruction-following Text Embedding via Answering the Question | Code | 1
CBR-RAG: Case-Based Reasoning for Retrieval Augmented Generation in LLMs for Legal Question Answering | Code | 1
Explaining NLP Models via Minimal Contrastive Editing (MiCE) | Code | 1
Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering | Code | 1
CLIP-Guided Vision-Language Pre-training for Question Answering in 3D Scenes | Code | 1
Explaining Question Answering Models through Text Generation | Code | 1
Clues Before Answers: Generation-Enhanced Multiple-Choice QA | Code | 1
CL-ReLKT: Cross-lingual Language Knowledge Transfer for Multilingual Retrieval Question Answering | Code | 1
CLTR: An End-to-End, Transformer-Based System for Cell Level Table Retrieval and Table Question Answering | Code | 1
Explaining Answers with Entailment Trees | Code | 1
Causal Distillation for Language Models | Code | 1
FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering | Code | 1
3D Vision and Language Pretraining with Large-Scale Synthetic Data | Code | 1
ChainCQG: Flow-Aware Conversational Question Generation | Code | 1
Explaining Autonomous Driving Actions with Visual Question Answering | Code | 1
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone | Code | 1
Coarse-to-Fine Reasoning for Visual Question Answering | Code | 1
CAT-ViL: Co-Attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery | Code | 1
AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors | Code | 1
CodeQA: A Question Answering Dataset for Source Code Comprehension | Code | 1
Forward Learning of Graph Neural Networks | Code | 1
Explicit Planning Helps Language Models in Logical Reasoning | Code | 1
GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection | Code | 1
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration | Code | 1
ConceptBert: Concept-Aware Representation for Visual Question Answering | Code | 1
EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation | Code | 1
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages | Code | 1
Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering | Code | 1
Combo of Thinking and Observing for Outside-Knowledge VQA | Code | 1
Example-Based Named Entity Recognition | Code | 1
Fine-tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata | Code | 1
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Code | 1
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users | Code | 1
EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering | Code | 1
CompAct: Compressing Retrieved Documents Actively for Question Answering | Code | 1
A Personalized Dense Retrieval Framework for Unified Information Access | Code | 1
Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering | Code | 1
An Optimal Algorithm for Finding Champions in Tournament Graphs | Code | 1
Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs | Code | 1
ABCD: A Graph Framework to Convert Complex Sentences to a Covering Set of Simple Sentences | Code | 1
CARE: Collaborative AI-Assisted Reading Environment | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified