SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
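As a rough illustration of the two metrics named above, the following Python sketch computes exact match and token-level F1 for a single prediction against a single reference answer. It assumes the normalization commonly used in SQuAD-style evaluation (lowercasing, removing punctuation and English articles, collapsing whitespace). The function names are illustrative and this is not the official scoring script of any benchmark listed here.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly used for
    SQuAD-style scoring; other benchmarks may normalize differently.
    """
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty after normalization, score 1.0 only when both are.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("in Paris France", "Paris"))
```

When a question has several acceptable reference answers, a common convention is to take the maximum score over the references and then average over all questions. Leaderboard values such as the EM column are usually reported as percentages of that average.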


Papers

Showing 8826-8850 of 10817 papers

Title | Status | Hype
FVQA: Fact-based Visual Question Answering | | 0
Game-theoretic Vocabulary Selection via the Shapley Value and Banzhaf Index | | 0
GANDALF: a General Character Name Description Dataset for Long Fiction | | 0
Gated Group Self-Attention for Answer Selection | | 0
Gated Self-Matching Networks for Reading Comprehension and Question Answering | | 0
GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records | | 0
Gaussian Attention Model and Its Application to Knowledge Base Embedding and Question Answering | | 0
GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance | | 0
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation | | 0
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis | | 0
GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning | | 0
Gemini Pro Defeated by GPT-4V: Evidence from Education | | 0
GeMQuAD : Generating Multilingual Question Answering Datasets from Large Language Models using Few Shot Learning | | 0
GEM-RAG: Graphical Eigen Memories For Retrieval Augmented Generation | | 0
GenDec: A robust generative Question-decomposition method for Multi-hop reasoning | | 0
Gender and Racial Bias in Visual Question Answering Datasets | | 0
Gender and Racial Stereotype Detection in Legal Opinion Word Embeddings | | 0
General Embedding vs. Task-Specific Embedding: A Comparative Approach to Enhancing NLP Performance | | 0
Generalizable Neuro-symbolic Systems for Commonsense Question Answering | | 0
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems | | 0
Generalization Methods for In-Domain and Cross-Domain Opinion Holder Extraction | | 0
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data | | 0
Generalized but not Robust? Comparing the Effects of Data Modification Methods on Out-of-Domain Generalization and Adversarial Robustness | | 0
Generalized Hadamard-Product Fusion Operators for Visual Question Answering | | 0
Generalizing Question Answering System with Pre-trained Language Model Fine-tuning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified