SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
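As a rough illustration of the two metrics named above, the sketch below computes EM and token-level F1 for a single prediction against a single reference answer. It assumes the answer normalization commonly used in SQuAD-style evaluation (lowercasing, stripping punctuation and the articles a/an/the, collapsing whitespace); other benchmarks may normalize differently, and the function names `normalize`, `exact_match`, and `f1_score` are illustrative rather than taken from any particular library.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors SQuAD-style answer normalization.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(f1_score("in Paris, France", "Paris"))
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions; that aggregation is left out here to keep the sketch short.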


Papers

Showing 8801-8850 of 10817 papers

Title | Status | Hype
From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering |  | 0
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason |  | 0
From Textbooks to Knowledge: A Case Study in Harvesting Axiomatic Knowledge from Textbooks to Solve Geometry Problems |  | 0
From text to multimodal: a survey of adversarial example generation in question answering systems |  | 0
From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics |  | 0
From Textual Entailment to Knowledgeable Machines |  | 0
CRIC: A VQA Dataset for Compositional Reasoning on Vision and Commonsense |  | 0
From Visual to Acoustic Question Answering |  | 0
From Words and Exercises to Wellness: Farsi Chatbot for Self-Attachment Technique |  | 0
Frustratingly Easy Model Ensemble for Abstractive Summarization |  | 0
Frustratingly Easy Natural Question Answering |  | 0
Frustratingly Hard Evidence Retrieval for QA Over Books |  | 0
FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering |  | 0
Full Machine Translation for Factoid Question Answering |  | 0
Full-Time Supervision based Bidirectional RNN for Factoid Question Answering |  | 0
FunBench: Benchmarking Fundus Reading Skills of MLLMs |  | 0
Functorial Language Games for Question Answering |  | 0
FuRongWang at SemEval-2017 Task 3: Deep Neural Networks for Selecting Relevant Answers in Community Question Answering |  | 0
Furthest Reasoning with Plan Assessment: Stable Reasoning Path with Retrieval-Augmented Large Language Models |  | 0
Fuse and Adapt: Investigating the Use of Pre-Trained Self-Supervising Learning Models in Limited Data NLU problems |  | 0
Fusing Bidirectional Chains of Thought and Reward Mechanisms A Method for Enhancing Question-Answering Capabilities of Large Language Models for Chinese Intangible Cultural Heritage |  | 0
Fusing Temporal Graphs into Transformers for Time-Sensitive Question Answering |  | 0
FusionMind -- Improving question and answering with external context fusion |  | 0
Fusion of Detected Objects in Text for Visual Question Answering |  | 0
Fusion of Domain-Adapted Vision and Language Models for Medical Visual Question Answering |  | 0
FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering |  | 0
FVQA: Fact-based Visual Question Answering |  | 0
Game-theoretic Vocabulary Selection via the Shapley Value and Banzhaf Index |  | 0
GANDALF: a General Character Name Description Dataset for Long Fiction |  | 0
Gated Group Self-Attention for Answer Selection |  | 0
Gated Self-Matching Networks for Reading Comprehension and Question Answering |  | 0
GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records |  | 0
Gaussian Attention Model and Its Application to Knowledge Base Embedding and Question Answering |  | 0
GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance |  | 0
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation |  | 0
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis |  | 0
GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning |  | 0
Gemini Pro Defeated by GPT-4V: Evidence from Education |  | 0
GeMQuAD : Generating Multilingual Question Answering Datasets from Large Language Models using Few Shot Learning |  | 0
GEM-RAG: Graphical Eigen Memories For Retrieval Augmented Generation |  | 0
GenDec: A robust generative Question-decomposition method for Multi-hop reasoning |  | 0
Gender and Racial Bias in Visual Question Answering Datasets |  | 0
Gender and Racial Stereotype Detection in Legal Opinion Word Embeddings |  | 0
General Embedding vs. Task-Specific Embedding: A Comparative Approach to Enhancing NLP Performance |  | 0
Generalizable Neuro-symbolic Systems for Commonsense Question Answering |  | 0
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems |  | 0
Generalization Methods for In-Domain and Cross-Domain Opinion Holder Extraction |  | 0
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data |  | 0
Generalized but not Robust? Comparing the Effects of Data Modification Methods on Out-of-Domain Generalization and Adversarial Robustness |  | 0
Generalized Hadamard-Product Fusion Operators for Visual Question Answering |  | 0
Page 177 of 217

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 |  | Unverified
2 | FPNet (ensemble) | EM | 90.87 |  | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 |  | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 |  | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 |  | Unverified
6 | FPNet (ensemble) | EM | 90.6 |  | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 |  | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 |  | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 |  | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 |  | Unverified