SOTAVerified

Question Answering

Question answering can be divided into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
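As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single predicted answer against a single reference. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names are illustrative and not taken from any particular benchmark's official scorer.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Assumption: two empty answers count as a match, one empty as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(token_f1("Eiffel Tower in Paris", "the Eiffel Tower"))
```

Benchmarks with several reference answers per question usually take the maximum score over the references and then average over all questions; that aggregation step is left out here.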


Papers

Showing 7301-7350 of 10817 papers

| Title | Status | Hype |
|---|---|---|
| ParaQuery: Making Sense of Paraphrase Collections | | 0 |
| Exploring Advanced Techniques for Visual Question Answering: A Comprehensive Comparison | | 0 |
| Exploring and Evaluating Personalized Models for Code Generation | | 0 |
| Chain Based RNN for Relation Classification | | 0 |
| A Study on Efficiency, Accuracy and Document Structure for Answer Sentence Selection | | 0 |
| Parsing with Traces: An O(n^4) Algorithm and a Structural Representation | | 0 |
| Point-of-Interest Oriented Question Answering with Joint Inference of Semantic Matching and Distance Correlation | | 0 |
| Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting | | 0 |
| Diff-Explainer: Differentiable Convex Optimization for Explainable Multi-hop Inference | | 0 |
| Partially Fake Audio Detection by Self-attention-based Fake Span Discovery | | 0 |
| A Study of the Importance of External Knowledge in the Named Entity Recognition Task | | 0 |
| Guess What: A Question Answering Game via On-demand Knowledge Validation | | 0 |
| Passage Segmentation of Documents for Extractive Question Answering | | 0 |
| Exploring Diverse Expressions for Paraphrase Generation | | 0 |
| Patch-level Sounding Object Tracking for Audio-Visual Question Answering | | 0 |
| Exploring Diverse Methods in Visual Question Answering | | 0 |
| Amrita_CEN at SemEval-2016 Task 1: Semantic Relation from Word Embeddings in Higher Dimension | | 0 |
| Pathological Visual Question Answering | | 0 |
| PMI-cool at SemEval-2016 Task 3: Experiments with PMI and Goodness Polarity Lexicons for Community Question Answering | | 0 |
| PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering | | 0 |
| POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset | | 0 |
| PathVLM-R1: A Reinforcement Learning-Driven Reasoning Model for Pathology Visual-Language Tasks | | 0 |
| Guess Me if You Can: Acronym Disambiguation for Enterprises | | 0 |
| Conditional Generation with a Question-Answering Blueprint | | 0 |
| PAT: Parallel Attention Transformer for Visual Question Answering in Vietnamese | | 0 |
| PAT-Questions: A Self-Updating Benchmark for Present-Anchored Temporal Question-Answering | | 0 |
| Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost | | 0 |
| GTR-LSTM: A Triple Encoder for Sentence Generation from RDF Data | | 0 |
| A Study of the Effect of Resolving Negation and Sentiment Analysis in Recognizing Text Entailment for Arabic | | 0 |
| PATTY: A Taxonomy of Relational Patterns with Semantic Types | | 0 |
| Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance | | 0 |
| GTR: Graph-Table-RAG for Cross-Table Question Answering | | 0 |
| gTBLS: Generating Tables from Text by Conditional Question Answering | | 0 |
| GSQA: An End-to-End Model for Generative Spoken Question Answering | | 0 |
| G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning | | 0 |
| Pay More Attention to Relation Exploration for Knowledge Base Question Answering | | 0 |
| Pay-Per-Request Deployment of Neural Network Models Using Serverless Architectures | | 0 |
| AMR Beyond the Sentence: the Multi-sentence AMR corpus | | 0 |
| PCQPR: Proactive Conversational Question Planning with Reflection | | 0 |
| PokemonChat: Auditing ChatGPT for Pokémon Universe Knowledge | | 0 |
| PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering | | 0 |
| PDFTriage: Question Answering over Long, Structured Documents | | 0 |
| GRS-QA -- Graph Reasoning-Structured Question Answering Dataset | | 0 |
| Growing Multi-Domain Glossaries from a Few Seeds using Probabilistic Topic Models | | 0 |
| PEACE: Empowering Geologic Map Holistic Understanding with MLLMs | | 0 |
| PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models | | 0 |
| Grow-and-Clip: Informative-yet-Concise Evidence Distillation for Answer Explanation | | 0 |
| PeCoQ: A Dataset for Persian Complex Question Answering over Knowledge Graph | | 0 |
| Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation? | | 0 |
| Grover's Algorithm for Question Answering | | 0 |
Page 147 of 217

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |