SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
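As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single prediction against a single reference answer. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names `normalize`, `exact_match`, and `f1_score` are hypothetical and are not taken from any particular benchmark's official scorer, which may differ in detail.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Assumed SQuAD-style normalization: lowercase, drop punctuation and articles."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("Eiffel Tower in Paris", "the Eiffel Tower"))
```

When a question has several acceptable reference answers, a common convention is to take the maximum score over the references and then average over all questions; that aggregation step is omitted here.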


Papers

Showing 10501-10550 of 10817 papers

| Title | Status | Hype |
| --- | --- | --- |
| The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval | | 0 |
| The Silent Saboteur: Imperceptible Adversarial Attacks against Black-Box Retrieval-Augmented Generation Systems | | 0 |
| The Solution for the ICCV 2023 Perception Test Challenge 2023 -- Task 6 -- Grounded videoQA | | 0 |
| The State-of-the-Art in Lifelog Retrieval: A Review of Progress at the ACM Lifelog Search Challenge Workshop 2022-24 | | 0 |
| The TARSQI Toolkit | | 0 |
| The Turing Deception | | 0 |
| The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media | | 0 |
| The University of Texas at Dallas HLTRI's Participation in EPIC-QA: Searching for Entailed Questions Revealing Novel Answer Nuggets | | 0 |
| The Use of Dependency Relation Graph to Enhance the Term Weighting in Question Retrieval | | 0 |
| The Value of Semantic Parse Labeling for Knowledge Base Question Answering | | 0 |
| The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR | | 0 |
| The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions | | 0 |
| The WDAqua ITN: Answering Questions using Web Data | | 0 |
| The Web as an Implicit Training Set: Application to Noun Compounds Syntax and Semantics | | 0 |
| The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA | | 0 |
| Think before you speak: Training Language Models With Pause Tokens | | 0 |
| Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge | | 0 |
| This is how we do it: Answer Reranking for Open-domain How Questions with Paragraph Vectors and Minimal Feature Engineering | | 0 |
| Thought Flow Nets: From Single Predictions to Trains of Model Thought | | 0 |
| Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation | | 0 |
| Thread-Level Information for Comment Classification in Community Question Answering | | 0 |
| Thread Specific Features are Helpful for Identifying Subjectivity Orientation of Online Forum Threads | | 0 |
| Threshold-Based Retrieval and Textual Entailment Detection on Legal Bar Exam Questions | | 0 |
| Tiantianzhu7:System Description of Semantic Textual Similarity (STS) in the SemEval-2012 (Task 6) | | 0 |
| Tianyi: A Traditional Chinese Medicine all-rounder language model and its Real-World Clinical Practice | | 0 |
| TIGQA:An Expert Annotated Question Answering Dataset in Tigrinya | | 0 |
| TI-JEPA: An Innovative Energy-based Joint Embedding Strategy for Text-Image Multimodal Systems | | 0 |
| Time-aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering | | 0 |
| TimeLogic: A Temporal Logic Benchmark for Video QA | | 0 |
| Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement | | 0 |
| TIME: Temporal-sensitive Multi-dimensional Instruction Tuning and Benchmarking for Video-LLMs | | 0 |
| TINA: Think, Interaction, and Action Framework for Zero-Shot Vision Language Navigation | | 0 |
| TinyDrive: Multiscale Visual Question Answering with Selective Token Routing for Autonomous Driving | | 0 |
| TinyRS-R1: Compact Multimodal Language Model for Remote Sensing | | 0 |
| TinyVQA: Compact Multimodal Deep Neural Network for Visual Question Answering on Resource-Constrained Devices | | 0 |
| Tip of the Tongue Known-Item Retrieval: A Case Study in Movie Identification | | 0 |
| 'Tis but Thy Name: Semantic Question Answering Evaluation with 11M Names for 1M Entities | | 0 |
| T-Know: a Knowledge Graph-based Question Answering and Information Retrieval System for Traditional Chinese Medicine | | 0 |
| TM-PATHVQA:90000+ Textless Multilingual Questions for Medical Visual Question Answering | | 0 |
| To Adapt or to Annotate: Challenges and Interventions for Domain Adaptation in Open-Domain Question Answering | | 0 |
| Together we stand: Siamese Networks for Similar Question Retrieval | | 0 |
| TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs | | 0 |
| Tokenization Preference for Human and Machine Learning Model: An Annotation Study | | 0 |
| Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs | | 0 |
| Tool Calling: Enhancing Medication Consultation via Retrieval-Augmented Large Language Models | | 0 |
| Tools for plWordNet Development. Presentation and Perspectives | | 0 |
| Tools in the Loop: Quantifying Uncertainty of LLM Question Answering Systems That Use Tools | | 0 |
| Top-down Activity Representation Learning for Video Question Answering | | 0 |
| Topical Segmentation: a Study of Human Performance and a New Measure of Quality. | | 0 |
| Topic-Based Question Generation | | 0 |

Benchmark Results

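| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |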