SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
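As a rough illustration of the two metrics, here is a minimal Python sketch of EM and token-level F1 for a single prediction and reference answer. The normalization (lowercasing, dropping punctuation and English articles) is an assumption modeled on what is commonly done for SQuAD-style extractive QA; the function names are hypothetical and individual benchmarks may define the metrics differently.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumed normalization; a given benchmark may use different rules.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several acceptable reference answers, a common choice is to take the maximum score over the references and then average over all questions, reported as a percentage.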


Papers

Showing 9901–9925 of 10817 papers

Title | Status | Hype
Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time | Code | 0
Lexicalization Is All You Need: Examining the Impact of Lexical Knowledge in a Compositional QALD System | Code | 0
Neural Stored-program Memory | Code | 0
Scaling Reasoning can Improve Factuality in Large Language Models | Code | 0
QUITE: Quantifying Uncertainty in Natural Language Text in Bayesian Reasoning Scenarios | Code | 0
Addressing Issues of Cross-Linguality in Open-Retrieval Question Answering Systems For Emergent Domains | Code | 0
Expanding End-to-End Question Answering on Differentiable Knowledge Graphs with Intersection | Code | 0
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding | Code | 0
LGAR: Zero-Shot LLM-Guided Neural Ranking for Abstract Screening in Systematic Literature Reviews | Code | 0
Quizbowl: The Case for Incremental Question Answering | Code | 0
EXAQ: Exponent Aware Quantization For LLMs Acceleration | Code | 0
CODAH: An Adversarially-Authored Question Answering Dataset for Common Sense | Code | 0
Neural Variational Inference for Text Processing | Code | 0
ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models | Code | 0
Examining Gender and Racial Bias in Large Vision-Language Models Using a Novel Dataset of Parallel Images | Code | 0
Relation-Aware Graph Attention Network for Visual Question Answering | Code | 0
Pre-Training Multi-Modal Dense Retrievers for Outside-Knowledge Visual Question Answering | Code | 0
Neurocache: Efficient Vector Retrieval for Long-range Language Modeling | Code | 0
Augmenting Visual Question Answering with Semantic Frame Information in a Multitask Learning Approach | Code | 0
QPaug: Question and Passage Augmentation for Open-Domain Question Answering of LLMs | Code | 0
Relation-aware Hierarchical Attention Framework for Video Question Answering | Code | 0
A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning | Code | 0
Lightweight Recurrent Cross-modal Encoder for Video Question Answering | Code | 0
Likelihood as a Performance Gauge for Retrieval-Augmented Generation | Code | 0
Evidence Sentence Extraction for Machine Reading Comprehension | Code | 0
Page 397 of 433

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified