SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
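As a rough illustration of the EM and F1 metrics mentioned above, the sketch below scores one predicted answer string against one reference answer. It assumes SQuAD-style normalization (lowercasing, stripping punctuation and the English articles "a", "an", "the") and whitespace tokenization; other benchmarks may normalize differently. The function names `normalize`, `exact_match`, and `f1_score` are hypothetical and chosen only for this example.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))  # 1.0
    print(f1_score("in Paris, France", "Paris"))             # 0.5
```

In the second call the prediction has three tokens and one of them matches the single reference token, so precision is 1/3, recall is 1, and F1 is 0.5, while EM for the same pair would be 0. Leaderboards usually report these scores averaged over all questions and expressed as percentages.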


Papers

Showing 9526-9550 of 10817 papers

| Title | Status | Hype |
| --- | --- | --- |
| Context-aware Frame-Semantic Role Labeling | Code | 0 |
| Language Fusion for Parameter-Efficient Cross-lingual Transfer | Code | 0 |
| Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms | Code | 0 |
| Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond | Code | 0 |
| Recursive Visual Attention in Visual Dialog | Code | 0 |
| From Roots to Rewards: Dynamic Tree Reasoning with RL | Code | 0 |
| Analyzing Vietnamese Legal Questions Using Deep Neural Networks with Biaffine Classifiers | Code | 0 |
| PEYMA: A Tagged Corpus for Persian Named Entities | Code | 0 |
| E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT | Code | 0 |
| Robust and Scalable Differentiable Neural Computer for Question Answering | Code | 0 |
| Language Model Knowledge Distillation for Efficient Question Answering in Spanish | Code | 0 |
| Abductive Commonsense Reasoning | Code | 0 |
| Language models are better than humans at next-token prediction | Code | 0 |
| SemEval-2019 Task 8: Fact Checking in Community Question Answering Forums | Code | 0 |
| From Philosophy to Interfaces: an Explanatory Method and a Tool Inspired by Achinstein's Theory of Explanation | Code | 0 |
| From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering | Code | 0 |
| From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models | Code | 0 |
| Constraint-Based Question Answering with Knowledge Graph | Code | 0 |
| Multimodal Preference Data Synthetic Alignment with Reward Model | Code | 0 |
| Language Models as Knowledge Bases? | Code | 0 |
| Language Models as Knowledge Bases for Visual Word Sense Disambiguation | Code | 0 |
| From Feature Importance to Natural Language Explanations Using LLMs with RAG | Code | 0 |
| Consistency Training by Synthetic Question Generation for Conversational Question Answering | Code | 0 |
| Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning | Code | 0 |
| From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader | Code | 0 |

Benchmark Results

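| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IE-Net (ensemble) | EM | 90.94 | | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 | | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 | | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified |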