SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
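As a rough illustration of the two metrics named above, here is a minimal sketch of SQuAD-style scoring for a single prediction against a single reference answer. The function names and the normalization steps (lowercasing, stripping punctuation and English articles) are assumptions modeled on common practice for extractive QA; individual benchmarks may normalize differently or take the maximum over several reference answers.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "Eiffel tower"))
    print(f1_score("in Paris, France", "Paris"))
```

EM rewards only a full match after normalization, while F1 gives partial credit for overlapping tokens, which is why leaderboards often report both.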


Papers

Showing 10076–10100 of 10817 papers

Title | Status | Hype
End-to-End Open-Domain Question Answering with BERTserini | Code | 0
End-to-End Instance Segmentation with Recurrent Attention | Code | 0
End-to-End Goal-Driven Web Navigation | Code | 0
Re-Examining Calibration: The Case of Question Answering | Code | 0
Long Story Short: a Summarize-then-Search Method for Long Video Question Answering | Code | 0
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features | Code | 0
A Memory-Network Based Solution for Multivariate Time-Series Forecasting | Code | 0
Revisiting CroPA: A Reproducibility Study and Enhancements for Cross-Prompt Adversarial Transferability in Vision-Language Models | Code | 0
End-Task Oriented Textual Entailment via Deep Explorations of Inter-Sentence Interactions | Code | 0
Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks | Code | 0
NukeBERT: A Pre-trained language model for Low Resource Nuclear Domain | Code | 0
emrKBQA: A Clinical Knowledge-Base Question Answering Dataset | Code | 0
Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations | Code | 0
Emotion Twenty Questions Dialog System for Lexical Emotional Intelligence | Code | 0
Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion | Code | 0
Look Before You Leap: A Universal Emergent Decomposition of Retrieval Tasks in Language Models | Code | 0
Co-occurrence is not Factual Association in Language Models | Code | 0
Numerical Reasoning for Financial Reports | Code | 0
A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data | Code | 0
NumNet: Machine Reading Comprehension with Numerical Reasoning | Code | 0
Looking Beyond Visible Cues: Implicit Video Question Answering via Dual-Clue Reasoning | Code | 0
Classification of telicity using cross-linguistic annotation projection | Code | 0
C-HTS: A Concept-based Hierarchical Text Segmentation approach | Code | 0
Emerging Challenges in Personalized Medicine: Assessing Demographic Effects on Biomedical Question Answering Systems | Code | 0
CHQ-Summ: A Dataset for Consumer Healthcare Question Summarization | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | - | Unverified
2 | FPNet (ensemble) | EM | 90.87 | - | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | - | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | - | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | - | Unverified
6 | FPNet (ensemble) | EM | 90.6 | - | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | - | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | - | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | - | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | - | Unverified