SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
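As a rough illustration of the EM and F1 metrics mentioned above, here is a minimal sketch. It assumes the normalization convention commonly used in SQuAD-style evaluation (lowercasing, then stripping punctuation and English articles). The function names `normalize_answer`, `exact_match`, and `f1_score` are illustrative and are not taken from this page.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Assumed convention: if either side is empty, score 1.0 only when both are.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

With multiple reference answers per question, a typical approach is to take the maximum score over the references and then average over all questions. That aggregation is left out here to keep the sketch short.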


Papers

Showing 6451-6475 of 10817 papers

Title | Status | Hype
How do QA models combine knowledge from LM and 100 passages? | | 0
How do Negation and Modality Impact on Opinions? | | 0
Contextualized Embeddings based Convolutional Neural Networks for Duplicate Question Identification | | 0
Modelling Long-distance Node Relations for KBQA with Global Dynamic Graph | | 0
A Survey on Large Language Models with some Insights on their Capabilities and Limitations | | 0
Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants | | 0
Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks | | 0
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models | | 0
Multimodal Commonsense Knowledge Distillation for Visual Question Answering | | 0
Modular Blended Attention Network for Video Question Answering | | 0
How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations | | 0
Modular Graph Attention Network for Complex Visual Relational Reasoning | | 0
Contextualized Attention-based Knowledge Transfer for Spoken Conversational Question Answering | | 0
Contextual Evaluation of Large Language Models for Classifying Tropical and Infectious Diseases | | 0
A Survey on Knowledge-Oriented Retrieval-Augmented Generation | | 0
Modulating Language Model Experiences through Frictions | | 0
How Context Affects Language Models' Factual Predictions | | 0
A Multi-Source Retrieval Question Answering Framework Based on RAG | | 0
How Can Objects Help Video-Language Understanding? | | 0
A Survey on Knowledge Graph Embeddings with Literals: Which model links better Literal-ly? | | 0
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting | | 0
Bridging the Gap Between Information Seeking and Product Search Systems: Q&A Recommendation for E-commerce | | 0
Accounting for Focus Ambiguity in Visual Questions | | 0
Mondrian: Prompt Abstraction Attack Against Large Language Models for Cheaper API Pricing | | 0
Multimedia Summary Generation from Online Conversations: Current Approaches and Future Directions | | 0
Page 259 of 433

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified