SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
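As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for a single prediction against a single reference answer. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and the English articles "a", "an", "the", collapsing whitespace); other benchmarks may normalize differently. The function names `normalize_answer`, `exact_match`, and `f1_score` are illustrative choices here, not part of any particular library.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Assumed SQuAD-style normalization: lowercase, drop punctuation and articles."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower.", "eiffel tower"))
    print(f1_score("in Paris, France", "Paris"))
```

Leaderboard numbers are usually these per-question scores averaged over the whole evaluation set and reported as percentages; when a question has several reference answers, a common convention is to take the maximum score over the references.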


Papers

Showing 5801-5850 of 10817 papers

Title | Status | Hype
XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models | Code | 0
Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering | - | 0
Interactive-Chain-Prompting: Ambiguity Resolution for Crosslingual Conditional Generation with Interaction | - | 0
HRVQA: A Visual Question Answering Benchmark for High-Resolution Aerial Images | - | 0
Ensemble Transfer Learning for Multilingual Coreference Resolution | - | 0
Rationalization for Explainable NLP: A Survey | - | 0
Weakly-Supervised Questions for Zero-Shot Relation Extraction | Code | 0
Reversing The Twenty Questions Game | - | 0
Towards Models that Can See and Read | - | 0
Temporal Perceiving Video-Language Pre-training | - | 0
Curriculum Script Distillation for Multilingual Visual Question Answering | - | 0
Explaining ELH Concept Descriptions through Counterfactual Reasoning | - | 0
Towards Answering Climate Questionnaires from Unstructured Climate Reports | Code | 0
Semantic Web Enabled Geographic Question Answering Framework: GeoTR | - | 0
There is No Big Brother or Small Brother: Knowledge Infusion in Language Models for Link Prediction and Question Answering | Code | 0
Language Models sounds the Death Knell of Knowledge Graphs | - | 0
Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models | - | 0
MAQA: A Multimodal QA Benchmark for Negation | - | 0
A Brain-inspired Memory Transformation based Differentiable Neural Computer for Reasoning-based Question Answering | - | 0
Knowledge Reasoning via Jointly Modeling Knowledge Graphs and Soft Rules | - | 0
RLAS-BIABC: A Reinforcement Learning-Based Answer Selection Using the BERT Model Boosted by an Improved ABC Algorithm | - | 0
Adaptively Clustering Neighbor Elements for Image-Text Generation | Code | 0
Emotion-Cause Pair Extraction as Question Answering | - | 0
Learning Trajectory-Word Alignments for Video-Language Tasks | - | 0
Topic Segmentation Model Focusing on Local Context | - | 0
PIE-QG: Paraphrased Information Extraction for Unsupervised Question Generation from Small Corpora | - | 0
From Images to Textual Prompts: Zero-Shot Visual Question Answering With Frozen Large Language Models | - | 0
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language | Code | 0
Exploring Temporal Concurrency for Video-Language Representation Learning | Code | 0
RMLVQA: A Margin Loss Approach for Visual Question Answering With Language Biases | - | 0
Discovering the Real Association: Multimodal Causal Reasoning in Video Question Answering | - | 0
Knowledge Proxy Intervention for Deconfounded Video Question Answering | - | 0
Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering | - | 0
IS-GGT: Iterative Scene Graph Generation With Generative Transformers | - | 0
PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3 | - | 0
Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks | - | 0
Toward Multi-Granularity Decision-Making: Explicit Visual Reasoning with Hierarchical Knowledge | Code | 0
GPTR: Gestalt-Perception Transformer for Diagram Object Detection | - | 0
A Survey on Table-and-Text HybridQA: Concepts, Methods, Challenges and Future Directions | - | 0
Improving Complex Knowledge Base Question Answering via Question-to-Action and Question-to-Question Alignment | Code | 0
STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension | - | 0
When are Lemons Purple? The Concept Association Bias of Vision-Language Models | - | 0
From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models | Code | 0
UnICLAM: Contrastive Representation Learning with Adversarial Masking for Unified and Interpretable Medical Vision Question Answering | - | 0
Language models are better than humans at next-token prediction | Code | 0
ZEROTOP: Zero-Shot Task-Oriented Semantic Parsing using Large Language Models | - | 0
Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering | Code | 0
Extrinsic Evaluation of Machine Translation Metrics | - | 0
On-the-fly Denoising for Data Augmentation in Natural Language Understanding | Code | 0
MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue | - | 0
Page 117 of 217

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | - | Unverified
2 | FPNet (ensemble) | EM | 90.87 | - | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | - | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | - | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | - | Unverified
6 | FPNet (ensemble) | EM | 90.6 | - | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | - | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | - | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | - | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | - | Unverified