SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
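As a rough illustration of the EM and F1 metrics mentioned above, the sketch below scores a predicted answer string against a single reference answer, in the style commonly used for SQuAD-like extractive benchmarks. The function names (`normalize_answer`, `exact_match`, `token_f1`) are illustrative assumptions rather than any specific library's API, and individual benchmarks may normalize answers differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def token_f1(prediction: str, reference: str) -> float:
    """F1: harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions; that aggregation step is left out here for brevity.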


Papers

Showing 6251-6300 of 10817 papers

Title | Status | Hype
Meta-Embeddings for Natural Language Inference and Semantic Similarity tasks |  | 0
Metaethical Perspectives on 'Benchmarking' AI Ethics |  | 0
Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs |  | 0
Metaheuristic Approaches to Lexical Substitution and Simplification |  | 0
How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos |  | 0
Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities |  | 0
EfficientEQA: An Efficient Approach for Open Vocabulary Embodied Question Answering |  | 0
A corpus of general and specific sentences from news |  | 0
How to find a good image-text embedding for remote sensing visual question answering? |  | 0
Metamorphic Relation Based Adversarial Attacks on Differentiable Neural Computer |  | 0
Efficient Global Learning of Entailment Graphs |  | 0
Meta-prompting Optimized Retrieval-augmented Generation |  | 0
How to Evaluate Opinionated Keyphrase Extraction? |  | 0
MetaQA: Combining Expert Agents for Multi-Skill Question Answering |  | 0
MetaReflection: Learning Instructions for Language Agents using Past Reflections |  | 0
MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification |  | 0
Continually Self-Improving Language Models for Bariatric Surgery Question--Answering |  | 0
A Few-Shot Learning Focused Survey on Recent Named Entity Recognition and Relation Classification Methods |  | 0
Method of Tibetan Person Knowledge Extraction |  | 0
Methods Combination and ML-based Re-ranking of Multiple Hypothesis for Question-Answering Systems |  | 0
Modular Graph Attention Network for Complex Visual Relational Reasoning |  | 0
Modulating Language Model Experiences through Frictions |  | 0
EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models |  | 0
A Survey on Recent Advances in Sequence Labeling from Deep Learning Models |  | 0
Efficiently Embedding Dynamic Knowledge Graphs |  | 0
MFORT-QA: Multi-hop Few-shot Open Rich Table Question Answering |  | 0
How to Design Sample and Computationally Efficient VQA Models |  | 0
How to Build an AI Tutor That Can Adapt to Any Course Using Knowledge Graph-Enhanced Retrieval-Augmented Generation (KG-RAG) |  | 0
MHQA: A Diverse, Knowledge Intensive Mental Health Question Answering Challenge for Language Models |  | 0
Book Review: Interactive Multi-Modal Question-Answering by Antal van den Bosch and Gosse Bouma |  | 0
MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages |  | 0
MIA 2022 Shared Task Submission: Leveraging Entity Representations, Dense-Sparse Hybrids, and Fusion-in-Decoder for Cross-Lingual Question Answering |  | 0
Continual Learning for Temporal-Sensitive Question Answering |  | 0
Efficient Models for the Detection of Hate, Abuse and Profanity |  | 0
MICRON: Multigranular Interaction for Contextualizing RepresentatiON in Non-factoid Question Answering |  | 0
Microsoft AI Challenge India 2018: Learning to Rank Passages for Web Question Answering with Deep Attention Networks |  | 0
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models |  | 0
How Susceptible are LLMs to Influence in Prompts? |  | 0
Miko Team: Deep Learning Approach for Legal Question Answering in ALQAC 2022 |  | 0
Mildly Non-Projective Dependency Grammar |  | 0
How State-Of-The-Art Models Can Deal With Long-Form Question Answering |  | 0
Continual Domain Adaptation for Machine Reading Comprehension |  | 0
How Stable is Knowledge Base Knowledge? |  | 0
MIMOQA: Multimodal Input Multimodal Output Question Answering |  | 0
How Self-Attention Improves Rare Class Performance in a Question-Answering Dialogue Agent |  | 0
Mindful-RAG: A Study of Points of Failure in Retrieval Augmented Generation |  | 0
A Multithreaded Conversational Interface for Pedestrian Navigation and Question Answering |  | 0
Accounting for Sycophancy in Language Model Uncertainty Estimation |  | 0
Modern Question Answering Datasets and Benchmarks: A Survey |  | 0
Contingency and Comparison Relation Labeling and Structure Prediction in Chinese Sentences |  | 0
Page 126 of 217

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 |  | Unverified
2 | FPNet (ensemble) | EM | 90.87 |  | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 |  | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 |  | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 |  | Unverified
6 | FPNet (ensemble) | EM | 90.6 |  | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 |  | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 |  | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 |  | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 |  | Unverified