SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
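As a rough illustration of the EM and F1 metrics mentioned above, the sketch below scores a predicted answer string against one reference answer. It assumes the normalization conventions commonly used for SQuAD-style evaluation (lowercasing, removing punctuation and English articles, collapsing whitespace); the function names are illustrative and do not come from any particular benchmark's official scorer.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (assumed convention)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, a common choice is to take the maximum score over the references and then average over all questions; benchmarks may differ in these details, so the official scorer for a given dataset should be preferred when comparing against published numbers.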


Papers

Showing 10601-10650 of 10817 papers

Title | Status | Hype
Towards Creation of a Corpus for Argumentation Mining the Biomedical Genetics Research Literature | | 0
Towards Data Distillation for End-to-end Spoken Conversational Question Answering | | 0
Data Poisoning Attack against Knowledge Graph Embedding | | 0
Toward Deconfounding the Influence of Entity Demographics for Question Answering Accuracy | | 0
Towards Deep Learning in Hindi NER: An approach to tackle the Labelled Data Sparsity | | 0
Towards Developing a Multilingual and Code-Mixed Visual Question Answering System by Knowledge Distillation | | 0
Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models | | 0
Towards Differential Relational Privacy and its use in Question Answering | | 0
Towards Domain Adaptation from Limited Data for Question Answering Using Deep Neural Networks | | 0
Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking | | 0
Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques | | 0
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering | | 0
Towards Faithful Response Generation for Chinese Table Question Answering | | 0
Towards Few-Shot Fact-Checking via Perplexity | | 0
Towards Fine-Grained Video Question Answering | | 0
Towards Generalist Biomedical AI | | 0
Towards Generalizable Methods for Automating Risk Score Calculation | | 0
Towards Generalizable Neuro-Symbolic Systems for Commonsense Question Answering | | 0
AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs | | 0
Towards General Purpose Vision Systems: An End-to-End Task-Agnostic Vision-Language Architecture | | 0
Towards Graph-hop Retrieval and Reasoning in Complex Question Answering over Textual Database | | 0
Towards Graph Prompt Learning: A Survey and Beyond | | 0
Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models | | 0
Towards Harnessing Memory Networks for Coreference Resolution | | 0
Towards Human-Level Understanding of Complex Process Engineering Schematics: A Pedagogical, Introspective Multi-Agent Framework for Open-Domain Question Answering | | 0
Towards Identifying Hindi/Urdu Noun Templates in Support of a Large-Scale LFG Grammar | | 0
Towards Investigating Biases in Spoken Conversational Search | | 0
Towards Knowledge Graphs Validation through Weighted Knowledge Sources | | 0
Towards leveraging latent knowledge and Dialogue context for real-world conversational question answering | | 0
Towards leveraging LLMs for Conditional QA | | 0
Towards Loosely-Coupling Knowledge Graph Embeddings and Ontology-based Reasoning | | 0
Towards Mitigating Hallucination in Large Language Models via Self-Reflection | | 0
Towards Model Driven Architectures for Human Language Technologies | | 0
Towards Models that Can See and Read | | 0
Towards Monetary Incentives in Social Q&A Services | | 0
Towards Multilingual LLM Evaluation for Baltic and Nordic languages: A study on Lithuanian History | | 0
Towards Natural Language Question Answering over Earth Observation Linked Data using Attention-based Neural Machine Translation | | 0
Towards Omnidirectional Reasoning with 360-R1: A Dataset, Benchmark, and GRPO-based Method | | 0
Towards Ontologically Grounded and Language-Agnostic Knowledge Graphs | | 0
Towards Optimisation of Collaborative Question Answering over Knowledge Graphs | | 0
Towards Optimizing the Costs of LLM Usage | | 0
Towards Personalized Explanation of Robot Path Planning via User Feedback | | 0
Pregnant Questions: The Importance of Pragmatic Awareness in Maternal Health Question Answering | | 0
Towards Probabilistic Question Answering Over Tabular Data | | 0
Towards Query Logs for Privacy Studies: On Deriving Search Queries from Questions | | 0
Towards Question Format Independent Numerical Reasoning: A Set of Prerequisite Tasks | | 0
Towards Reasoning-Aware Explainable VQA | | 0
Towards Reliable Medical Question Answering: Techniques and Challenges in Mitigating Hallucinations in Language Models | | 0
Towards Retrieval Augmented Generation over Large Video Libraries | | 0
Towards Robust Evaluation: A Comprehensive Taxonomy of Datasets and Metrics for Open Domain Question Answering in the Era of Large Language Models | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified