SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
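As an illustration of the EM and F1 metrics mentioned above, here is a minimal sketch for a single prediction and reference answer. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) and whitespace tokenization; other benchmarks may normalize differently, and the function names are hypothetical rather than taken from any official evaluation script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors SQuAD-style normalization; it is not universal.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most
    # as often as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "Eiffel Tower in Paris" against the reference "Eiffel Tower" gets an EM of 0 but a nonzero F1, because two of its four tokens overlap with the reference. Dataset-level scores are usually the average over all questions, often taking the maximum over several reference answers per question.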


Papers

Showing 7026-7050 of 10817 papers

Title | Status | Hype
A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT | | 0
Hanfu-Bench: A Multimodal Benchmark on Cross-Temporal Cultural Understanding and Transcreation | | 0
OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context | | 0
Accelerating Manufacturing Scale-Up from Material Discovery Using Agentic Web Navigation and Retrieval-Augmented AI for Process Engineering Schematics Design | | 0
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization | | 0
Opinion Holder and Target Extraction on Opinion Compounds – A Linguistic Approach | | 0
Evaluation of ChatGPT on Biomedical Tasks: A Zero-Shot Comparison with Fine-Tuned Generative Transformers | | 0
On-Demand Distributional Semantic Distance and Paraphrasing | | 0
On-demand Injection of Lexical Knowledge for Recognising Textual Entailment | | 0
Handling Multiword Expressions in Causality Estimation | | 0
OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities | | 0
Handling Anomalies of Synthetic Questions in Unsupervised Question Answering | | 0
Hand in Glove: Deep Feature Fusion Network Architectures for Answer Quality Prediction in Community Question Answering | | 0
HAMMR: HierArchical MultiModal React agents for generic VQA | | 0
A Supervised Approach for Enriching the Relational Structure of Frame Semantics in FrameNet | | 0
A Multi-answer Multi-task Framework for Real-world Machine Reading Comprehension | | 0
Evaluation of medium-large Language Models at zero-shot closed book generative question answering | | 0
Opinion Mining with Deep Recurrent Neural Networks | | 0
OneStop QAMaker: Extract Question-Answer Pairs from Text in a One-Stop Approach | | 0
Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering | | 0
On Evaluating Embedding Models for Knowledge Base Completion | | 0
On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering | | 0
One Vector is Not Enough: Entity-Augmented Distributed Semantics for Discourse Relations | | 0
Open-Vocabulary Functional 3D Scene Graphs for Real-World Indoor Spaces | | 0
ConSens: Assessing context grounding in open-book question answering | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified