SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
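
As a rough illustration of the two metrics named above, the sketch below computes exact match and token-level F1 for one prediction against one reference answer. It assumes the answer normalization commonly used in SQuAD-style evaluation (lowercasing, removing punctuation and English articles, collapsing whitespace). The function names are illustrative, and individual benchmarks may normalize or aggregate differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors common SQuAD-style normalization; other
    benchmarks may use different rules.
    """
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "Eiffel Tower"))
    print(f1_score("Eiffel Tower in Paris", "the Eiffel Tower"))
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions. That aggregation step is not shown here.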


Papers

Showing 9801-9850 of 10817 papers

Title | Status | Hype
Winnowing Knowledge for Multi-choice Question Answering | | 0
WISDOM X, DISAANA and D-SUMM: Large-scale NLP Systems for Analyzing Textual Big Data | | 0
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models | | 0
WixQA: A Multi-Dataset Benchmark for Enterprise Retrieval-Augmented Generation | | 0
Wizard of Tasks: A Novel Conversational Dataset for Solving Real-World Tasks in Conversational Settings | | 0
WoLF: Wide-scope Large Language Model Framework for CXR Understanding | | 0
Word and Phrase Features in Graph Convolutional Network for Automatic Question Classification | | 0
Word Clustering Based on Un-LP Algorithm | | 0
Word Embedding based Correlation Model for Question/Answer Matching | | 0
Word Embedding-based Text Processing for Comprehensive Summarization and Distinct Information Extraction | | 0
Word Embeddings as Features for Supervised Coreference Resolution | | 0
WordNet---Wikipedia---Wiktionary: Construction of a Three-way Alignment | | 0
Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond | | 0
Word Similarity Datasets for Indian Languages: Annotation and Baseline Systems | | 0
Worldly Wise (WoW) - Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering | | 0
WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-Hop Inference | | 0
WorldTree V2: A Corpus of Science-Domain Structured Explanations and Inference Patterns supporting Multi-Hop Inference | | 0
Writing your own book: A method for going from closed to open book QA to improve robustness and performance of smaller LLMs | | 0
WuDaoMM: A large-scale Multi-Modal Dataset for Pre-training models | | 0
XAI-CLASS: Explanation-Enhanced Text Classification with Extremely Weak Supervision | | 0
XAIQA: Explainer-Based Data Augmentation for Extractive Question Answering | | 0
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference | | 0
X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects | | 0
XF2T: Cross-lingual Fact-to-Text Generation for Low-Resource Languages | | 0
xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs | | 0
xGQA: Cross-Lingual Visual Question Answering | | 0
XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering | | 0
xLiD-Lexica: Cross-lingual Linked Data Lexica | | 0
XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-based Textual Knowledge Source | | 0
XLTime: A Cross-Lingual Knowledge Transfer Framework for Zero-Shot Low-Resource Language Temporal Expression Extraction | | 0
XLTime: A Cross-Lingual Knowledge Transfer Framework for Temporal Expression Extraction | | 0
xMoCo: Cross Momentum Contrastive Learning for Open-Domain Question Answering | | 0
XTE: Explainable Text Entailment | | 0
X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model | | 0
YA-TA: Towards Personalized Question-Answering Teaching Assistants using Instructor-Student Dual Retrieval-augmented Knowledge Fusion | | 0
YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering | | 0
Yet Another Language Identifier | | 0
yiGou: A Semantic Text Similarity Computing System Based on SVM | | 0
Yin and Yang: Balancing and Answering Binary Visual Questions | | 0
YNUDLG at IJCNLP-2017 Task 5: A CNN-LSTM Model with Attention for Multi-choice Question Answering in Examinations | | 0
YNU-HPCC at IJCNLP-2017 Task 4: Attention-based Bi-directional GRU Model for Customer Feedback Analysis Task of English | | 0
YNU-HPCC at IJCNLP-2017 Task 5: Multi-choice Question Answering in Exams Using an Attention-based LSTM Model | | 0
YNU-HPCC at Semeval-2018 Task 11: Using an Attention-based CNN-LSTM for Machine Comprehension using Commonsense Knowledge | | 0
YNU-HPCC at SemEval-2018 Task 12: The Argument Reasoning Comprehension Task Using a Bi-directional LSTM with Attention Model | | 0
You Can Do Better! If You Elaborate the Reason When Making Prediction | | 0
YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension | | 0
You Only Need One Model for Open-domain Question Answering | | 0
Your Finetuned Large Language Model is Already a Powerful Out-of-distribution Detector | | 0
ZALM3: Zero-Shot Enhancement of Vision-Language Alignment via In-Context Information in Multi-Turn Multimodal Medical Dialogue | | 0
Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression | | 0
Page 197 of 217

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified