Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
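As a rough illustration of how EM and F1 are commonly computed for extractive question answering, here is a minimal sketch in the style of SQuAD evaluation. The function names (`normalize`, `exact_match`, `f1_score`) are hypothetical, and the normalization steps (lowercasing, stripping punctuation and English articles) are an assumption based on common practice; individual benchmarks may score answers differently.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; other benchmarks may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions, reporting the result as a percentage.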


Papers


Showing 426–450 of 10817 papers

Title	Date	Tasks	Status	Hype
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models	Nov 28, 2023	Image Captioning, Question Answering	Code Available	2
LLMGA: Multimodal Large Language Model based Generation Assistant	Nov 27, 2023	Image Generation, Language Modeling	Code Available	2
GeoChat: Grounded Large Vision-Language Model for Remote Sensing	Nov 24, 2023	Instruction Following, Language Modeling	Code Available	2
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design	Nov 23, 2023	Decision Making, Language Modelling	Code Available	2
PG-Video-LLaVA: Pixel Grounding Large Video-Language Models	Nov 22, 2023	Benchmarking, Phrase Grounding	Code Available	2
An Embodied Generalist Agent in 3D World	Nov 18, 2023	3D dense captioning, 3D Question Answering (3D-QA)	Code Available	2
Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional Training	Nov 15, 2023	Passage Retrieval, Position	Code Available	2
Learning to Filter Context for Retrieval-Augmented Generation	Nov 14, 2023	Extractive Question-Answering, Fact Verification	Code Available	2
Agent Lumos: Unified and Modular Training for Open-Source Language Agents	Nov 9, 2023	Math, Question Answering	Code Available	2
DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning	Oct 23, 2023	Language Modeling, Language Modelling	Code Available	2
Frozen Transformers in Language Models Are Effective Visual Encoder Layers	Oct 19, 2023	Action Recognition, Image-text Retrieval	Code Available	2
From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models	Oct 13, 2023	Hallucination, Image Captioning	Code Available	2
ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models	Oct 13, 2023	Knowledge Base Question Answering, Knowledge Graphs	Code Available	2
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models	Oct 12, 2023	Natural Language Understanding, Quantization	Code Available	2
Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models	Oct 11, 2023	Code Generation, Image Generation	Code Available	2
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning	Oct 10, 2023	Language Modeling, Language Modelling	Code Available	2
Compressing Context to Enhance Inference Efficiency of Large Language Models	Oct 9, 2023	Articles, Question Answering	Code Available	2
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models	Oct 6, 2023	Code Generation, Decision Making	Code Available	2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts	Oct 3, 2023	Chatbot, Image Captioning	Code Available	2
Representation Engineering: A Top-Down Approach to AI Transparency	Oct 2, 2023	Question Answering	Code Available	2
Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering	Sep 29, 2023	Image to text, Passage Retrieval	Code Available	2
StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding	Sep 20, 2023	Chart Question Answering, Chart Understanding	Code Available	2
Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite	Sep 15, 2023	Question Answering	Code Available	2
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following	Sep 1, 2023	3D Generation, 3D Question Answering (3D-QA)	Code Available	2
Knowledge Graph Prompting for Multi-Document Question Answering	Aug 22, 2023	graph construction, Open-Domain Question Answering	Code Available	2


Datasets: SQuAD2.0, SQuAD1.1, HotpotQA, PIQA, BoolQ, COPA, TriviaQA, SQuAD1.1 dev, Natural Questions, OpenBookQA, TruthfulQA, MultiRC

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified