Retrieval

Retrieval is the task of selecting relevant data or examples from a large collection to support prediction, learning, or inference. It improves models by supplying context or additional information at query time, and it underlies systems such as retrieval-augmented generation and in-context learning.
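
As a rough illustration of the definition above, the sketch below ranks a small corpus against a query and places the top passages into a prompt, in the spirit of retrieval-augmented generation. The scoring function (bag-of-words cosine similarity), the function names `retrieve` and `build_prompt`, and the sample corpus are assumptions made for this example; systems listed on this page typically use BM25 or learned dense encoders instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return up to k passages with non-zero similarity, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), i) for i, doc in enumerate(corpus)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken by corpus order
    return [corpus[i] for score, i in scored[:k] if score > 0]


def build_prompt(query, corpus, k=2):
    """Hypothetical prompt layout: retrieved context followed by the question."""
    context = "\n".join(f"- {p}" for p in retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    corpus = [
        "BM25 is a lexical ranking function.",
        "Diffusion models generate images.",
        "Dense retrieval encodes queries and documents as vectors.",
    ]
    print(build_prompt("What is BM25 ranking?", corpus))
```

Only passages that share at least one term with the query are returned, so an unrelated passage never reaches the prompt even when `k` is larger than the number of matches.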

Papers

Showing 201–250 of 14297 papers

Title	Date	Tasks	Status	Hype
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning	Jan 12, 2024	Diversity, document understanding	Code Available	3
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases	Apr 19, 2024	Benchmarking, Retrieval	Code Available	3
InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval	Jan 4, 2023	Information Retrieval, Retrieval	Code Available	2
Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts	Feb 24, 2025	Benchmarking, Fact Verification	Code Available	2
InPars: Data Augmentation for Information Retrieval using Large Language Models	Feb 10, 2022	Data Augmentation, Diversity	Code Available	2
InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval	Jul 10, 2023	GPU, Information Retrieval	Code Available	2
INQUIRE: A Natural World Text-to-Image Retrieval Benchmark	Nov 4, 2024	Image Retrieval, Reranking	Code Available	2
Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models	Jan 27, 2024	Medical Question Answering, Multiple-choice	Code Available	2
Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing	Jul 1, 2024	Denoising, Image Restoration	Code Available	2
Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning	Jan 25, 2025	Answer Generation, Multi-agent Reinforcement Learning	Code Available	2
Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion	May 4, 2022	Information Retrieval, Knowledge Graph Completion	Code Available	2
In-Context Retrieval-Augmented Language Models	Jan 31, 2023	Language Modeling, Language Modelling	Code Available	2
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales	Jun 19, 2024	Denoising, In-Context Learning	Code Available	2
HourVideo: 1-Hour Video-Language Understanding	Nov 7, 2024	Benchmarking, counterfactual	Code Available	2
Hopfield Networks is All You Need	Jul 16, 2020	All, Drug Design	Code Available	2
How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions	Jul 6, 2024	Question Answering, RAG	Code Available	2
Autonomous GIS: the next-generation AI-powered GIS	May 10, 2023	Code Generation, Information Retrieval	Code Available	2
AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML	Oct 3, 2024	AutoML, Code Generation	Code Available	2
HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation	Apr 13, 2025	Multimodal Reasoning, RAG	Code Available	2
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation	May 22, 2024	Informativeness, Language Modeling	Code Available	2
Grounding Language Models to Images for Multimodal Inputs and Outputs	Jan 31, 2023	Image Retrieval, In-Context Learning	Code Available	2
Global Features are All You Need for Image Retrieval and Reranking	Aug 14, 2023	All, Image Retrieval	Code Available	2
GLAP: General contrastive audio-text pretraining across domains and languages	Jun 12, 2025	AudioCaps, Keyword Spotting	Code Available	2
GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music	Oct 11, 2020	Information Retrieval, Music Information Retrieval	Code Available	2
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval	Jul 17, 2024	Decoder, Image Enhancement	Code Available	2
Hello Again! LLM-powered Personalized Agent for Long-term Dialogue	Jun 9, 2024	Response Generation, Retrieval	Code Available	2
Huatuo-26M, a Large-scale Chinese Medical QA Dataset	May 2, 2023	Language Modeling, Language Modelling	Code Available	2
Interactive Continual Learning: Fast and Slow Thinking	Mar 5, 2024	Continual Learning, Outlier Detection	Code Available	2
Generalized Contrastive Learning for Multi-Modal Retrieval and Ranking	Apr 12, 2024	Contrastive Learning, Retrieval	Code Available	2
Generating Benchmarks for Factuality Evaluation of Language Models	Jul 13, 2023	Language Modeling, Language Modelling	Code Available	2
GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information	Apr 19, 2023	In-Context Learning, Retrieval	Code Available	2
Generating Images with Multimodal Language Models	May 26, 2023	Decoder, Image Generation	Code Available	2
VeCLIP: Improving CLIP Training via Visual-enriched Captions	Oct 11, 2023	Image-text Retrieval, Retrieval	Code Available	2
FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search	May 20, 2021	Information Retrieval, Retrieval	Code Available	2
FLAIR: VLM with Fine-grained Language-informed Image Representations	Dec 4, 2024	Language Modeling, Language Modelling	Code Available	2
Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering	Sep 29, 2023	Image to text, Passage Retrieval	Code Available	2
Autoregressive Search Engines: Generating Substrings as Document Identifiers	Apr 22, 2022	Information Retrieval, Retrieval	Code Available	2
Fine-grained Image Captioning with CLIP Reward	May 26, 2022	Caption Generation, Descriptive	Code Available	2
Flow-Guided Transformer for Video Inpainting	Aug 14, 2022	Retrieval, Video Inpainting	Code Available	2
A Survey of Personalization: From RAG to Agent	Apr 14, 2025	RAG, Retrieval	Code Available	2
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs	May 16, 2025	Retrieval	Code Available	2
AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models	Nov 28, 2024	Audio captioning, Audio to Text Retrieval	Code Available	2
Backtracing: Retrieving the Cause of the Query	Mar 6, 2024	Information Retrieval, Language Modeling	Code Available	2
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models	Apr 17, 2021	Argument Retrieval, Benchmarking	Code Available	2
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval	Jan 2, 2021	Claim Verification, Question Answering	Code Available	2
BEBLID: Boosted efficient binary local image descriptor	Feb 7, 2024	Computational Efficiency, Retrieval	Code Available	2
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval	Mar 7, 2025	Information Retrieval, Language Modeling	Code Available	2
Benchmarking Large Language Models in Retrieval-Augmented Generation	Sep 4, 2023	Benchmarking, counterfactual	Code Available	2
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions	Nov 9, 2023	Hallucination, Information Retrieval	Code Available	2
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions	Mar 22, 2024	Information Retrieval, Retrieval	Code Available	2

Datasets: Quora Question Pairs, HotpotQA, Natural Questions, OK-VQA, InfoSeek, MVK, Polyvore, PubMedQA, PubMedQA corpus with metadata, ToolLens

Benchmark Results

Dataset: Quora Question Pairs
#	Model	Metric	Claimed	Verified	Status
1	BM25S	Queries per second	183.53	—	Unverified
2	Elasticsearch	Queries per second	21.8	—	Unverified
3	BM25-PT	Queries per second	6.49	—	Unverified
4	Rank-BM25	Queries per second	1.18	—	Unverified

Dataset: HotpotQA
#	Model	Metric	Claimed	Verified	Status
1	BM25S	Queries per second	20.88	—	Unverified
2	Elasticsearch	Queries per second	7.11	—	Unverified
3	Rank-BM25	Queries per second	0.04	—	Unverified

Dataset: Natural Questions
#	Model	Metric	Claimed	Verified	Status
1	BM25S	Queries per second	41.85	—	Unverified
2	Elasticsearch	Queries per second	12.16	—	Unverified
3	Rank-BM25	Queries per second	0.1	—	Unverified

Dataset: OK-VQA
#	Model	Metric	Claimed	Verified	Status
1	FLMR	Recall@5	89.32	—	Unverified
2	RA-VQA	Recall@5	82.84	—	Unverified

Dataset: InfoSeek
#	Model	Metric	Claimed	Verified	Status
1	PreFLMR	Recall@5	62.1	—	Unverified

Dataset: MVK
#	Model	Metric	Claimed	Verified	Status
1	CLIP-KIS	text-to-video Mean Rank	30	—	Unverified

Dataset: Polyvore
#	Model	Metric	Claimed	Verified	Status
1	CLIP4Outfit	Recall@5	7.59	—	Unverified

Dataset: PubMedQA
#	Model	Metric	Claimed	Verified	Status
1	MetaGen Blended RAG	Accuracy (Top-1)	82.1	—	Unverified

Dataset: PubMedQA corpus with metadata
#	Model	Metric	Claimed	Verified	Status
1	MetaGen Blended RAG	Accuracy (Top-1)	82.1	—	Unverified

Dataset: ToolLens
#	Model	Metric	Claimed	Verified	Status
1	COLT	COMP@	84.55	—	Unverified

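
Several of the tables above report Recall@5. The sketch below shows one common way such a figure is computed: the fraction of relevant items that appear in the top k results, averaged over queries and scaled to a percentage. This is an assumption for illustration only; individual papers may use a different variant (for example, counting a query as a hit when any top-k passage contains the answer), and the function names and document ids here are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant ids found among the first k ranked ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # convention chosen for this sketch: no relevant items scores 0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(runs, k=5):
    """Average Recall@k over (ranked_ids, relevant_ids) pairs, as a percentage."""
    scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return 100.0 * sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    runs = [
        # "d4" is relevant but ranked sixth, so only one of two items is recalled.
        (["d3", "d1", "d7", "d2", "d9", "d4"], {"d1", "d4"}),
        (["a", "b"], {"a"}),
    ]
    print(mean_recall_at_k(runs, k=5))
```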