Information Retrieval

Information retrieval is the task of ranking a list of documents or search results in response to a query.
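As a concrete illustration of ranking documents for a query, the sketch below scores a toy collection with BM25, the lexical baseline that also appears in the benchmark results on this page. The function name `bm25_rank`, the whitespace tokenizer, the toy documents, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration; none of them come from a listed paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; real systems do more).
    return text.lower().split()


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return document indices ordered from most to least relevant."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for d in tokenized for term in set(d))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            # Length normalization penalizes longer-than-average documents.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


docs = [
    "dense retrieval with bi-encoders",
    "late interaction retrieval engine",
    "audio compression with autoencoders",
]
ranking = bm25_rank("late interaction retrieval", docs)
print(ranking)  # [1, 0, 2]
```

The second document matches all three query terms and ranks first; the third shares no terms with the query and ranks last.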

Papers

Showing 1–50 of 4740 papers

Title	Date	Tasks	Status	Hype	Score
LightRAG: Simple and Fast Retrieval-Augmented Generation	Oct 8, 2024	Information Retrieval, RAG	Code Available	14	5
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher	Jul 29, 2024	2D Semantic Segmentation task 1 (8 classes), graph construction	Code Available	9	5
Language agents achieve superhuman synthesis of scientific knowledge	Sep 10, 2024	Articles, Information Retrieval	Code Available	9	5
PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods	Jul 9, 2024	Information Retrieval, LEMMA	Code Available	7	5
Retrieval-Augmented Generation for AI-Generated Content: A Survey	Feb 29, 2024	Information Retrieval, Large Language Model	Code Available	5	5
Make Your LLM Fully Utilize the Context	Apr 25, 2024	4k, Information Retrieval	Code Available	5	5
Benchmarking the Myopic Trap: Positional Bias in Information Retrieval	May 20, 2025	Benchmarking, Information Retrieval	Code Available	5	5
Extreme Compression of Large Language Models via Additive Quantization	Jan 11, 2024	CPU, GPU	Code Available	5	5
Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard	Jun 13, 2023	Information Retrieval, Representation Learning	Code Available	4	5
SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes	Feb 13, 2023	Information Retrieval, Retrieval	Code Available	4	5
PLAID: An Efficient Engine for Late Interaction Retrieval	May 19, 2022	CPU, GPU	Code Available	4	5
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents	Jun 13, 2025	Information Retrieval, Retrieval	Code Available	4	5
From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents	Jun 23, 2025	Information Retrieval, Retrieval	Code Available	4	5
COS-Mix: Cosine Similarity and Distance Fusion for Improved Information Retrieval	Jun 2, 2024	Information Retrieval, RAG	Code Available	4	5
DeepRetrieval: Hacking Real Search Engines and Retrievers with Large Language Models via Reinforcement Learning	Feb 28, 2025	Information Retrieval, reinforcement-learning	Code Available	4	5
One Embedder, Any Task: Instruction-Finetuned Text Embeddings	Dec 19, 2022	Information Retrieval, Learning Word Embeddings	Code Available	4	5
Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models	Mar 11, 2025	Form, Information Retrieval	Code Available	4	5
iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models	Sep 5, 2024	Few-Shot Learning, Information Retrieval	Code Available	4	5
Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation	Aug 8, 2024	Chunking, Fact Checking	Code Available	4	5
Benchmarking Retrieval-Augmented Generation for Medicine	Feb 20, 2024	Benchmarking, Information Retrieval	Code Available	4	5
ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge	Mar 24, 2023	Information Retrieval, Language Modeling	Code Available	4	5
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation	Feb 4, 2025	Benchmarking, Information Retrieval	Code Available	4	5
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators	Mar 29, 2023	Information Retrieval, Retrieval	Code Available	4	5
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis	May 22, 2025	Diversity, Information Retrieval	Code Available	4	5
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function	May 26, 2023	Fact Verification, Information Retrieval	Code Available	4	5
MTEB: Massive Text Embedding Benchmark	Oct 13, 2022	Benchmarking, Information Retrieval	Code Available	4	5
From Matching to Generation: A Survey on Generative Information Retrieval	Apr 23, 2024	Incremental Learning, Information Retrieval	Code Available	3	5
When Large Language Models Meet Vector Databases: A Survey	Jan 30, 2024	Hallucination, Information Retrieval	Code Available	3	5
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia	May 23, 2023	Chatbot, Hallucination	Code Available	3	5
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective	Jul 9, 2024	Information Retrieval, Retrieval	Code Available	3	5
Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search	May 21, 2025	Information Retrieval	Code Available	3	5
Dataset and Baseline System for Multi-lingual Extraction and Normalization of Temporal and Numerical Expressions	Mar 31, 2023	Date Understanding, Information Retrieval	Code Available	3	5
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation	Mar 8, 2024	Code Generation, Hallucination	Code Available	3	5
Music2Latent: Consistency Autoencoders for Latent Audio Compression	Aug 12, 2024	Audio Compression, Information Retrieval	Code Available	3	5
BMX: Entropy-weighted Similarity and Semantic-enhanced Lexical Search	Aug 13, 2024	Information Retrieval, Retrieval	Code Available	3	5
Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval	Feb 17, 2025	Information Retrieval, Retrieval	Code Available	3	5
REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites	Apr 15, 2025	Autonomous Web Navigation, Benchmarking	Code Available	3	5
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers	May 26, 2025	Information Retrieval	Code Available	3	5
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning	Jan 12, 2024	Diversity, document understanding	Code Available	3	5
A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications	Jun 14, 2025	Information Retrieval, Survey	Code Available	3	5
MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels	May 13, 2024	Information Retrieval, Retrieval	Code Available	3	5
ReasonIR: Training Retrievers for Reasoning Tasks	Apr 29, 2025	Information Retrieval, MMLU	Code Available	3	5
FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search	May 20, 2021	Information Retrieval, Retrieval	Code Available	2	5
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark	Dec 17, 2024	Information Retrieval, Retrieval	Code Available	2	5
FIRST: Faster Improved Listwise Reranking with Single Token Decoding	Jun 21, 2024	Information Retrieval, Language Modeling	Code Available	2	5
FinBERT-QA: Financial Question Answering with pre-trained BERT Language Models	Apr 24, 2025	Answer Selection, Information Retrieval	Code Available	2	5
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions	Mar 22, 2024	Information Retrieval, Retrieval	Code Available	2	5
Evaluation of Retrieval-Augmented Generation: A Survey	May 13, 2024	Information Retrieval, RAG	Code Available	2	5
Eureka: Evaluating and Understanding Large Foundation Models	Sep 13, 2024	Information Retrieval	Code Available	2	5
A Foundation Model for Music Informatics	Nov 6, 2023	Information Retrieval, model	Code Available	2	5

Benchmark Results

BSARD

#	Model	Metric	Claimed	Verified	Status
1	Two-tower Bi-Encoder (RoBERTa)	Recall@100	74.78	—	Unverified
2	Siamese Bi-Encoder (RoBERTa)	Recall@100	71.63	—	Unverified
3	BM25	Recall@100	51.33	—	Unverified

MS MARCO

#	Model	Metric	Claimed	Verified	Status
1	RetroMAE v2	MRR@10	42.58	—	Unverified
2	ConAE-256	Time (ms)	0.33	—	Unverified
3	ConAE-128	Time (ms)	0.32	—	Unverified

CQADupStack

#	Model	Metric	Claimed	Verified	Status
1	SGPT-BE-5.8B	mAP@100	0.16	—	Unverified
2	TSDAE	mAP@100	0.15	—	Unverified

TREC-PM

#	Model	Metric	Claimed	Verified	Status
1	hpipubcommon	infNDCG	0.56	—	Unverified
2	hpictall	infNDCG	0.55	—	Unverified

Amazon

#	Model	Metric	Claimed	Verified	Status
1	MIND	HR@30	0.32	—	Unverified

MSLR WEB30K

#	Model	Metric	Claimed	Verified	Status
1	Distilled Network	nDCG@10	0.53	—	Unverified

MSMARCO

#	Model	Metric	Claimed	Verified	Status
1	RetroMAE	MRR@10	0.42	—	Unverified

MTEB

#	Model	Metric	Claimed	Verified	Status
1	SGPT-5.8B-msmarco	nDCG@10	50.25	—	Unverified

News Headlines

#	Model	Metric	Claimed	Verified	Status
1	Information Retrieval + SVM	1:1 Accuracy	83.79	—	Unverified

Ohsumed

#	Model	Metric	Claimed	Verified	Status
1	BERT+CONCEPT FILTER	NDCG	0.25	—	Unverified
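
Several of the metrics reported above (Recall@k, MRR@k, nDCG@k) can be computed from a ranked list and a set of relevance judgments. The following is a minimal sketch under stated assumptions: the function names and the toy document ids are hypothetical, nDCG uses the linear-gain form of DCG (some evaluations use an exponential gain instead), and MRR is taken as the mean of the per-query reciprocal rank shown here. The claimed scores appear on different scales (for example 0.42 and 42.58 for MRR@10), so some entries are likely percentages; this sketch returns values in the 0 to 1 range.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / rank of the first relevant document in the top k, else 0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, gains, k):
    """nDCG@k with linear gain; `gains` maps doc id to graded relevance."""
    dcg = sum(
        gains.get(doc, 0) / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: hypothetical document ids for a single query.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
gains = {"d1": 1, "d2": 1}

recall = recall_at_k(ranked, relevant, k=3)          # 0.5
rr = reciprocal_rank_at_k(ranked, relevant, k=10)    # 0.5
ndcg = ndcg_at_k(ranked, gains, k=4)
```

Averaging `reciprocal_rank_at_k` over all queries in a test set gives MRR@k; the other two metrics are likewise averaged per query before being reported.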