Information Retrieval

Information retrieval is the task of ranking a list of documents or search results in response to a query.

Papers

Showing 51–100 of 4740 papers

Title	Date	Tasks	Status	Hype
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models	Oct 17, 2024	Contrastive Learning, Diversity	Code Available	2
Omnizart: A General Toolbox for Automatic Music Transcription	Jun 1, 2021	Chord Recognition, Downbeat Tracking	Code Available	2
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models	Jun 17, 2024	Benchmarking	Code Available	2
OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction	Sep 28, 2019	Information Retrieval, Question Answering	Code Available	2
CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs	Aug 29, 2023	CPU, GPU	Code Available	2
Multilingual Search with Subword TF-IDF	Sep 28, 2022	Information Retrieval, Retrieval	Code Available	2
Pyserini: An Easy-to-Use Python Toolkit to Support Replicable IR Research with Sparse and Dense Representations	Feb 19, 2021	Cultural Vocal Bursts Intensity Prediction, Information Retrieval	Code Available	2
Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions	Mar 1, 2025	Information Retrieval, RAG	Code Available	2
Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval	Mar 7, 2022	Information Retrieval, Passage Retrieval	Code Available	2
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models	Jul 3, 2024	Benchmarking, Code Search	Code Available	2
Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers	Mar 22, 2024	Information Retrieval	Code Available	2
RankZephyr: Effective and Robust Zero-Shot Listwise Reranking is a Breeze!	Dec 5, 2023	Information Retrieval, Reranking	Code Available	2
Multi-Interest Network with Dynamic Routing for Recommendation at Tmall	Apr 17, 2019	Clustering, Information Retrieval	Code Available	2
Mustango: Toward Controllable Text-to-Music Generation	Nov 14, 2023	Data Augmentation, Denoising	Code Available	2
PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents	Jul 12, 2024	Information Retrieval, Question Answering	Code Available	2
Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages	Oct 18, 2022	Information Retrieval, Retrieval	Code Available	2
SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking	Mar 2, 2025	Fact Checking, Fact Verification	Code Available	2
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models	Apr 17, 2021	Argument Retrieval, Benchmarking	Code Available	2
Language Model Powered Digital Biology with BRAD	Sep 4, 2024	Chatbot, Code Generation	Code Available	2
Lightning IR: Straightforward Fine-tuning and Inference of Transformer-based Language Models for Information Retrieval	Nov 7, 2024	Information Retrieval, Re-Ranking	Code Available	2
Melody transcription via generative pre-training	Dec 4, 2022	Chord Recognition, Information Retrieval	Code Available	2
Knowledge Representation Learning: A Quantitative Review	Dec 28, 2018	General Classification, Information Retrieval	Code Available	2
Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents	Apr 19, 2023	Information Retrieval, Passage Ranking	Code Available	2
LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant	Dec 2, 2024	Contrastive Learning, Information Retrieval	Code Available	2
InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval	Jul 10, 2023	GPU, Information Retrieval	Code Available	2
BIRB: A Generalization Benchmark for Information Retrieval in Bioacoustics	Dec 12, 2023	Information Retrieval, Representation Learning	Code Available	2
Autonomous GIS: the next-generation AI-powered GIS	May 10, 2023	Code Generation, Information Retrieval	Code Available	2
Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion	May 4, 2022	Information Retrieval, Knowledge Graph Completion	Code Available	2
InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval	Jan 4, 2023	Information Retrieval, Retrieval	Code Available	2
Large Language Models for Information Retrieval: A Survey	Aug 14, 2023	Information Retrieval, Question Answering	Code Available	2
Infinite Recommendation Networks: A Data-Centric Approach	Jun 3, 2022	Information Retrieval, Recommendation Systems	Code Available	2
InPars: Data Augmentation for Information Retrieval using Large Language Models	Feb 10, 2022	Data Augmentation, Diversity	Code Available	2
MemLong: Memory-Augmented Retrieval for Long Text Modeling	Aug 30, 2024	4k, Decoder	Code Available	2
FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search	May 20, 2021	Information Retrieval, Retrieval	Code Available	2
FIRST: Faster Improved Listwise Reranking with Single Token Decoding	Jun 21, 2024	Information Retrieval, Language Modeling	Code Available	2
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval	Mar 7, 2025	Information Retrieval, Language Modeling	Code Available	2
Autoregressive Search Engines: Generating Substrings as Document Identifiers	Apr 22, 2022	Information Retrieval, Retrieval	Code Available	2
Backtracing: Retrieving the Cause of the Query	Mar 6, 2024	Information Retrieval, Language Modeling	Code Available	2
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions	Mar 22, 2024	Information Retrieval, Retrieval	Code Available	2
GENIUS: A Generative Framework for Universal Multimodal Search	Mar 25, 2025	Information Retrieval, Quantization	Code Available	2
Atlas: Few-shot Learning with Retrieval Augmented Language Models	Aug 5, 2022	Fact Checking, Few-Shot Learning	Code Available	2
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions	Nov 9, 2023	Hallucination, Information Retrieval	Code Available	2
FinBERT-QA: Financial Question Answering with pre-trained BERT Language Models	Apr 24, 2025	Answer Selection, Information Retrieval	Code Available	2
MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval	Jul 2, 2023	Biomedical Information Retrieval, Contrastive Learning	Code Available	2
GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music	Oct 11, 2020	Information Retrieval, Music Information Retrieval	Code Available	2
Differential Transformer	Oct 7, 2024	Hallucination, In-Context Learning	Code Available	2
All-In-One Metrical And Functional Structure Analysis With Neighborhood Attentions on Demixed Audio	Jul 31, 2023	All, Downbeat Tracking	Code Available	2
AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval	Apr 9, 2024	All, Information Retrieval	Code Available	2
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark	Dec 17, 2024	Information Retrieval, Retrieval	Code Available	2
Eureka: Evaluating and Understanding Large Foundation Models	Sep 13, 2024	Information Retrieval	Code Available	2

Datasets: BSARD, MS MARCO, CQADupStack, TREC-PM, Amazon, MSLR WEB30K, MSMARCO, MTEB, News Headlines, Ohsumed

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Two-tower Bi-Encoder (RoBERTa)	Recall@100	74.78	—	Unverified
2	Siamese Bi-Encoder (RoBERTa)	Recall@100	71.63	—	Unverified
3	BM25	Recall@100	51.33	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RetroMAE v2	MRR@10	42.58	—	Unverified
2	ConAE-256	Time (ms)	0.33	—	Unverified
3	ConAE-128	Time (ms)	0.32	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SGPT-BE-5.8B	mAP@100	0.16	—	Unverified
2	TSDAE	mAP@100	0.15	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	hpipubcommon	infNDCG	0.56	—	Unverified
2	hpictall	infNDCG	0.55	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MIND	HR@30	0.32	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Distilled Network	nDCG@10	0.53	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RetroMAE	MRR@10	0.42	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SGPT-5.8B-msmarco	nDCG@10	50.25	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Information Retrieval + SVM	1:1 Accuracy	83.79	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT+CONCEPT FILTER	NDCG	0.25	—	Unverified
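
The tables above report ranking metrics such as Recall@100, MRR@10, and nDCG@10 without defining them. The sketch below shows one common way to compute each for a single query. It is an illustration only: the function names, the toy document IDs, and the linear-gain form of nDCG are assumptions, not the evaluation code behind any listed result. The functions return values between 0 and 1; some claimed values above appear to be on a 0 to 100 scale (for example, MRR@10 is listed as both 42.58 and 0.42), which would correspond to multiplying by 100.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, gains, k):
    """nDCG with linear gain: DCG of the ranking divided by the ideal DCG."""
    def dcg(values):
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(doc, 0.0) for doc in ranked[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


# Hypothetical single-query example: a system ranking and graded judgments.
ranked = ["d3", "d1", "d7", "d2"]
gains = {"d1": 2.0, "d2": 1.0, "d9": 1.0}  # judged relevance grades
relevant = set(gains)                       # binary view of the same judgments

print(recall_at_k(ranked, relevant, 3))  # 1 of 3 relevant documents in the top 3
print(mrr_at_k(ranked, relevant, 10))    # first relevant document is at rank 2
print(ndcg_at_k(ranked, gains, 4))
```

A reported benchmark score is typically the mean of the per-query value over all queries in the test set.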