| Melody transcription via generative pre-training | Dec 4, 2022 | Chord Recognition, Information Retrieval | Code Available | 2 |
| RetroMAE v2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models | Nov 16, 2022 | Dimensionality Reduction, Information Retrieval | Code Available | 2 |
| Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages | Oct 18, 2022 | Information Retrieval, Retrieval | Code Available | 2 |
| Multilingual Search with Subword TF-IDF | Sep 28, 2022 | Information Retrieval, Retrieval | Code Available | 2 |
| Atlas: Few-shot Learning with Retrieval Augmented Language Models | Aug 5, 2022 | Fact Checking, Few-Shot Learning | Code Available | 2 |
| Infinite Recommendation Networks: A Data-Centric Approach | Jun 3, 2022 | Information Retrieval, Recommendation Systems | Code Available | 2 |
| RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder | May 24, 2022 | Decoder, Information Retrieval | Code Available | 2 |
| Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion | May 4, 2022 | Information Retrieval, Knowledge Graph Completion | Code Available | 2 |
| Autoregressive Search Engines: Generating Substrings as Document Identifiers | Apr 22, 2022 | Information Retrieval, Retrieval | Code Available | 2 |
| Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval | Mar 7, 2022 | Information Retrieval, Passage Retrieval | Code Available | 2 |
| SGPT: GPT Sentence Embeddings for Semantic Search | Feb 17, 2022 | Argument Retrieval, Biomedical Information Retrieval | Code Available | 2 |
| InPars: Data Augmentation for Information Retrieval using Large Language Models | Feb 10, 2022 | Data Augmentation, Diversity | Code Available | 2 |
| ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction | Dec 2, 2021 | Information Retrieval, Open-Domain Question Answering | Code Available | 2 |
| Omnizart: A General Toolbox for Automatic Music Transcription | Jun 1, 2021 | Chord Recognition, Downbeat Tracking | Code Available | 2 |
| FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search | May 20, 2021 | Information Retrieval, Retrieval | Code Available | 2 |
| BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models | Apr 17, 2021 | Argument Retrieval, Benchmarking | Code Available | 2 |
| Pyserini: An Easy-to-Use Python Toolkit to Support Replicable IR Research with Sparse and Dense Representations | Feb 19, 2021 | Cultural Vocal Bursts Intensity Prediction, Information Retrieval | Code Available | 2 |
| Pretrained Transformers for Text Ranking: BERT and Beyond | Oct 13, 2020 | Information Retrieval, Reranking | Code Available | 2 |
| GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music | Oct 11, 2020 | Information Retrieval, Music Information Retrieval | Code Available | 2 |
| ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT | Apr 27, 2020 | Document Ranking, Information Retrieval | Code Available | 2 |
| OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction | Sep 28, 2019 | Information Retrieval, Question Answering | Code Available | 2 |
| Multi-Interest Network with Dynamic Routing for Recommendation at Tmall | Apr 17, 2019 | Clustering, Information Retrieval | Code Available | 2 |
| TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents | Jan 23, 2019 | Dialogue Generation, Information Retrieval | Code Available | 2 |
| Knowledge Representation Learning: A Quantitative Review | Dec 28, 2018 | General Classification, Information Retrieval | Code Available | 2 |
| CheMatAgent: Enhancing LLMs for Chemistry and Materials Science through Tree-Search Based Tool Learning | Jun 9, 2025 | Information Retrieval | Code Available | 1 |
| REARANK: Reasoning Re-ranking Agent via Reinforcement Learning | May 26, 2025 | Data Augmentation, Information Retrieval | Code Available | 1 |
| Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval | May 26, 2025 | Contrastive Learning, cross-modal alignment | Code Available | 1 |
| POQD: Performance-Oriented Query Decomposer for Multi-vector retrieval | May 25, 2025 | Information Retrieval, RAG | Code Available | 1 |
| DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster Management | May 20, 2025 | Decision Making, Information Retrieval | Code Available | 1 |
| mmRAG: A Modular Benchmark for Retrieval-Augmented Generation over Text, Tables, and Knowledge Graphs | May 16, 2025 | Information Retrieval, Knowledge Graphs | Code Available | 1 |
| ReCDAP: Relation-Based Conditional Diffusion with Attention Pooling for Few-Shot Knowledge Graph Completion | May 12, 2025 | Information Retrieval, Knowledge Graph Completion | Code Available | 1 |
| Exploring $\ell_0$ Sparsification for Inference-free Sparse Retrievers | Apr 21, 2025 | Computational Efficiency, Information Retrieval | Code Available | 1 |
| Template-Based Financial Report Generation in Agentic and Decomposed Information Retrieval | Apr 19, 2025 | Information Retrieval, Retrieval | Code Available | 1 |
| Building Russian Benchmark for Evaluation of Information Retrieval Models | Apr 17, 2025 | Information Retrieval, Retrieval | Code Available | 1 |
| Pneuma: Leveraging LLMs for Tabular Data Representation and Retrieval in an End-to-End System | Apr 12, 2025 | Information Retrieval, RAG | Code Available | 1 |
| Lightweight and Direct Document Relevance Optimization for Generative Information Retrieval | Apr 7, 2025 | Information Retrieval, Natural Questions | Code Available | 1 |
| Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking | Apr 4, 2025 | Document Ranking, Information Retrieval | Code Available | 1 |
| LIRA: A Learning-based Query-aware Partition Framework for Large-scale ANN Search | Mar 30, 2025 | Information Retrieval | Code Available | 1 |
| Narrative Trails: A Method for Coherent Storyline Extraction via Maximum Capacity Path Optimization | Mar 19, 2025 | Information Retrieval | Code Available | 1 |
| Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization | Feb 27, 2025 | Information Retrieval, Knowledge Graphs | Code Available | 1 |
| LLM-QE: Improving Query Expansion by Aligning Large Language Models with Ranking Preferences | Feb 24, 2025 | Hallucination, Information Retrieval | Code Available | 1 |
| Judging the Judges: A Collection of LLM-Generated Relevance Judgements | Feb 19, 2025 | Information Retrieval | Code Available | 1 |
| Towards Text-Image Interleaved Retrieval | Feb 18, 2025 | Information Retrieval, Language Modeling | Code Available | 1 |
| FairDiverse: A Comprehensive Toolkit for Fair and Diverse Information Retrieval Algorithms | Feb 17, 2025 | Diversity, Fairness | Code Available | 1 |
| Syntriever: How to Train Your Retriever with Synthetic Data from LLMs | Feb 6, 2025 | Information Retrieval | Code Available | 1 |
| Scalable-Softmax Is Superior for Attention | Jan 31, 2025 | Information Retrieval, Language Modeling | Code Available | 1 |
| TFLOP: Table Structure Recognition Framework with Layout Pointer Mechanism | Jan 21, 2025 | Information Retrieval | Code Available | 1 |
| MechIR: A Mechanistic Interpretability Framework for Information Retrieval | Jan 17, 2025 | Diagnostic, Information Retrieval | Code Available | 1 |
| kANNolo: Sweet and Smooth Approximate k-Nearest Neighbors Search | Jan 10, 2025 | Information Retrieval, Quantization | Code Available | 1 |
| Length-Aware DETR for Robust Moment Retrieval | Dec 30, 2024 | Information Retrieval, Moment Retrieval | Code Available | 1 |