| DebateSum: A large-scale argument mining and summarization dataset | Nov 14, 2020 | Abstractive Text Summarization, Argument Mining | Code Available | 1 |
| ExaRanker-Open: Synthetic Explanation for IR using Open-Source LLMs | Feb 9, 2024 | Data Augmentation, Information Retrieval | Code Available | 1 |
| Decomposed Prompting: A Modular Approach for Solving Complex Tasks | Oct 5, 2022 | Information Retrieval, Retrieval | Code Available | 1 |
| Extending Context Window of Large Language Models via Semantic Compression | Dec 15, 2023 | Few-Shot Learning, Information Retrieval | Code Available | 1 |
| FairDiverse: A Comprehensive Toolkit for Fair and Diverse Information Retrieval Algorithms | Feb 17, 2025 | Diversity, Fairness | Code Available | 1 |
| Fast k-NN Graph Construction by GPU based NN-Descent | Oct 30, 2021 | CPU, GPU | Code Available | 1 |
| DataFinder: Scientific Dataset Recommendation from Natural Language Descriptions | May 26, 2023 | Information Retrieval, Retrieval | Code Available | 1 |
| Dealing with Typos for BERT-based Passage Retrieval and Ranking | Aug 27, 2021 | Information Retrieval, Language Modeling | Code Available | 1 |
| Deep Declarative Dynamic Time Warping for End-to-End Learning of Alignment Paths | Mar 19, 2023 | Dynamic Time Warping, Information Retrieval | Code Available | 1 |
| Cross-Thought for Sentence Encoder Pre-training | Oct 7, 2020 | Information Retrieval, Language Modeling | Code Available | 1 |
| Cross-domain Retrieval in the Legal and Patent Domains: a Reproducibility Study | Dec 21, 2020 | Information Retrieval, Language Modelling | Code Available | 1 |
| CSFCube -- A Test Collection of Computer Science Research Articles for Faceted Query by Example | Mar 24, 2021 | Articles, Information Retrieval | Code Available | 1 |
| CoRT: Complementary Rankings from Transformers | Oct 20, 2020 | Information Retrieval, Passage Retrieval | Code Available | 1 |
| CREPE: A Convolutional Representation for Pitch Estimation | Feb 17, 2018 | Information Retrieval, Music Information Retrieval | Code Available | 1 |
| GitTables: A Large-Scale Corpus of Relational Tables | Jun 14, 2021 | Information Retrieval, Table annotation | Code Available | 1 |
| GPU-based Private Information Retrieval for On-Device Machine Learning Inference | Jan 26, 2023 | CPU, GPU | Code Available | 1 |
| Grep-BiasIR: A Dataset for Investigating Gender Representation-Bias in Information Retrieval Results | Jan 19, 2022 | Information Retrieval, Retrieval | Code Available | 1 |
| C-STS: Conditional Semantic Textual Similarity | May 24, 2023 | Information Retrieval, Language Model Evaluation | Code Available | 1 |
| Hengam: An Adversarially Trained Transformer for Persian Temporal Tagging | Nov 20, 2022 | Information Retrieval, Named Entity Recognition (NER) | Code Available | 1 |
| Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding | Feb 28, 2024 | document understanding, Information Retrieval | Code Available | 1 |
| Deeper Convolutional Neural Networks and Broad Augmentation Policies Improve Performance in Musical Key Estimation | Nov 7, 2021 | image-classification, Information Retrieval | Code Available | 1 |
| HPI-DHC at TREC 2018 Precision Medicine Track | Nov 14, 2018 | Articles, Document Classification | Code Available | 1 |
| Distilling Knowledge from Reader to Retriever for Question Answering | Dec 8, 2020 | Information Retrieval, Knowledge Distillation | Code Available | 1 |
| Conversational Document Prediction to Assist Customer Care Agents | Oct 5, 2020 | Information Retrieval, Prediction | Code Available | 1 |
| Conversational Entity Linking: Problem Definition and Datasets | May 11, 2021 | Entity Linking, Information Retrieval | Code Available | 1 |
| Contextualized Sparse Representations for Real-Time Open-Domain Question Answering | Nov 7, 2019 | Information Retrieval, Open-Domain Question Answering | Code Available | 1 |
| A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search | Jan 29, 2021 | Information Retrieval, Recommendation Systems | Code Available | 1 |
| Contrastive Audio-Language Learning for Music | Aug 25, 2022 | Audio to Text Retrieval, Descriptive | Code Available | 1 |
| Is my automatic audio captioning system so bad? spider-max: a metric to consider several caption candidates | Nov 14, 2022 | AudioCaps, Audio captioning | Code Available | 1 |
| Conversational Question Answering over Passages by Leveraging Word Proximity Networks | Apr 27, 2020 | Conversational Question Answering, Information Retrieval | Code Available | 1 |
| A multi-task semi-supervised framework for Text2Graph & Graph2Text | Feb 12, 2022 | Information Retrieval, Retrieval | Code Available | 1 |
| Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents | Oct 30, 2023 | Information Retrieval, MTEB Benchmark | Code Available | 1 |
| Judging the Judges: A Collection of LLM-Generated Relevance Judgements | Feb 19, 2025 | Information Retrieval | Code Available | 1 |
| kANNolo: Sweet and Smooth Approximate k-Nearest Neighbors Search | Jan 10, 2025 | Information Retrieval, Quantization | Code Available | 1 |
| A Comprehensive Python Library for Deep Learning-Based Event Detection in Multivariate Time Series Data and Information Retrieval in NLP | Oct 25, 2023 | Binary Classification, Deep Learning | Code Available | 1 |
| ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering | Jun 7, 2024 | Information Retrieval, Question Answering | Code Available | 1 |
| Adaptive Machine Translation with Large Language Models | Jan 30, 2023 | Decoder, Domain Adaptation | Code Available | 1 |
| ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval | Feb 23, 2024 | Cross-Lingual Transfer, Information Retrieval | Code Available | 1 |
| A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction | Oct 18, 2023 | Information Retrieval, Legal Reasoning | Code Available | 1 |
| Complex Knowledge Base Question Answering: A Survey | Aug 15, 2021 | Information Retrieval, Knowledge Base Question Answering | Code Available | 1 |
| Learning Discrete Representations via Constrained Clustering for Effective and Efficient Dense Retrieval | Oct 12, 2021 | Clustering, Constrained Clustering | Code Available | 1 |
| Learning Hierarchical Metrical Structure Beyond Measures | Sep 21, 2022 | Information Retrieval, Music Information Retrieval | Code Available | 1 |
| CONCRETE: Improving Cross-lingual Fact-checking with Cross-lingual Retrieval | Sep 5, 2022 | Cross-lingual Fact-checking, Cross-Lingual Information Retrieval | Code Available | 1 |
| CORD-19: The COVID-19 Open Research Dataset | Apr 22, 2020 | Information Retrieval, Management | Code Available | 1 |
| Learning To Generate Piano Music With Sustain Pedals | Nov 1, 2021 | Decoder, Information Retrieval | Code Available | 1 |
| Learning To Retrieve: How to Train a Dense Retrieval Model Effectively and Efficiently | Oct 20, 2020 | Information Retrieval, Passage Retrieval | Code Available | 1 |
| A Data-Driven Methodology for Considering Feasibility and Pairwise Likelihood in Deep Learning Based Guitar Tablature Transcription Systems | Apr 17, 2022 | Information Retrieval, Music Information Retrieval | Code Available | 1 |
| Lightweight and Direct Document Relevance Optimization for Generative Information Retrieval | Apr 7, 2025 | Information Retrieval, Natural Questions | Code Available | 1 |
| LLF-Bench: Benchmark for Interactive Learning from Language Feedback | Dec 11, 2023 | Information Retrieval, OpenAI Gym | Code Available | 1 |
| COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List | Apr 15, 2021 | Information Retrieval, Retrieval | Code Available | 1 |