| TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions | Mar 27, 2024 | Hint Generation, Information Retrieval | Code Available | 1 |
| RankMamba: Benchmarking Mamba's Document Ranking Performance in the Era of Transformers | Mar 27, 2024 | Benchmarking, Document Ranking | Code Available | 1 |
| DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification | Mar 24, 2024 | Audio Classification, Information Retrieval | Code Available | 1 |
| AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation | Mar 5, 2024 | Articles, Decision Making | Code Available | 1 |
| Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding | Feb 28, 2024 | document understanding, Information Retrieval | Code Available | 1 |
| Corpus-Steered Query Expansion with Large Language Models | Feb 28, 2024 | Information Retrieval, Retrieval | Code Available | 1 |
| Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey | Feb 27, 2024 | Information Retrieval, Music Generation | Code Available | 1 |
| ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval | Feb 23, 2024 | Cross-Lingual Transfer, Information Retrieval | Code Available | 1 |
| Self-Retrieval: End-to-End Information Retrieval with One Large Language Model | Feb 23, 2024 | Information Retrieval, Language Modeling | Code Available | 1 |
| INSTRUCTIR: A Benchmark for Instruction Following of Information Retrieval Models | Feb 22, 2024 | Information Retrieval, Instruction Following | Code Available | 1 |
| MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music | Feb 15, 2024 | Information Retrieval, Music Information Retrieval | Code Available | 1 |
| ExaRanker-Open: Synthetic Explanation for IR using Open-Source LLMs | Feb 9, 2024 | Data Augmentation, Information Retrieval | Code Available | 1 |
| The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model | Feb 9, 2024 | Information Retrieval, Language Modelling | Code Available | 1 |
| Enhancing Complex Question Answering over Knowledge Graphs through Evidence Pattern Retrieval | Feb 3, 2024 | Information Retrieval, Knowledge Graphs | Code Available | 1 |
| History-Aware Conversational Dense Retrieval | Jan 30, 2024 | Conversational Search, Information Retrieval | Code Available | 1 |
| LongHealth: A Question Answering Benchmark with Long Clinical Documents | Jan 25, 2024 | Information Retrieval, Multiple-choice | Code Available | 1 |
| SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval | Jan 24, 2024 | Benchmarking, Image Captioning | Code Available | 1 |
| Exploring the Best Practices of Query Expansion with Large Language Models | Jan 12, 2024 | Information Retrieval, Re-Ranking | Code Available | 1 |
| Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports | Jan 3, 2024 | Action Understanding, counterfactual | Code Available | 1 |
| CaseGNN: Graph Neural Networks for Legal Case Retrieval with Text-Attributed Graphs | Dec 18, 2023 | Graph Attention, Information Retrieval | Code Available | 1 |
| Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-hoc Retrieval | Dec 17, 2023 | Articles, Information Retrieval | Code Available | 1 |
| Extending Context Window of Large Language Models via Semantic Compression | Dec 15, 2023 | Few-Shot Learning, Information Retrieval | Code Available | 1 |
| MUST: An Effective and Scalable Framework for Multimodal Search of Target Modality | Dec 11, 2023 | Information Retrieval | Code Available | 1 |
| LLF-Bench: Benchmark for Interactive Learning from Language Feedback | Dec 11, 2023 | Information Retrieval, OpenAI Gym | Code Available | 1 |
| mir_ref: A Representation Evaluation Framework for Music Information Retrieval Tasks | Dec 10, 2023 | Information Retrieval, Music Information Retrieval | Code Available | 1 |
| ESPN: Memory-Efficient Multi-Vector Information Retrieval | Dec 9, 2023 | Information Retrieval, Re-Ranking | Code Available | 1 |
| A Two-Stage Adaptation of Large Language Models for Text Ranking | Nov 28, 2023 | Decoder, Information Retrieval | Code Available | 1 |
| IterCQR: Iterative Conversational Query Reformulation with Retrieval Guidance | Nov 16, 2023 | Conversational Search, Information Retrieval | Code Available | 1 |
| Scalable and Effective Generative Information Retrieval | Nov 15, 2023 | Information Retrieval, Retrieval | Code Available | 1 |
| Neural Retrievers are Biased Towards LLM-Generated Content | Oct 31, 2023 | Information Retrieval, Retrieval | Code Available | 1 |
| Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents | Oct 30, 2023 | Information Retrieval, MTEB Benchmark | Code Available | 1 |
| Poisoning Retrieval Corpora by Injecting Adversarial Passages | Oct 29, 2023 | Information Retrieval, Natural Questions | Code Available | 1 |
| Attention Lens: A Tool for Mechanistically Interpreting the Attention Head Information Retrieval Mechanism | Oct 25, 2023 | Information Retrieval, Retrieval | Code Available | 1 |
| A Comprehensive Python Library for Deep Learning-Based Event Detection in Multivariate Time Series Data and Information Retrieval in NLP | Oct 25, 2023 | Binary Classification, Deep Learning | Code Available | 1 |
| Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature | Oct 24, 2023 | Abstractive Text Summarization, Information Retrieval | Code Available | 1 |
| BioImage.IO Chatbot: A Community-Driven AI Assistant for Integrative Computational Bioimaging | Oct 23, 2023 | Chatbot, Information Retrieval | Code Available | 1 |
| Open-source Large Language Models are Strong Zero-shot Query Likelihood Models for Document Ranking | Oct 20, 2023 | Document Ranking, Information Retrieval | Code Available | 1 |
| A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction | Oct 18, 2023 | Information Retrieval, Legal Reasoning | Code Available | 1 |
| Leveraging Large Language Models for Node Generation in Few-Shot Learning on Text-Attributed Graphs | Oct 15, 2023 | Few-Shot Learning, Graph Learning | Code Available | 1 |
| Language Models As Semantic Indexers | Oct 11, 2023 | Contrastive Learning, Information Retrieval | Code Available | 1 |
| Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators | Oct 11, 2023 | Information Retrieval, Informativeness | Code Available | 1 |
| LAiW: A Chinese Legal Large Language Models Benchmark | Oct 9, 2023 | Information Retrieval | Code Available | 1 |
| EMelodyGen: Emotion-Conditioned Melody Generation in ABC Notation with the Musical Feature Template | Sep 23, 2023 | Data Augmentation, Emotion Classification | Code Available | 1 |
| Symbolic Music Representations for Classification Tasks: A Systematic Evaluation | Sep 5, 2023 | Classification, Information Retrieval | Code Available | 1 |
| YAGO 4.5: A Large and Clean Knowledge Base with a Rich Taxonomy | Aug 23, 2023 | Information Retrieval, Retrieval | Code Available | 1 |
| RaLLe: A Framework for Developing and Evaluating Retrieval-Augmented Large Language Models | Aug 21, 2023 | Information Retrieval, Question Answering | Code Available | 1 |
| Taken by Surprise: Contrast effect for Similarity Scores | Aug 18, 2023 | Classification, Document Classification | Code Available | 1 |
| HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution | Jul 31, 2023 | Information Retrieval, Informativeness | Code Available | 1 |
| Med-HALT: Medical Domain Hallucination Test for Large Language Models | Jul 28, 2023 | Hallucination, Information Retrieval | Code Available | 1 |
| Zero-note samba: Self-supervised beat tracking | Jul 21, 2023 | Beat Tracking, Information Retrieval | Code Available | 1 |