| OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations | Dec 10, 2024 | Attribute, Benchmarking | Code Available | 5 |
| RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation | Aug 15, 2024 | Diagnostic, RAG | Code Available | 5 |
| Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG | Jan 15, 2025 | Natural Language Understanding, RAG | Code Available | 5 |
| Search-o1: Agentic Search-Enhanced Large Reasoning Models | Jan 9, 2025 | Code Generation | Code Available | 5 |
| MiniRAG: Towards Extremely Simple Retrieval-Augmented Generation | Jan 12, 2025 | RAG, Retrieval | Code Available | 5 |
| RAG-R1: Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism | Jun 30, 2025 | Question Answering, RAG | Code Available | 5 |
| TrustRAG: An Information Assistant with Retrieval Augmented Generation | Feb 19, 2025 | Answer Generation, Chunking | Code Available | 5 |
| Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks | Dec 20, 2024 | All, RAG | Code Available | 5 |
| KBLaM: Knowledge Base augmented Language Model | Oct 14, 2024 | 8k, GPU | Code Available | 5 |
| Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | May 22, 2020 | Fact Verification, Question Answering | Code Available | 4 |
| Retrieval-Augmented Generation for Large Language Models: A Survey | Dec 18, 2023 | Hallucination, RAG | Code Available | 4 |
| Benchmarking Retrieval-Augmented Generation for Medicine | Feb 20, 2024 | Benchmarking, Information Retrieval | Code Available | 4 |
| Retrieval-Augmented Generation with Hierarchical Knowledge | Mar 13, 2025 | Multi-hop Question Answering, Question Answering | Code Available | 4 |
| Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation | Feb 4, 2025 | Benchmarking, Information Retrieval | Code Available | 4 |
| Data-Prep-Kit: getting your data ready for LLM application development | Sep 26, 2024 | CPU, Language Modeling | Code Available | 4 |
| COS-Mix: Cosine Similarity and Distance Fusion for Improved Information Retrieval | Jun 2, 2024 | Information Retrieval, RAG | Code Available | 4 |
| Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence | Jul 9, 2024 | Retrieval-augmented Generation | Code Available | 4 |
| ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding | Jan 14, 2025 | RAG, Retrieval | Code Available | 4 |
| s3: You Don't Need That Much Data to Train a Search Agent via RL | May 20, 2025 | RAG, Reinforcement Learning (RL) | Code Available | 4 |
| Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization | Apr 2, 2024 | RAG, Retrieval | Code Available | 4 |
| R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | May 22, 2025 | Memorization, RAG | Code Available | 4 |
| A Survey of LLM × DATA | May 24, 2025 | Large Language Model, Management | Code Available | 4 |
| Generative Representational Instruction Tuning | Feb 15, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | May 12, 2025 | GPU, Privacy Preserving | Code Available | 4 |
| Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions | Aug 1, 2024 | Medical Question Answering, MedQA | Code Available | 4 |