| OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations | Dec 10, 2024 | Attribute, Benchmarking | Code Available | 5 |
| KBLaM: Knowledge Base augmented Language Model | Oct 14, 2024 | 8k, GPU | Code Available | 5 |
| RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation | Aug 15, 2024 | Diagnostic, RAG | Code Available | 5 |
| Retrieval-Augmented Generation for AI-Generated Content: A Survey | Feb 29, 2024 | Information Retrieval, Large Language Model | Code Available | 5 |
| A Survey of LLM × DATA | May 24, 2025 | Large Language Model, Management | Code Available | 4 |
| SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis | May 22, 2025 | Diversity, Information Retrieval | Code Available | 4 |
| R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | May 22, 2025 | Memorization, RAG | Code Available | 4 |
| s3: You Don't Need That Much Data to Train a Search Agent via RL | May 20, 2025 | RAG, Reinforcement Learning (RL) | Code Available | 4 |
| OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | May 12, 2025 | GPU, Privacy Preserving | Code Available | 4 |
| DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | Apr 4, 2025 | Navigate, Prompt Engineering | Code Available | 4 |