| DeepSeek-V3 Technical Report | Dec 27, 2024 | GPU, Language Modeling | Code Available | 16 | 5 |
| SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion | Mar 14, 2025 | Language Modeling, Language Modelling | Code Available | 15 | 5 |
| Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs | Jun 17, 2024 | Language Modeling, Language Modelling | Code Available | 14 | 5 |
| Pixtral 12B | Oct 9, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | May 6, 2024 | Bug fixing, Language Modeling | Code Available | 11 | 5 |
| IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | Feb 8, 2025 | Decoder, Language Modeling | Code Available | 11 | 5 |
| Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality | May 31, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Aug 12, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| Scaling Synthetic Data Creation with 1,000,000,000 Personas | Jun 28, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | Feb 25, 2025 | Diversity, Language Modeling | Code Available | 11 | 5 |
| TinyLlama: An Open-Source Small Language Model | Jan 4, 2024 | Computational Efficiency, Language Modeling | Code Available | 11 | 5 |
| DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | Jan 25, 2024 | Code Generation, Language Modeling | Code Available | 11 | 5 |
| JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Nov 12, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Jun 10, 2024 | CPU, Language Modeling | Code Available | 9 | 5 |
| s1: Simple test-time scaling | Jan 31, 2025 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models | Apr 11, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| OpenELM: An Efficient Language Model Family with Open Training and Inference Framework | Apr 22, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| RWKV-7 "Goose" with Expressive Dynamic State Evolution | Mar 18, 2025 | In-Context Learning, Language Modeling | Code Available | 9 | 5 |
| Language agents achieve superhuman synthesis of scientific knowledge | Sep 10, 2024 | Articles, Information Retrieval | Code Available | 9 | 5 |
| Natural language guidance of high-fidelity text-to-speech with synthetic annotations | Feb 2, 2024 | In-Context Learning, Language Modeling | Code Available | 9 | 5 |
| CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion | May 26, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Jun 17, 2024 | 16k, Language Modeling | Code Available | 9 | 5 |
| Moshi: a speech-text foundation model for real-time dialogue | Sep 17, 2024 | Action Detection, Activity Detection | Code Available | 9 | 5 |
| OLMo: Accelerating the Science of Language Models | Feb 1, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding | Jul 14, 2025 | Code Generation, Language Modeling | Code Available | 9 | 5 |