| DeepSeek-V3 Technical Report | Dec 27, 2024 | GPU, Language Modeling | Code Available | 16 |
| SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion | Mar 14, 2025 | Language Modeling | Code Available | 15 |
| Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs | Jun 17, 2024 | Language Modeling | Code Available | 14 |
| olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | Feb 25, 2025 | Diversity, Language Modeling | Code Available | 11 |
| IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | Feb 8, 2025 | Decoder, Language Modeling | Code Available | 11 |
| JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Nov 12, 2024 | Language Modeling | Code Available | 11 |
| Pixtral 12B | Oct 9, 2024 | Language Modeling | Code Available | 11 |
| The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Aug 12, 2024 | Language Modeling | Code Available | 11 |
| Scaling Synthetic Data Creation with 1,000,000,000 Personas | Jun 28, 2024 | Language Modeling | Code Available | 11 |
| Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality | May 31, 2024 | Language Modeling | Code Available | 11 |