| DeepSeek-V3 Technical Report | Dec 27, 2024 | GPU, Language Modeling | Code Available | 16 | 5 |
| SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion | Mar 14, 2025 | Language Modeling, Language Modelling | Code Available | 15 | 5 |
| Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs | Jun 17, 2024 | Language Modeling, Language Modelling | Code Available | 14 | 5 |
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | May 6, 2024 | Bug fixing, Language Modeling | Code Available | 11 | 5 |
| JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Nov 12, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| Scaling Synthetic Data Creation with 1,000,000,000 Personas | Jun 28, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| TinyLlama: An Open-Source Small Language Model | Jan 4, 2024 | Computational Efficiency, Language Modeling | Code Available | 11 | 5 |
| Pixtral 12B | Oct 9, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | Feb 25, 2025 | Diversity, Language Modeling | Code Available | 11 | 5 |
| IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | Feb 8, 2025 | Decoder, Language Modeling | Code Available | 11 | 5 |
| The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Aug 12, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | Jan 25, 2024 | Code Generation, Language Modeling | Code Available | 11 | 5 |
| Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality | May 31, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Jun 10, 2024 | CPU, Language Modeling | Code Available | 9 | 5 |
| RWKV-7 "Goose" with Expressive Dynamic State Evolution | Mar 18, 2025 | In-Context Learning, Language Modeling | Code Available | 9 | 5 |
| LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Mar 26, 2024 | GPU, GSM8K | Code Available | 9 | 5 |
| Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding | Jul 14, 2025 | Code Generation, Language Modeling | Code Available | 9 | 5 |
| OpenELM: An Efficient Language Model Family with Open Training and Inference Framework | Apr 22, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| OLMo: Accelerating the Science of Language Models | Feb 1, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild | Mar 25, 2024 | Decoder, Language Modeling | Code Available | 9 | 5 |
| Moshi: a speech-text foundation model for real-time dialogue | Sep 17, 2024 | Action Detection, Activity Detection | Code Available | 9 | 5 |
| s1: Simple test-time scaling | Jan 31, 2025 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| Natural language guidance of high-fidelity text-to-speech with synthetic annotations | Feb 2, 2024 | In-Context Learning, Language Modeling | Code Available | 9 | 5 |
| VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Apr 10, 2025 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available | 9 | 5 |
| DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | May 7, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| Visually Descriptive Language Model for Vector Graphics Reasoning | Apr 9, 2024 | Descriptive, Language Modeling | Code Available | 9 | 5 |
| CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion | May 26, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model | Jun 7, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Jun 17, 2024 | 16k, Language Modeling | Code Available | 9 | 5 |
| Arcee's MergeKit: A Toolkit for Merging Large Language Models | Mar 20, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| Yi: Open Foundation Models by 01.AI | Mar 7, 2024 | Attribute, Chatbot | Code Available | 9 | 5 |
| Language agents achieve superhuman synthesis of scientific knowledge | Sep 10, 2024 | Articles, Information Retrieval | Code Available | 9 | 5 |
| Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models | Apr 11, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| Perception Encoder: The best visual embeddings are not at the output of the network | Apr 17, 2025 | Depth Estimation, Language Modeling | Code Available | 8 | 5 |
| Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition | Jul 17, 2023 | Decoder, Language Modeling | Code Available | 8 | 5 |
| Large Language Model Agent: A Survey on Methodology, Applications and Challenges | Mar 27, 2025 | Language Modeling, Language Modelling | Code Available | 7 | 5 |
| AutoTrain: No-code training for state-of-the-art models | Oct 21, 2024 | Classification, image-classification | Code Available | 7 | 5 |
| AudioLM: a Language Modeling Approach to Audio Generation | Sep 7, 2022 | Audio Generation | Code Available | 7 | 5 |
| Chronos: Learning the Language of Time Series | Mar 12, 2024 | Gaussian Processes, Language Modeling | Code Available | 7 | 5 |
| MagicQuill: An Intelligent Interactive Image Editing System | Nov 14, 2024 | Language Modeling, Language Modelling | Code Available | 7 | 5 |
| FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving | Nov 27, 2024 | Fairness, GPU | Code Available | 7 | 5 |
| Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model | Jun 10, 2025 | Language Modeling, Language Modelling | Code Available | 7 | 5 |
| PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation | Jan 20, 2025 | Language Modeling, Language Modelling | Code Available | 7 | 5 |
| Dynamic data sampler for cross-language transfer learning in large language models | May 17, 2024 | Language Modeling, Language Modelling | Code Available | 7 | 5 |
| DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines | Oct 5, 2023 | Language Modeling, Language Modelling | Code Available | 7 | 5 |
| EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Jan 26, 2024 | Code Generation, Instruction Following | Code Available | 7 | 5 |
| Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | Jan 5, 2023 | In-Context Learning, Language Modeling | Code Available | 7 | 5 |
| Large Concept Models: Language Modeling in a Sentence Representation Space | Dec 11, 2024 | Language Modeling, Language Modelling | Code Available | 7 | 5 |
| Labeling supervised fine-tuning data with the scaling law | May 5, 2024 | coreference-resolution, Coreference Resolution | Code Available | 7 | 5 |