| Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | Mar 12, 2025 | Denoising, Language Modeling | Code Available | 4 |
| R1-Onevision: An Open-Source Multimodal Large Language Model Capable of Deep Reasoning | Feb 24, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Feb 10, 2025 | Hierarchical Reinforcement Learning, Language Modeling | Code Available | 4 |
| Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM | Feb 10, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach | Feb 7, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models | Jan 31, 2025 | Caption Generation, Language Modeling | Code Available | 4 |
| Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Jan 16, 2025 | Causal Inference, counterfactual | Code Available | 4 |
| Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding | Jan 14, 2025 | Embodied Question Answering, Hallucination | Code Available | 4 |
| Training Software Engineering Agents and Verifiers with SWE-Gym | Dec 30, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| LLM4AD: A Platform for Algorithm Design with Large Language Model | Dec 23, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator | Dec 16, 2024 | GSM8K, Language Modeling | Code Available | 4 |
| Gated Delta Networks: Improving Mamba2 with Delta Rule | Dec 9, 2024 | Common Sense Reasoning, Language Modeling | Code Available | 4 |
| Liquid: Language Models are Scalable Multi-modal Generators | Dec 5, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation | Nov 7, 2024 | Contrastive Learning, Image Captioning | Code Available | 4 |
| MutaPLM: Protein Language Modeling for Mutation Explanation and Engineering | Oct 30, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| SNAC: Multi-Scale Neural Audio Codec | Oct 18, 2024 | Audio Compression, Audio Generation | Code Available | 4 |
| Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration | Oct 3, 2024 | Diversity, Language Modeling | Code Available | 4 |
| Data-Prep-Kit: getting your data ready for LLM application development | Sep 26, 2024 | CPU, Language Modeling | Code Available | 4 |
| Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding | Sep 22, 2024 | Anomaly Detection, GPU | Code Available | 4 |
| Large Language Model-Based Agents for Software Engineering: A Survey | Sep 4, 2024 | AI Agent, Language Modeling | Code Available | 4 |
| OLMoE: Open Mixture-of-Experts Language Models | Sep 3, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search | Aug 15, 2024 | Automated Theorem Proving, Language Modeling | Code Available | 4 |
| Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation | Aug 8, 2024 | Chunking, Fact Checking | Code Available | 4 |
| The Llama 3 Herd of Models | Jul 31, 2024 | answerability prediction, Language Modeling | Code Available | 4 |
| When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments | Jul 15, 2024 | Language Modeling, Language Modelling | Code Available | 4 |