| MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark | Jun 5, 2025 | Rhythm, Spoken Language Understanding | Code Available | 7 |
| Pre^3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation | Jun 4, 2025 | | Code Available | 7 |
| OpenThoughts: Data Recipes for Reasoning Models | Jun 4, 2025 | Math | Code Available | 7 |
| AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | May 30, 2025 | GPU, Math | Code Available | 7 |
| Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation | May 28, 2025 | Human Animation, Instruction Following | Code Available | 7 |
| HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer | May 28, 2025 | Image Generation, Mixture-of-Experts | Code Available | 7 |
| Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers | May 27, 2025 | | Code Available | 7 |
| SageAttention2++: A More Efficient Implementation of SageAttention2 | May 27, 2025 | Quantization, Video Generation | Code Available | 7 |
| HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters | May 26, 2025 | Human Animation | Code Available | 7 |
| SEW: Self-Evolving Agentic Workflows for Automated Code Generation | May 24, 2025 | Code Generation | Code Available | 7 |
| AI-Researcher: Autonomous Scientific Innovation | May 24, 2025 | scientific discovery | Code Available | 7 |
| Speechless: Speech Instruction Training Without Speech for Low Resource Languages | May 23, 2025 | speech-recognition, Speech Recognition | Code Available | 7 |
| ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval | May 22, 2025 | Retrieval | Code Available | 7 |
| An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents | May 21, 2025 | Reinforcement Learning (RL) | Code Available | 7 |
| Visual Agentic Reinforcement Fine-Tuning | May 20, 2025 | Image Manipulation | Code Available | 7 |
| Faster Video Diffusion with Trainable Sparse Attention | May 19, 2025 | | Code Available | 7 |
| MAGI-1: Autoregressive Video Generation at Scale | May 19, 2025 | Video Generation | Code Available | 7 |
| Logo-LLM: Local and Global Modeling with Large Language Models for Time Series Forecasting | May 16, 2025 | Time Series, Time Series Forecasting | Code Available | 7 |
| SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training | May 16, 2025 | | Code Available | 7 |
| Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis | May 14, 2025 | Denoising, Depth Estimation | Code Available | 7 |
| Fast Text-to-Audio Generation with Adversarial Post-Training | May 13, 2025 | ARC, Audio Generation | Code Available | 7 |
| HealthBench: Evaluating Large Language Models Towards Improved Human Health | May 13, 2025 | Instruction Following, Multiple-choice | Code Available | 7 |
| Embedding Atlas: Low-Friction, Interactive Embedding Visualization | May 9, 2025 | Friction | Code Available | 7 |
| Flow-GRPO: Training Flow Matching Models via Online RL | May 8, 2025 | Denoising, Diversity | Code Available | 7 |
| Practical Efficiency of Muon for Pretraining | May 4, 2025 | | Code Available | 7 |
| Kimi-Audio Technical Report | Apr 25, 2025 | Audio Question Answering, Question Answering | Code Available | 7 |
| RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning | Apr 24, 2025 | Decision Making, Reinforcement Learning (RL) | Code Available | 7 |
| Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning | Apr 24, 2025 | Code Generation | Code Available | 7 |
| Step1X-Edit: A Practical Framework for General Image Editing | Apr 24, 2025 | Image Editing | Code Available | 7 |
| Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning | Apr 23, 2025 | Multimodal Reasoning, reinforcement-learning | Code Available | 7 |
| TTRL: Test-Time Reinforcement Learning | Apr 22, 2025 | Math, reinforcement-learning | Code Available | 7 |
| PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding | Apr 17, 2025 | Video Question Answering, Video Understanding | Code Available | 7 |
| Chinese-Vicuna: A Chinese Instruction-following Llama-based Model | Apr 17, 2025 | Code Generation, CPU | Code Available | 7 |
| BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents | Apr 16, 2025 | | Code Available | 7 |
| Aligning Anime Video Generation with Human Feedback | Apr 14, 2025 | Video Generation | Code Available | 7 |
| The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search | Apr 10, 2025 | scientific discovery | Code Available | 7 |
| A Scalable Approach to Clustering Embedding Projections | Apr 9, 2025 | Clustering, Density Estimation | Code Available | 7 |
| Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought | Apr 8, 2025 | Language Modeling, Language Modelling | Code Available | 7 |
| Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems | Mar 31, 2025 | AutoML, Continual Learning | Code Available | 7 |
| Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model | Mar 31, 2025 | | Code Available | 7 |
| Large Language Model Agent: A Survey on Methodology, Applications and Challenges | Mar 27, 2025 | Language Modeling, Language Modelling | Code Available | 7 |
| Open Deep Search: Democratizing Search with Open-source Reasoning Agents | Mar 26, 2025 | 10-shot image generation | Code Available | 7 |
| Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization | Mar 26, 2025 | CPU, GPU | Code Available | 7 |
| Qwen2.5-Omni Technical Report | Mar 26, 2025 | Automatic Speech Recognition (ASR), GSM8K | Code Available | 7 |
| Scaling Vision Pre-Training to 4K Resolution | Mar 25, 2025 | 4k, Contrastive Learning | Code Available | 7 |
| SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild | Mar 24, 2025 | Instruction Following, Math | Code Available | 7 |
| Enhancing Fourier Neural Operators with Local Spatial Features | Mar 22, 2025 | Computational Efficiency | Code Available | 7 |
| InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity | Mar 20, 2025 | Image Generation | Code Available | 7 |
| xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference | Mar 17, 2025 | Mamba, Math | Code Available | 7 |
| LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds | Mar 13, 2025 | 3D Human Reconstruction | Code Available | 7 |