| ThunderKittens: Simple, Fast, and Adorable AI Kernels | Oct 27, 2024 | GPU, State Space Models | Code Available | 7 |
| Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning | Feb 20, 2025 | Math, reinforcement-learning | Code Available | 7 |
| InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity | Mar 20, 2025 | Image Generation | Code Available | 7 |
| A Scalable Approach to Clustering Embedding Projections | Apr 9, 2025 | Clustering, Density Estimation | Code Available | 7 |
| Real-Time Video Generation with Pyramid Attention Broadcast | Aug 22, 2024 | Video Generation | Code Available | 7 |
| Stable Audio Open | Jul 19, 2024 | Audio Generation, Text-to-Music Generation | Code Available | 7 |
| OpenThoughts: Data Recipes for Reasoning Models | Jun 4, 2025 | Math | Code Available | 7 |
| Training AI to be Loyal | Jan 27, 2025 | | Code Available | 7 |
| CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models | Apr 19, 2024 | | Code Available | 7 |
| Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers | May 27, 2025 | | Code Available | 7 |
| MoBA: Mixture of Block Attention for Long-Context LLMs | Feb 18, 2025 | Mixture-of-Experts | Code Available | 7 |
| O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? | Nov 25, 2024 | Hallucination, Knowledge Distillation | Code Available | 7 |
| D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement | Oct 17, 2024 | GPU, Real-Time Object Detection | Code Available | 7 |
| pySLAM: An Open-Source, Modular, and Extensible Framework for SLAM | Feb 17, 2025 | Depth Estimation, Depth Prediction | Code Available | 7 |
| Exploring Compressed Image Representation as a Perceptual Proxy: A Study | Jan 14, 2024 | Image Compression, Perceptual Distance | Code Available | 7 |
| Practical Efficiency of Muon for Pretraining | May 4, 2025 | | Code Available | 7 |
| Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models | May 6, 2023 | Math | Code Available | 7 |
| Low-code LLM: Graphical User Interface over Large Language Models | Apr 17, 2023 | Prompt Engineering | Code Available | 7 |
| O1 Replication Journey: A Strategic Progress Report -- Part 1 | Oct 8, 2024 | Math, scientific discovery | Code Available | 7 |
| Large Concept Models: Language Modeling in a Sentence Representation Space | Dec 11, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance | Jan 16, 2024 | In-Context Learning | Code Available | 7 |
| 3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting | Dec 17, 2024 | 3DGS, Novel View Synthesis | Code Available | 7 |
| Scalable MatMul-free Language Modeling | Jun 4, 2024 | GPU, Language Modeling | Code Available | 7 |
| Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving | Jun 24, 2024 | CPU, GPU | Code Available | 7 |
| Seed-TTS: A Family of High-Quality Versatile Speech Generation Models | Jun 4, 2024 | In-Context Learning, Language Modelling | Code Available | 7 |
| MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark | Jun 5, 2025 | Rhythm, Spoken Language Understanding | Code Available | 7 |
| EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Jan 26, 2024 | Code Generation, Instruction Following | Code Available | 7 |
| The Prompt Report: A Systematic Survey of Prompting Techniques | Jun 6, 2024 | Prompt Engineering, Survey | Code Available | 7 |
| Qwen2.5-Omni Technical Report | Mar 26, 2025 | Automatic Speech Recognition (ASR), GSM8K | Code Available | 7 |
| Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large-Scale Recommendation | Mar 1, 2024 | | Code Available | 7 |
| Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems | Mar 31, 2025 | AutoML, Continual Learning | Code Available | 7 |
| Labeling supervised fine-tuning data with the scaling law | May 5, 2024 | coreference-resolution, Coreference Resolution | Code Available | 7 |
| A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models | Jan 21, 2025 | RAG, Retrieval | Code Available | 7 |
| When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models | May 16, 2024 | In-Context Learning, Question Answering | Code Available | 7 |
| DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines | Oct 5, 2023 | Language Modeling, Language Modelling | Code Available | 7 |
| TotalSegmentator MRI: Robust Sequence-independent Segmentation of Multiple Anatomic Structures in MRI | May 29, 2024 | MRI segmentation | Code Available | 7 |
| RouteLLM: Learning to Route LLMs with Preference Data | Jun 26, 2024 | Data Augmentation, Transfer Learning | Code Available | 7 |
| InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation | Apr 3, 2024 | Image Generation, Text to Image Generation | Code Available | 7 |
| YOLOv12: Attention-Centric Real-Time Object Detectors | Feb 18, 2025 | GPU, Object | Code Available | 7 |
| Long-form music generation with latent diffusion | Apr 16, 2024 | Audio Generation, Form | Code Available | 7 |
| LLM-AutoDiff: Auto-Differentiate Any LLM Workflow | Jan 28, 2025 | Prompt Engineering, Question Answering | Code Available | 7 |
| Global Structure-from-Motion Revisited | Jul 29, 2024 | 16k | Code Available | 7 |
| Revisiting Feature Prediction for Learning Visual Representations from Video | Feb 15, 2024 | Prediction | Code Available | 7 |
| Fast Text-to-Audio Generation with Adversarial Post-Training | May 13, 2025 | ARC, Audio Generation | Code Available | 7 |
| GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot | Dec 3, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 7 |
| V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning | Jun 11, 2025 | Action Anticipation, Large Language Model | Code Available | 7 |
| MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention | Jun 16, 2025 | Mixture-of-Experts, Reinforcement Learning (RL) | Code Available | 7 |
| Flow Matching Guide and Code | Dec 9, 2024 | Text Generation | Code Available | 7 |
| Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads | Jan 19, 2024 | | Code Available | 7 |
| ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI | Oct 1, 2024 | GPU, Imitation Learning | Code Available | 7 |