| DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations | Jan 23, 2025 | | Code Available | 7 |
| M&M VTO: Multi-Garment Virtual Try-On and Editing | Jun 6, 2024 | Denoising, Super-Resolution | Code Available | 7 |
| Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT | Jun 5, 2024 | Image Generation, Point Cloud Generation | Code Available | 7 |
| EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test | Mar 3, 2025 | Prediction | Code Available | 7 |
| LLaMA-Omni: Seamless Speech Interaction with Large Language Models | Sep 10, 2024 | | Code Available | 7 |
| SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Oct 3, 2024 | Image Generation, Quantization | Code Available | 7 |
| Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models | Feb 6, 2023 | Scheduling | Code Available | 7 |
| Is Diversity All You Need for Scalable Robotic Manipulation? | Jul 8, 2025 | All, Diversity | Code Available | 7 |
| VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos | Feb 3, 2025 | Knowledge Graphs, RAG | Code Available | 7 |
| Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation | May 28, 2025 | Human Animation, Instruction Following | Code Available | 7 |
| VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction | Jan 3, 2025 | | Code Available | 7 |
| The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning | Nov 30, 2024 | | Code Available | 7 |
| Agentless: Demystifying LLM-based Software Engineering Agents | Jul 1, 2024 | Program Repair | Code Available | 7 |
| DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers | Mar 15, 2024 | Text Generation, Video Generation | Code Available | 7 |
| SEW: Self-Evolving Agentic Workflows for Automated Code Generation | May 24, 2025 | Code Generation | Code Available | 7 |
| SoftTiger: A Clinical Foundation Model for Healthcare Workflows | Mar 1, 2024 | Language Modelling, Large Language Model | Code Available | 7 |
| SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Feb 8, 2024 | Benchmarking, Diversity | Code Available | 7 |
| AFlow: Automating Agentic Workflow Generation | Oct 14, 2024 | Code Generation | Code Available | 7 |
| Enhancing Fourier Neural Operators with Local Spatial Features | Mar 22, 2025 | Computational Efficiency | Code Available | 7 |
| MambaOut: Do We Really Need Mamba for Vision? | May 13, 2024 | image-classification, Image Classification | Code Available | 7 |
| ComfyUI-R1: Exploring Reasoning Models for Workflow Generation | Jun 11, 2025 | 4k | Code Available | 7 |
| Open Deep Search: Democratizing Search with Open-source Reasoning Agents | Mar 26, 2025 | 10-shot image generation | Code Available | 7 |
| Pyramidal Flow Matching for Efficient Video Generative Modeling | Oct 8, 2024 | GPU, Text-to-Video Generation | Code Available | 7 |
| Speechless: Speech Instruction Training Without Speech for Low Resource Languages | May 23, 2025 | speech-recognition, Speech Recognition | Code Available | 7 |
| From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations | Jan 3, 2024 | Diversity, Quantization | Code Available | 7 |
| Visual-RFT: Visual Reinforcement Fine-Tuning | Mar 3, 2025 | Few-Shot Object Detection, Fine-Grained Image Classification | Code Available | 7 |
| Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | Feb 21, 2022 | Binarization, Model Optimization | Code Available | 7 |
| MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents | Apr 16, 2024 | Fact Checking, Retrieval-augmented Generation | Code Available | 7 |
| Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image | May 30, 2024 | Image to 3D, Single-View 3D Reconstruction | Code Available | 7 |
| TextGrad: Automatic "Differentiation" via Text | Jun 11, 2024 | Question Answering, Specificity | Code Available | 7 |
| Efficient multi-prompt evaluation of LLMs | May 27, 2024 | MMLU | Code Available | 7 |
| TTRL: Test-Time Reinforcement Learning | Apr 22, 2025 | Math, reinforcement-learning | Code Available | 7 |
| Elixir: Train a Large Language Model on a Small GPU Cluster | Dec 10, 2022 | CPU, GPU | Code Available | 7 |
| Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond | Jan 19, 2025 | Deep Learning, Multi-Task Learning | Code Available | 7 |
| PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding | Apr 17, 2025 | Video Question Answering, Video Understanding | Code Available | 7 |
| Tulu 3: Pushing Frontiers in Open Language Model Post-Training | Nov 22, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| Measuring Massive Multitask Chinese Understanding | Apr 25, 2023 | All | Code Available | 7 |
| In-Context LoRA for Diffusion Transformers | Oct 31, 2024 | Image Generation | Code Available | 7 |
| FoundationStereo: Zero-Shot Stereo Matching | Jan 17, 2025 | Depth Estimation, Diversity | Code Available | 7 |
| Mirage: A Multi-Level Superoptimizer for Tensor Programs | May 9, 2024 | GPU, Navigate | Code Available | 7 |
| TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables | Feb 29, 2024 | Time Series, Time Series Forecasting | Code Available | 7 |
| Visual Agentic Reinforcement Fine-Tuning | May 20, 2025 | Image Manipulation | Code Available | 7 |
| Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | May 16, 2024 | Edge-computing, Few-Shot Object Detection | Code Available | 7 |
| Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback | Dec 20, 2024 | All, Instruction Following | Code Available | 7 |
| LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models | Jul 10, 2024 | Video Question Answering, Zero-Shot Video Question Answer | Code Available | 7 |
| Measuring short-form factuality in large language models | Nov 7, 2024 | Form | Code Available | 7 |
| RedPajama: an Open Dataset for Training Large Language Models | Nov 19, 2024 | | Code Available | 7 |
| Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library | Jun 6, 2025 | Management | Code Available | 7 |
| BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents | Apr 16, 2025 | | Code Available | 7 |
| xLSTM: Extended Long Short-Term Memory | May 7, 2024 | Language Modeling, Language Modelling | Code Available | 7 |