| Theseus: A Library for Differentiable Nonlinear Optimization | Jul 19, 2022 | GPU | Code Available | 4 | 5 |
| Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Jun 4, 2024 | Face Swapping, GPU | Code Available | 4 | 5 |
| NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Jul 18, 2024 | Experimental Design, GPU | Code Available | 4 | 5 |
| OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | May 12, 2025 | GPU, Privacy Preserving | Code Available | 4 | 5 |
| Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | Jan 27, 2023 | GPU, Image Generation | Code Available | 4 | 5 |
| QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving | May 7, 2024 | GPU, Language Modelling | Code Available | 4 | 5 |
| Building reliable sim driving agents by scaling self-play | Feb 20, 2025 | Autonomous Vehicles, Benchmarking | Code Available | 4 | 5 |
| Multi-head Temporal Latent Attention | May 19, 2025 | GPU, speech-recognition | Code Available | 4 | 5 |
| High-Resolution Image Synthesis with Latent Diffusion Models | Dec 20, 2021 | Denoising, GPU | Code Available | 4 | 5 |
| On Scaling Up 3D Gaussian Splatting Training | Jun 26, 2024 | 3DGS, 3D Reconstruction | Code Available | 4 | 5 |
| 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Apr 15, 2025 | CPU, GPU | Code Available | 4 | 5 |
| FedML Parrot: A Scalable Federated Learning System via Heterogeneity-aware Scheduling on Sequential and Hierarchical Training | Mar 3, 2023 | Federated Learning, GPU | Code Available | 4 | 5 |
| DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality | Oct 25, 2022 | Deep Reinforcement Learning, GPU | Code Available | 4 | 5 |
| MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts | Oct 9, 2024 | GPU, Mixture-of-Experts | Code Available | 4 | 5 |
| Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training | Oct 28, 2021 | Deep Learning, GPU | Code Available | 4 | 5 |
| FFCV: Accelerating Training by Removing Data Bottlenecks | Jun 21, 2023 | CPU, GPU | Code Available | 4 | 5 |
| Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Apr 15, 2025 | GPU, Inference Optimization | Code Available | 4 | 5 |
| JAX-Fluids 2.0: Towards HPC for Differentiable CFD of Compressible Two-phase Flows | Feb 7, 2024 | GPU | Code Available | 4 | 5 |
| fastai: A Layered API for Deep Learning | Feb 11, 2020 | Deep Learning, GPU | Code Available | 4 | 5 |
| Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference | Oct 6, 2023 | GPU, Image Generation | Code Available | 4 | 5 |
| 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering | Oct 12, 2023 | Dynamic Reconstruction, GPU | Code Available | 4 | 5 |
| EvoX: A Distributed GPU-accelerated Framework for Scalable Evolutionary Computation | Jan 29, 2023 | GPU, Navigate | Code Available | 4 | 5 |
| Mamba-FETrack: Frame-Event Tracking via State Space Model | Apr 28, 2024 | GPU, Mamba | Code Available | 4 | 5 |
| LCM-LoRA: A Universal Stable-Diffusion Acceleration Module | Nov 9, 2023 | GPU, Image Generation | Code Available | 4 | 5 |
| Accelerating Visual-Policy Learning through Parallel Differentiable Simulation | May 15, 2025 | GPU | Code Available | 4 | 5 |