| AudioLCM: Text-to-Audio Generation with Latent Consistency Models | Jun 1, 2024 | Audio GenerationAudio Synthesis | CodeCode Available | 5 |
| Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments | Jan 10, 2023 | GPUImitation Learning | CodeCode Available | 5 |
| DEIM: DETR with Improved Matching for Fast Convergence | Dec 5, 2024 | Data AugmentationGPU | CodeCode Available | 5 |
| MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention | Apr 22, 2025 | GPU | CodeCode Available | 5 |
| Fast On-device LLM Inference with NPUs | Jul 8, 2024 | CPUGPU | CodeCode Available | 5 |
| FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning | Feb 29, 2024 | GPULanguage Modeling | CodeCode Available | 5 |
| FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation | Oct 16, 2024 | Audio GenerationGPU | CodeCode Available | 5 |
| FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion | Jun 11, 2024 | GPU | CodeCode Available | 5 |
| MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models | Aug 21, 2024 | GPUQuantization | CodeCode Available | 5 |
| LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models | Nov 8, 2023 | 8kGPU | CodeCode Available | 5 |