| EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | May 29, 2022 | Autonomous Driving, CPU | Code Available | 4 | 5 |
| FFCV: Accelerating Training by Removing Data Bottlenecks | Jun 21, 2023 | CPU, GPU | Code Available | 4 | 5 |
| Look Once to Hear: Target Speech Hearing with Noisy Examples | May 10, 2024 | CPU, Speech Extraction | Code Available | 4 | 5 |
| The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models | Mar 14, 2022 | CPU, Quantization | Code Available | 4 | 5 |
| T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge | Jun 25, 2024 | Computational Efficiency, CPU | Code Available | 4 | 5 |
| Data-Prep-Kit: getting your data ready for LLM application development | Sep 26, 2024 | CPU, Language Modeling | Code Available | 4 | 5 |
| Vidur: A Large-Scale Simulation Framework For LLM Inference | May 8, 2024 | CPU, GPU | Code Available | 4 | 5 |
| Couler: Unified Machine Learning Workflow Optimization in Cloud | Mar 12, 2024 | CPU | Code Available | 4 | 5 |
| InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems | Oct 21, 2024 | Automated Theorem Proving, CPU | Code Available | 4 | 5 |
| DAMO-YOLO : A Report on Real-Time Object Detection Design | Nov 23, 2022 | CPU, Neural Architecture Search | Code Available | 4 | 5 |
| DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Jun 30, 2022 | CPU, GPU | Code Available | 4 | 5 |
| GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Apr 8, 2025 | Computational Efficiency, CPU | Code Available | 3 | 5 |
| Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Oct 18, 2023 | CPU, GPU | Code Available | 3 | 5 |
| Unlimiformer: Long-Range Transformers with Unlimited Length Input | May 2, 2023 | Book summarization, CPU | Code Available | 3 | 5 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Jun 5, 2025 | 16k, CPU | Code Available | 3 | 5 |
| Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Feb 10, 2024 | CPU, GPU | Code Available | 3 | 5 |
| Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates | Sep 27, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 | 5 |
| A GPU-specialized Inference Parameter Server for Large-Scale Deep Recommendation Models | Oct 17, 2022 | CPU, GPU | Code Available | 3 | 5 |
| SoundStream: An End-to-End Neural Audio Codec | Jul 7, 2021 | CPU, Decoder | Code Available | 3 | 5 |
| Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes | Aug 29, 2017 | BIG-bench Machine Learning, CPU | Code Available | 3 | 5 |
| Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking | Mar 27, 2022 | CPU, Multi-Object Tracking | Code Available | 3 | 5 |
| Nd-BiMamba2: A Unified Bidirectional Architecture for Multi-Dimensional Data Processing | Nov 22, 2024 | Computational Efficiency, CPU | Code Available | 3 | 5 |
| MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Dec 28, 2023 | AutoML, CPU | Code Available | 3 | 5 |
| NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU | May 12, 2024 | CPU, Deep Learning | Code Available | 3 | 5 |
| MagicPIG: LSH Sampling for Efficient LLM Generation | Oct 21, 2024 | CPU, GPU | Code Available | 3 | 5 |