| BMInf: An Efficient Toolkit for Big Model Inference and Tuning | May 1, 2022 | CPUGPU | CodeCode Available | 2 |
| FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation | Mar 4, 2023 | BenchmarkingGPU | CodeCode Available | 2 |
| From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients | Jul 15, 2024 | GPU | CodeCode Available | 2 |
| Gradient Boosting Reinforcement Learning | Jul 11, 2024 | GPUreinforcement-learning | CodeCode Available | 2 |
| mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs | Dec 5, 2023 | GPULarge Language Model | CodeCode Available | 2 |
| gCastle: A Python Toolbox for Causal Discovery | Nov 30, 2021 | Causal DiscoveryGPU | CodeCode Available | 2 |
| Atom: Low-bit Quantization for Efficient and Accurate LLM Serving | Oct 29, 2023 | GPUQuantization | CodeCode Available | 2 |
| Characterization of Large Language Model Development in the Datacenter | Mar 12, 2024 | GPULanguage Modeling | CodeCode Available | 2 |
| Birbal: An efficient 7B instruct-model fine-tuned with curated datasets | Mar 4, 2024 | GPU | CodeCode Available | 2 |
| BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache | Mar 24, 2025 | Computational EfficiencyGPU | CodeCode Available | 2 |
| HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading | Feb 18, 2025 | Computational EfficiencyCPU | CodeCode Available | 2 |
| Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow | Jun 3, 2024 | GPULanguage Modeling | CodeCode Available | 2 |
| BiFormer: Vision Transformer with Bi-Level Routing Attention | Mar 15, 2023 | Computational EfficiencyGPU | CodeCode Available | 2 |
| Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity | Sep 19, 2023 | GPU | CodeCode Available | 2 |
| Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient | Nov 26, 2024 | GPUImage Generation | CodeCode Available | 2 |
| CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra | Sep 6, 2023 | CoLAGaussian Processes | CodeCode Available | 2 |
| FinRL-Meta: A Universe of Near-Real Market Environments for Data-Driven Deep Reinforcement Learning in Quantitative Finance | Dec 13, 2021 | Deep Reinforcement LearningGPU | CodeCode Available | 2 |
| FeNNol: an Efficient and Flexible Library for Building Force-field-enhanced Neural Network Potentials | May 2, 2024 | GPU | CodeCode Available | 2 |
| CoLLiE: Collaborative Training of Large Language Models in an Efficient Way | Dec 1, 2023 | GPUparameter-efficient fine-tuning | CodeCode Available | 2 |
| HybridDepth: Robust Metric Depth Fusion by Leveraging Depth from Focus and Single-Image Priors | Jul 26, 2024 | Depth EstimationGPU | CodeCode Available | 2 |
| Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference | May 27, 2023 | GPUImage Generation | CodeCode Available | 2 |
| FeatureBooster: Boosting Feature Descriptors with a Lightweight Neural Network | Nov 28, 2022 | GPUVisual Localization | CodeCode Available | 2 |
| AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and state space models | Nov 11, 2024 | Audio Super-ResolutionGPU | CodeCode Available | 2 |
| A Tensor Compiler for Unified Machine Learning Prediction Serving | Oct 9, 2020 | BIG-bench Machine LearningCPU | CodeCode Available | 2 |
| Feature Pyramid Networks for Object Detection | Dec 9, 2016 | GPUObject | CodeCode Available | 2 |