| Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism | Nov 25, 2022 | GPU | Code Available | 2 |
| Full Parameter Fine-tuning for Large Language Models with Limited Resources | Jun 16, 2023 | GPU, parameter-efficient fine-tuning | Code Available | 2 |
| Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries | May 19, 2024 | 6D Pose Estimation, GPU | Code Available | 2 |
| Fully-fused Multi-Layer Perceptrons on Intel Data Center GPUs | Mar 26, 2024 | GPU, Image Compression | Code Available | 2 |
| Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction | Mar 27, 2024 | 3D Generation, 3DGS | Code Available | 2 |
| H_2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models | Jun 24, 2023 | GPU | Code Available | 2 |
| ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs | Oct 6, 2022 | GPU, Vocal Bursts Intensity Prediction | Code Available | 2 |
| Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation | Sep 2, 2024 | GPU | Code Available | 2 |
| Brain Tumour Removing and Missing Modality Generation using 3D WDM | Nov 7, 2024 | GPU, Prediction | Code Available | 2 |
| FlashRNN: Optimizing Traditional RNNs on Modern Hardware | Dec 10, 2024 | GPU, Logical Reasoning | Code Available | 2 |
| Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework | Jun 23, 2020 | Benchmarking, GPU | Code Available | 2 |
| FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation | Mar 4, 2023 | Benchmarking, GPU | Code Available | 2 |
| FP8-LM: Training FP8 Large Language Models | Oct 27, 2023 | GPU | Code Available | 2 |
| BMInf: An Efficient Toolkit for Big Model Inference and Tuning | May 1, 2022 | CPU, GPU | Code Available | 2 |
| Black-Box Prompt Optimization: Aligning Large Language Models without Model Training | Nov 7, 2023 | GPU | Code Available | 2 |
| BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache | Mar 24, 2025 | Computational Efficiency, GPU | Code Available | 2 |
| Boundary-Aware Segmentation Network for Mobile and Web Applications | Jan 12, 2021 | Camouflaged Object Segmentation, Decoder | Code Available | 2 |
| Birbal: An efficient 7B instruct-model fine-tuned with curated datasets | Mar 4, 2024 | GPU | Code Available | 2 |
| Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity | Sep 19, 2023 | GPU | Code Available | 2 |
| Positive-Unlabeled Compression on the Cloud | Sep 21, 2019 | GPU, Knowledge Distillation | Code Available | 2 |
| FRA-RIR: Fast Random Approximation of the Image-source Method | Aug 8, 2022 | Denoising, GPU | Code Available | 2 |
| FinRL-Meta: A Universe of Near-Real Market Environments for Data-Driven Deep Reinforcement Learning in Quantitative Finance | Dec 13, 2021 | Deep Reinforcement Learning, GPU | Code Available | 2 |
| GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models | Oct 12, 2023 | GPU, Text to 3D | Code Available | 2 |
| GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving | Nov 19, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference | May 27, 2023 | GPU, Image Generation | Code Available | 2 |