| CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs | Sep 19, 2024 | GPU | Code Available | 1 | 5 |
| LOGO -- Long cOntext aliGnment via efficient preference Optimization | Oct 24, 2024 | GPU, Language Modeling | Code Available | 1 | 5 |
| Dynamic GPU Energy Optimization for Machine Learning Training Workloads | Jan 5, 2022 | BIG-bench Machine Learning, GPU | Code Available | 1 | 5 |
| DAGER: Exact Gradient Inversion for Large Language Models | May 24, 2024 | Decoder, Federated Learning | Code Available | 1 | 5 |
| Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Jan 17, 2024 | Dynamic neural networks, GPU | Code Available | 1 | 5 |
| Transformer Tracking | Mar 29, 2021 | GPU, Object Tracking | Code Available | 1 | 5 |
| Dynamic Low-Rank Sparse Adaptation for Large Language Models | Feb 20, 2025 | CPU, GPU | Code Available | 1 | 5 |
| Efficient Quantized Sparse Matrix Operations on Tensor Cores | Sep 14, 2022 | GPU, Quantization | Code Available | 1 | 5 |
| LiteTrack: Layer Pruning with Asynchronous Feature Extraction for Lightweight and Efficient Visual Tracking | Sep 17, 2023 | GPU, Visual Tracking | Code Available | 1 | 5 |
| CrAM: A Compression-Aware Minimizer | Jul 28, 2022 | GPU, Image Classification | Code Available | 1 | 5 |