| FlashMLA-ETAP: Efficient Transpose Attention Pipeline for Accelerating MLA Inference on NVIDIA H20 GPUs | May 13, 2025 | GPU | Code Available | 1 | 5 |
| Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision | Jun 29, 2022 | Continual Learning, GPU | Code Available | 1 | 5 |
| InferCept: Efficient Intercept Support for Augmented Large Language Model Inference | Feb 2, 2024 | GPU, Language Modeling | Code Available | 1 | 5 |
| ApiQ: Finetuning of 2-Bit Quantized Large Language Model | Feb 7, 2024 | GPU, Language Modeling | Code Available | 1 | 5 |
| Adaptively Placed Multi-Grid Scene Representation Networks for Large-Scale Data Visualization | Jul 16, 2023 | Data Visualization, GPU | Code Available | 1 | 5 |
| ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-Efficient Genome Analysis | Jul 20, 2022 | CPU, GPU | Code Available | 1 | 5 |
| Adaptive Graph Diffusion Networks | Dec 30, 2020 | GPU, Link Prediction | Code Available | 1 | 5 |
| Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts | Jun 5, 2025 | GPU, Scheduling | Code Available | 1 | 5 |
| DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation | Mar 30, 2022 | GPU | Code Available | 1 | 5 |
| APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment | Oct 25, 2020 | CPU, Face Reenactment | Code Available | 1 | 5 |