| KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Jan 31, 2024 | GPUQuantization | CodeCode Available | 3 |
| Cramming: Training a Language Model on a Single GPU in One Day | Dec 28, 2022 | GPULanguage Modeling | CodeCode Available | 3 |
| Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning | Feb 26, 2024 | GPUMinecraft | CodeCode Available | 3 |
| InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Aug 28, 2024 | Cell SegmentationGPU | CodeCode Available | 3 |
| Allo: A Programming Model for Composable Accelerator Design | Apr 7, 2024 | GPUHigh-Level Synthesis | CodeCode Available | 3 |
| GPU-accelerated Evolutionary Multiobjective Optimization Using Tensorized RVEA | Apr 1, 2024 | GPUMultiobjective Optimization | CodeCode Available | 3 |
| Data Generation for Hardware-Friendly Post-Training Quantization | Oct 29, 2024 | Data AugmentationGPU | CodeCode Available | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPULanguage Modeling | CodeCode Available | 3 |
| IMDL-BenCo: A Comprehensive Benchmark and Codebase for Image Manipulation Detection & Localization | Jun 15, 2024 | GPUImage Manipulation | CodeCode Available | 3 |
| Consistency Models Made Easy | Jun 20, 2024 | Computational EfficiencyGPU | CodeCode Available | 3 |