| REDUCIO! Generating 1024×1024 Video within 16 Seconds using Extremely Compressed Motion Latents | Nov 20, 2024 | GPU, Video Generation | Code Available | 3 |
| Data Generation for Hardware-Friendly Post-Training Quantization | Oct 29, 2024 | Data Augmentation, GPU | Code Available | 3 |
| ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference | Oct 28, 2024 | CPU | Code Available | 3 |
| Modular Duality in Deep Learning | Oct 28, 2024 | Deep Learning, GPU | Code Available | 3 |
| Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss | Oct 22, 2024 | GPU, Representation Learning | Code Available | 3 |
| MagicPIG: LSH Sampling for Efficient LLM Generation | Oct 21, 2024 | CPU, GPU | Code Available | 3 |
| CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation | Oct 12, 2024 | Conditional Image Generation, GPU | Code Available | 3 |
| High-Speed Stereo Visual SLAM for Low-Powered Computing Devices | Oct 5, 2024 | GPU | Code Available | 3 |
| SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation | Oct 4, 2024 | 16k, Code Generation | Code Available | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPU, Language Modeling | Code Available | 3 |