| MagicPIG: LSH Sampling for Efficient LLM Generation | Oct 21, 2024 | CPUGPU | CodeCode Available | 3 |
| CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation | Oct 12, 2024 | Conditional Image GenerationGPU | CodeCode Available | 3 |
| High-Speed Stereo Visual SLAM for Low-Powered Computing Devices | Oct 5, 2024 | GPU | CodeCode Available | 3 |
| SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation | Oct 4, 2024 | 16kCode Generation | CodeCode Available | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPULanguage Modeling | CodeCode Available | 3 |
| Simple and Fast Distillation of Diffusion Models | Sep 29, 2024 | GPUImage Generation | CodeCode Available | 3 |
| 3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt | Sep 19, 2024 | 3DGSGPU | CodeCode Available | 3 |
| LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture | Sep 4, 2024 | GPUMamba | CodeCode Available | 3 |
| LinFusion: 1 GPU, 1 Minute, 16K Image | Sep 3, 2024 | 16kCausal Inference | CodeCode Available | 3 |
| InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Aug 28, 2024 | Cell SegmentationGPU | CodeCode Available | 3 |
| The Mamba in the Llama: Distilling and Accelerating Hybrid Models | Aug 27, 2024 | GPULanguage Modeling | CodeCode Available | 3 |
| OctFusion: Octree-based Diffusion Models for 3D Shape Generation | Aug 27, 2024 | 3D Generation3D Shape Generation | CodeCode Available | 3 |
| Accelerating Goal-Conditioned RL Algorithms and Research | Aug 20, 2024 | GPUreinforcement-learning | CodeCode Available | 3 |
| ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models | Aug 16, 2024 | GPUModel Compression | CodeCode Available | 3 |
| LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Aug 10, 2024 | GPULanguage Modelling | CodeCode Available | 3 |
| UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling | Aug 9, 2024 | GPULanguage Modeling | CodeCode Available | 3 |
| Practical Video Object Detection via Feature Selection and Aggregation | Jul 29, 2024 | feature selectionGPU | CodeCode Available | 3 |
| vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving | Jul 22, 2024 | CPUGPU | CodeCode Available | 3 |
| Inference Performance Optimization for Large Language Models on CPUs | Jul 10, 2024 | CPUGPU | CodeCode Available | 3 |
| EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Jul 10, 2024 | GPUQuantization | CodeCode Available | 3 |
| Consistency Models Made Easy | Jun 20, 2024 | Computational EfficiencyGPU | CodeCode Available | 3 |
| VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models | Jun 19, 2024 | GPULanguage Modeling | CodeCode Available | 3 |
| IMDL-BenCo: A Comprehensive Benchmark and Codebase for Image Manipulation Detection & Localization | Jun 15, 2024 | GPUImage Manipulation | CodeCode Available | 3 |
| AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring | Jun 13, 2024 | DeblurringDecoder | CodeCode Available | 3 |
| Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Jun 10, 2024 | 3D Semantic SegmentationComputed Tomography (CT) | CodeCode Available | 3 |