| Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Jun 23, 2025 | Domain AdaptationGPU | CodeCode Available | 3 | 5 |
| EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Jul 10, 2024 | GPUQuantization | CodeCode Available | 3 | 5 |
| Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuray | Feb 7, 2025 | 4kGeneral Knowledge | CodeCode Available | 3 | 5 |
| MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Apr 8, 2024 | GPUMultiple-choice | CodeCode Available | 3 | 5 |
| Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Jun 10, 2024 | 3D Semantic SegmentationComputed Tomography (CT) | CodeCode Available | 3 | 5 |
| 94% on CIFAR-10 in 3.29 Seconds on a Single GPU | Mar 30, 2024 | GPU | CodeCode Available | 3 | 5 |
| LiteGS: A High-Performance Modular Framework for Gaussian Splatting Training | Mar 3, 2025 | 3DGSGPU | CodeCode Available | 3 | 5 |
| Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | Jan 9, 2024 | GPU | CodeCode Available | 3 | 5 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPULanguage Modeling | CodeCode Available | 3 | 5 |
| LinFusion: 1 GPU, 1 Minute, 16K Image | Sep 3, 2024 | 16kCausal Inference | CodeCode Available | 3 | 5 |