| Speed of Light Exact Greedy Decoding for RNN-T Speech Recognition Models on GPU | Jun 6, 2024 | GPU, speech-recognition | Unverified | 0 |
| ReDistill: Residual Encoded Distillation for Peak Memory Reduction | Jun 6, 2024 | Denoising, GPU | Unverified | 0 |
| Queue management for slo-oriented large language model serving | Jun 5, 2024 | Blocking, GPU | Code Available | 1 |
| Searching Priors Makes Text-to-Video Synthesis Better | Jun 5, 2024 | GPU | Unverified | 0 |
| A Flexible Recursive Network for Video Stereo Matching Based on Residual Estimation | Jun 5, 2024 | GPU, Stereo Matching | Code Available | 0 |
| Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity | Jun 5, 2024 | GPU, Quantization | Unverified | 0 |
| A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection | Jun 5, 2024 | Anomaly Detection, Benchmarking | Unverified | 0 |
| Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control | Jun 4, 2024 | Bandwidth Extension, CPU | Code Available | 2 |
| Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning | Jun 4, 2024 | document understanding, GPU | Code Available | 1 |
| Scalable MatMul-free Language Modeling | Jun 4, 2024 | GPU, Language Modeling | Code Available | 7 |