| Kozax: Flexible and Scalable Genetic Programming in JAX | Feb 5, 2025 | GPU | Code Available | 1 |
| Work-Efficient Parallel Non-Maximum Suppression Kernels | Feb 1, 2025 | GPU, object-detection | Code Available | 1 |
| Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models | Jan 31, 2025 | GPU, Quantization | Code Available | 1 |
| Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior | Jan 31, 2025 | GPU | Code Available | 1 |
| Return of the Encoder: Maximizing Parameter Efficiency for SLMs | Jan 27, 2025 | Computational Efficiency, CPU | Code Available | 1 |
| CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Jan 14, 2025 | Deep Reinforcement Learning, GPU | Code Available | 1 |
| Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping | Jan 11, 2025 | GPU, Large Language Model | Code Available | 1 |
| MS-Temba : Multi-Scale Temporal Mamba for Efficient Temporal Action Detection | Jan 10, 2025 | Action Detection, GPU | Code Available | 1 |
| LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations | Jan 5, 2025 | GPU | Code Available | 1 |
| RadarNeXt: Real-Time and Reliable 3D Object Detector Based On 4D mmWave Imaging Radar | Jan 4, 2025 | 3D Object Detection, 3D Object Detection (RoI) | Code Available | 1 |