| Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting | May 29, 2025 | 3D Scene Reconstruction, GPU | Code Available | 1 |
| Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design | May 28, 2025 | GPU, Quantization | Code Available | 1 |
| Minute-Long Videos with Dual Parallelisms | May 27, 2025 | Denoising, GPU | Code Available | 1 |
| TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization | May 26, 2025 | CPU, GPU | Code Available | 1 |
| ADGSyn: Dual-Stream Learning for Efficient Anticancer Drug Synergy Prediction | May 25, 2025 | GPU | Code Available | 1 |
| CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark | May 22, 2025 | GPU, Translation | Code Available | 1 |
| PICT -- A Differentiable, GPU-Accelerated Multi-Block PISO Solver for Simulation-Coupled Learning Tasks in Fluid Dynamics | May 22, 2025 | GPU | Code Available | 1 |
| JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | May 22, 2025 | GPU, Long-range modeling | Code Available | 1 |
| Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis | May 20, 2025 | GPU, parameter-efficient fine-tuning | Code Available | 1 |
| MVAR: Visual Autoregressive Modeling with Scale and Spatial Markovian Conditioning | May 19, 2025 | GPU | Code Available | 1 |
| Fine-tuning Quantized Neural Networks with Zeroth-order Optimization | May 19, 2025 | GPU, Quantization | Code Available | 1 |
| LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference | May 18, 2025 | GPU, Retrieval | Code Available | 1 |
| Tiny QA Benchmark++: Ultra-Lightweight, Synthetic Multilingual Dataset Generation & Smoke-Tests for Continuous LLM Evaluation | May 17, 2025 | Dataset Generation, GPU | Code Available | 1 |
| Flash Invariant Point Attention | May 16, 2025 | GPU | Code Available | 1 |
| SpecOffload: Unlocking Latent GPU Capacity for LLM Inference on Resource-Constrained Devices | May 15, 2025 | CPU, GPU | Code Available | 1 |
| FlashMLA-ETAP: Efficient Transpose Attention Pipeline for Accelerating MLA Inference on NVIDIA H20 GPUs | May 13, 2025 | GPU | Code Available | 1 |
| JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes | May 10, 2025 | Benchmarking, GPU | Code Available | 1 |
| Fast Differentiable Modal Simulation of Non-linear Strings, Membranes, and Plates | May 9, 2025 | Audio Synthesis, CPU | Code Available | 1 |
| Taming the Titans: A Survey of Efficient LLM Inference Serving | Apr 28, 2025 | GPU, Miscellaneous | Code Available | 1 |
| Mesh-Learner: Texturing Mesh with Spherical Harmonics | Apr 28, 2025 | 3D Reconstruction, CPU | Code Available | 1 |
| Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction | Apr 18, 2025 | 3D Object Detection, GPU | Code Available | 1 |
| Data-efficient LLM Fine-tuning for Code Generation | Apr 17, 2025 | Code Generation, GPU | Code Available | 1 |
| Mask Image Watermarking | Apr 17, 2025 | Computational Efficiency, Decoder | Code Available | 1 |
| Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving | Apr 10, 2025 | GPU, Large Language Model | Code Available | 1 |
| HRMedSeg: Unlocking High-resolution Medical Image segmentation via Memory-efficient Attention Modeling | Apr 8, 2025 | Decoder, GPU | Code Available | 1 |