| Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting | May 29, 2025 | 3D Scene Reconstruction, GPU | Code Available | 1 |
| Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design | May 28, 2025 | GPU, Quantization | Code Available | 1 |
| Minute-Long Videos with Dual Parallelisms | May 27, 2025 | Denoising, GPU | Code Available | 1 |
| TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization | May 26, 2025 | CPU, GPU | Code Available | 1 |
| ADGSyn: Dual-Stream Learning for Efficient Anticancer Drug Synergy Prediction | May 25, 2025 | GPU | Code Available | 1 |
| PICT -- A Differentiable, GPU-Accelerated Multi-Block PISO Solver for Simulation-Coupled Learning Tasks in Fluid Dynamics | May 22, 2025 | GPU | Code Available | 1 |
| CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark | May 22, 2025 | GPU, Translation | Code Available | 1 |
| JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | May 22, 2025 | GPU, Long-range modeling | Code Available | 1 |
| Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis | May 20, 2025 | GPU, parameter-efficient fine-tuning | Code Available | 1 |
| Fine-tuning Quantized Neural Networks with Zeroth-order Optimization | May 19, 2025 | GPU, Quantization | Code Available | 1 |
| MVAR: Visual Autoregressive Modeling with Scale and Spatial Markovian Conditioning | May 19, 2025 | GPU | Code Available | 1 |
| LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference | May 18, 2025 | GPU, Retrieval | Code Available | 1 |
| Tiny QA Benchmark++: Ultra-Lightweight, Synthetic Multilingual Dataset Generation & Smoke-Tests for Continuous LLM Evaluation | May 17, 2025 | Dataset Generation, GPU | Code Available | 1 |
| Flash Invariant Point Attention | May 16, 2025 | GPU | Code Available | 1 |
| SpecOffload: Unlocking Latent GPU Capacity for LLM Inference on Resource-Constrained Devices | May 15, 2025 | CPU, GPU | Code Available | 1 |
| FlashMLA-ETAP: Efficient Transpose Attention Pipeline for Accelerating MLA Inference on NVIDIA H20 GPUs | May 13, 2025 | GPU | Code Available | 1 |
| JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes | May 10, 2025 | Benchmarking, GPU | Code Available | 1 |
| Fast Differentiable Modal Simulation of Non-linear Strings, Membranes, and Plates | May 9, 2025 | Audio Synthesis, CPU | Code Available | 1 |
| Mesh-Learner: Texturing Mesh with Spherical Harmonics | Apr 28, 2025 | 3D Reconstruction, CPU | Code Available | 1 |
| Taming the Titans: A Survey of Efficient LLM Inference Serving | Apr 28, 2025 | GPU, Miscellaneous | Code Available | 1 |
| Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction | Apr 18, 2025 | 3D Object Detection, GPU | Code Available | 1 |
| Data-efficient LLM Fine-tuning for Code Generation | Apr 17, 2025 | Code Generation, GPU | Code Available | 1 |
| Mask Image Watermarking | Apr 17, 2025 | Computational Efficiency, Decoder | Code Available | 1 |
| Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving | Apr 10, 2025 | GPU, Large Language Model | Code Available | 1 |
| HRMedSeg: Unlocking High-resolution Medical Image segmentation via Memory-efficient Attention Modeling | Apr 8, 2025 | Decoder, GPU | Code Available | 1 |
| Scaling Graph Neural Networks for Particle Track Reconstruction | Apr 7, 2025 | Edge Classification, GPU | Code Available | 1 |
| Meta-DAN: towards an efficient prediction strategy for page-level handwritten text recognition | Apr 4, 2025 | GPU, Handwritten Text Recognition | Code Available | 1 |
| Quattro: Transformer-Accelerated Iterative Linear Quadratic Regulator Framework for Fast Trajectory Optimization | Apr 2, 2025 | GPU, Model Predictive Control | Code Available | 1 |
| Improved Visual-Spatial Reasoning via R1-Zero-Like Training | Apr 1, 2025 | GPU, Spatial Reasoning | Code Available | 1 |
| A Probabilistic Neuro-symbolic Layer for Algebraic Constraint Satisfaction | Mar 25, 2025 | GPU | Code Available | 1 |
| Efficient Self-Supervised Adaptation for Medical Image Analysis | Mar 24, 2025 | GPU, Medical Image Analysis | Code Available | 1 |
| Empowering Smaller Models: Tuning LLaMA and Gemma with Chain-of-Thought for Ukrainian Exam Tasks | Mar 18, 2025 | GPU, parameter-efficient fine-tuning | Code Available | 1 |
| DIFFVSGG: Diffusion-Driven Online Video Scene Graph Generation | Mar 18, 2025 | Denoising, GPU | Code Available | 1 |
| APLA: A Simple Adaptation Method for Vision Transformers | Mar 14, 2025 | Classification, GPU | Code Available | 1 |
| Low Complexity Point Tracking of the Myocardium in 2D Echocardiography | Mar 13, 2025 | GPU, Point Tracking | Code Available | 1 |
| Toward Lightweight and Fast Decoders for Diffusion Models in Image and Video Generation | Mar 6, 2025 | Decoder, GPU | Code Available | 1 |
| DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting | Mar 4, 2025 | Computational Efficiency, CPU | Code Available | 1 |
| Nature-Inspired Population-Based Evolution of Large Language Models | Mar 3, 2025 | GPU, Zero-shot Generalization | Code Available | 1 |
| DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting | Mar 2, 2025 | CPU, GPU | Code Available | 1 |
| Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking | Mar 1, 2025 | CPU, GPU | Code Available | 1 |
| Oscillation-Reduced MXFP4 Training for Vision Transformers | Feb 28, 2025 | GPU, Quantization | Code Available | 1 |
| Scalable Signature Kernel Computations for Long Time Series via Local Neumann Series Expansions | Feb 27, 2025 | GPU, Time Series | Code Available | 1 |
| Dynamic Low-Rank Sparse Adaptation for Large Language Models | Feb 20, 2025 | CPU, GPU | Code Available | 1 |
| Myna: Masking-Based Contrastive Learning of Musical Representations | Feb 18, 2025 | Contrastive Learning, Data Augmentation | Code Available | 1 |
| AdaSplash: Adaptive Sparse Flash Attention | Feb 17, 2025 | GPU, Language Modeling | Code Available | 1 |
| CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs | Feb 15, 2025 | Computational Efficiency, GPU | Code Available | 1 |
| Small Language Model Makes an Effective Long Text Extractor | Feb 11, 2025 | GPU, Language Modeling | Code Available | 1 |
| Bag of Tricks for Inference-time Computation of LLM Reasoning | Feb 11, 2025 | GPU | Code Available | 1 |
| MERGE^3: Efficient Evolutionary Merging on Consumer-grade GPUs | Feb 9, 2025 | GPU | Code Available | 1 |
| SyMANTIC: An Efficient Symbolic Regression Method for Interpretable and Parsimonious Model Discovery in Science and Beyond | Feb 5, 2025 | feature selection, GPU | Code Available | 1 |