| Short-Range Dependency Effects on Transformer Instability and a Decomposed Attention Solution | May 21, 2025 | GPU, Language Modeling | Unverified | 0 |
| Small Language Models in the Real World: Insights from Industrial Text Classification | May 21, 2025 | Classification, Decoder | Unverified | 0 |
| RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation | May 21, 2025 | GPU, Natural Language Queries | Unverified | 0 |
| DeepCEE: Efficient Cross-Region Model Distributed Training System under Heterogeneous GPUs and Networks | May 21, 2025 | GPU, Philosophy | Unverified | 0 |
| Flashback: Memory-Driven Zero-shot, Real-time Video Anomaly Detection | May 21, 2025 | Anomaly Detection, GPU | Unverified | 0 |
| Guidelines for the Quality Assessment of Energy-Aware NAS Benchmarks | May 21, 2025 | Benchmarking, GPU | Unverified | 0 |
| ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs | May 20, 2025 | GPU, Large Language Model | Unverified | 0 |
| UHD Image Dehazing via anDehazeFormer with Atmospheric-aware KV Cache | May 20, 2025 | 4k, 8k | Unverified | 0 |
| Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity | May 20, 2025 | GPU, Large Language Model | Code Available | 0 |
| Balanced and Elastic End-to-end Training of Dynamic LLMs | May 20, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| 4D-ROLLS: 4D Radar Occupancy Learning via LiDAR Supervision | May 20, 2025 | Autonomous Vehicles, BEV Segmentation | Code Available | 0 |
| TSPulse: Dual Space Tiny Pre-Trained Models for Rapid Time-Series Analysis | May 19, 2025 | Anomaly Detection, Disentanglement | Unverified | 0 |
| FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference | May 19, 2025 | CPU, GPU | Unverified | 0 |
| Half Search Space is All You Need | May 19, 2025 | All, GPU | Unverified | 0 |
| Frozen Backpropagation: Relaxing Weight Symmetry in Temporally-Coded Deep Spiking Neural Networks | May 19, 2025 | GPU | Code Available | 0 |
| CALM: Co-evolution of Algorithms and Language Model for Automatic Heuristic Design | May 18, 2025 | GPU, Language Modeling | Unverified | 0 |
| A Case for Library-Level k-Means Binning in Histogram Gradient-Boosted Trees | May 18, 2025 | GPU | Code Available | 0 |
| HybridServe: Efficient Serving of Large AI Models with Confidence-Based Cascade Routing | May 18, 2025 | GPU | Unverified | 0 |
| ZenFlow: Enabling Stall-Free Offloading Training via Asynchronous Updates | May 18, 2025 | CPU, GPU | Unverified | 0 |
| VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold | May 18, 2025 | GPU | Unverified | 0 |
| HessFormer: Hessians at Foundation Scale | May 16, 2025 | GPU | Unverified | 0 |
| Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity | May 16, 2025 | GPU | Unverified | 0 |
| From Hand-Crafted Metrics to Evolved Training-Free Performance Predictors for Neural Architecture Search via Genetic Programming | May 16, 2025 | GPU, Neural Architecture Search | Unverified | 0 |
| Entropy-Driven Genetic Optimization for Deep-Feature-Guided Low-Light Image Enhancement | May 16, 2025 | GPU, Image Enhancement | Code Available | 0 |
| Gaussian Weight Sampling for Scalable, Efficient and Stable Pseudo-Quantization Training | May 16, 2025 | GPU, Quantization | Unverified | 0 |
| Single-shot prediction of parametric partial differential equations | May 14, 2025 | CPU, GPU | Unverified | 0 |
| Generative Molecular Design with Steerable and Granular Synthesizability Control | May 13, 2025 | GPU | Unverified | 0 |
| Scaling Multi Agent Reinforcement Learning for Underwater Acoustic Tracking via Autonomous Vehicles | May 13, 2025 | Autonomous Vehicles, GPU | Unverified | 0 |
| AI Accelerators for Large Language Model Inference: Architecture Analysis and Scaling Strategies | May 13, 2025 | GPU, Language Modeling | Unverified | 0 |
| Fused3S: Fast Sparse Attention on Tensor Cores | May 12, 2025 | GPU | Code Available | 0 |
| Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption | May 12, 2025 | GPU, Knowledge Base Question Answering | Unverified | 0 |
| L-SWAG: Layer-Sample Wise Activation with Gradients information for Zero-Shot NAS on Vision Transformers | May 12, 2025 | GPU, Neural Architecture Search | Unverified | 0 |
| Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | May 12, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| SLAG: Scalable Language-Augmented Gaussian Splatting | May 12, 2025 | GPU, Language Modeling | Unverified | 0 |
| On the Cost and Benefits of Training Context with Utterance or Full Conversation Training: A Comparative Stud | May 12, 2025 | GPU, Hallucination | Unverified | 0 |
| Matrix Is All You Need | May 11, 2025 | All, GPU | Unverified | 0 |
| Streaming Krylov-Accelerated Stochastic Gradient Descent | May 11, 2025 | GPU, Stochastic Optimization | Unverified | 0 |
| QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration | May 10, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Challenging GPU Dominance: When CPUs Outperform for On-Device LLM Inference | May 9, 2025 | CPU, GPU | Unverified | 0 |
| FloE: On-the-Fly MoE Inference on Memory-constrained GPU | May 9, 2025 | CPU, GPU | Unverified | 0 |
| UltraGauss: Ultrafast Gaussian Reconstruction of 3D Ultrasound Volumes | May 8, 2025 | 3D Reconstruction, Computational Efficiency | Unverified | 0 |
| Boosting Performance on ARC is a Matter of Perspective | May 8, 2025 | ARC, GPU | Unverified | 0 |
| Steepest Descent Density Control for Compact 3D Gaussian Splatting | May 8, 2025 | 3DGS, GPU | Unverified | 0 |
| Supporting renewable energy planning and operation with data-driven high-resolution ensemble weather forecast | May 7, 2025 | CPU, GPU | Unverified | 0 |
| Leveraging Simultaneous Usage of Edge GPU Hardware Engines for Video Face Detection and Recognition | May 7, 2025 | Face Detection, Face Recognition | Unverified | 0 |
| Plexus: Taming Billion-edge Graphs with 3D Parallel GNN Training | May 7, 2025 | CPU, GPU | Unverified | 0 |
| Edge-GPU Based Face Tracking for Face Detection and Recognition Acceleration | May 7, 2025 | CPU, Face Detection | Unverified | 0 |
| LONGER: Scaling Up Long Sequence Modeling in Industrial Recommenders | May 7, 2025 | GPU, Recommendation Systems | Unverified | 0 |
| AnomalyMatch: Discovering Rare Objects of Interest with Semi-supervised and Active Learning | May 6, 2025 | Active Learning, Anomaly Detection | Code Available | 0 |
| Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving | May 6, 2025 | GPU, Scheduling | Unverified | 0 |