| Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis | Jun 5, 2025 | GPU, Multi-Label Classification | Unverified | 0 |
| Generalizable, real-time neural decoding with hybrid state-space models | Jun 5, 2025 | GPU, State Space Models | Unverified | 0 |
| Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos | Jun 5, 2025 | GPU, Semantic Segmentation | Code Available | 2 |
| Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers | Jun 5, 2025 | GPU, Text-to-Video Generation | Unverified | 0 |
| MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | Jun 5, 2025 | GPU, Relation | Code Available | 9 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Jun 5, 2025 | 16k, CPU | Code Available | 3 |
| High-Speed Ultra-Energy-Efficient Memristor-Based Massive MIMO SIC Detector Circuit with Hybrid Analog-Digital Computing Architecture | Jun 4, 2025 | GPU | Unverified | 0 |
| Similarity-based fuzzy clustering scientific articles: potentials and challenges from mathematical and computational perspectives | Jun 4, 2025 | Articles, GPU | Unverified | 0 |
| FALO: Fast and Accurate LiDAR 3D Object Detection on Resource-Constrained Devices | Jun 4, 2025 | 3D Object Detection, GPU | Unverified | 0 |
| Diffusion Buffer: Online Diffusion-based Speech Enhancement with Sub-Second Latency | Jun 3, 2025 | GPU, Speech Enhancement | Unverified | 0 |
| VTGaussian-SLAM: RGBD SLAM for Large Scale Scenes with Splatting View-Tied 3D Gaussians | Jun 3, 2025 | GPU, Simultaneous Localization and Mapping | Unverified | 0 |
| Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem | Jun 3, 2025 | GPU, Math | Unverified | 0 |
| SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Jun 2, 2025 | Action Generation, GPU | Code Available | 11 |
| COALESCE: Economic and Security Dynamics of Skill-Based Task Outsourcing Among Team of Autonomous LLM Agents | Jun 2, 2025 | GPU, Large Language Model | Unverified | 0 |
| Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization | May 30, 2025 | GPU, Knowledge Distillation | Unverified | 0 |
| Recipes for Pre-training LLMs with MXFP8 | May 30, 2025 | GPU | Unverified | 0 |
| Pushing the Limits of Beam Search Decoding for Transducer-based ASR models | May 30, 2025 | GPU | Unverified | 0 |
| NUC-Net: Non-uniform Cylindrical Partition Network for Efficient LiDAR Semantic Segmentation | May 30, 2025 | Autonomous Driving, GPU | Code Available | 0 |
| AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | May 30, 2025 | GPU, Math | Code Available | 7 |
| TSENOR: Highly-Efficient Algorithm for Finding Transposable N:M Sparse Masks | May 29, 2025 | GPU, Network Pruning | Unverified | 0 |
| LlamaRL: A Distributed Asynchronous Reinforcement Learning Framework for Efficient Large-scale LLM Trainin | May 29, 2025 | GPU, Reinforcement Learning (RL) | Unverified | 0 |
| LoLA: Low-Rank Linear Attention With Sparse Caching | May 29, 2025 | 4k, 8k | Unverified | 0 |
| CF-DETR: Coarse-to-Fine Transformer for Real-Time Object Detection | May 29, 2025 | GPU, object-detection | Unverified | 0 |
| Accelerating AllReduce with a Persistent Straggler | May 29, 2025 | GPU | Code Available | 1 |
| Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting | May 29, 2025 | 3D Scene Reconstruction, GPU | Code Available | 1 |
| LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering | May 29, 2025 | 3DGS, GPU | Unverified | 0 |
| ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS | May 29, 2025 | 3DGS, GPU | Code Available | 2 |
| LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics | May 29, 2025 | GPU | Unverified | 0 |
| Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design | May 28, 2025 | GPU, Quantization | Code Available | 1 |
| Fast Feature Matching of UAV Images via Matrix Band Reduction-based GPU Data Schedule | May 28, 2025 | CPU, GPU | Unverified | 0 |
| SHTOcc: Effective 3D Occupancy Prediction with Sparse Head and Tail Voxels | May 28, 2025 | Autonomous Driving, GPU | Code Available | 0 |
| Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape | May 28, 2025 | GPU | Code Available | 0 |
| NGPU-LM: GPU-Accelerated N-Gram Language Model for Context-Biasing in Greedy ASR Decoding | May 28, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control | May 28, 2025 | GPU, Humanoid Control | Unverified | 0 |
| CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs | May 28, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| Minute-Long Videos with Dual Parallelisms | May 27, 2025 | Denoising, GPU | Code Available | 1 |
| STACI: Spatio-Temporal Aleatoric Conformal Inference | May 27, 2025 | Gaussian Processes, GPU | Unverified | 0 |
| Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks | May 27, 2025 | GPU | Unverified | 0 |
| Fast and Cost-effective Speculative Edge-Cloud Decoding with Early Exits | May 27, 2025 | GPU | Unverified | 0 |
| InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling | May 27, 2025 | Denoising, GPU | Unverified | 0 |
| SwarmThinkers: Learning Physically Consistent Atomic KMC Transitions at Scale | May 26, 2025 | Decision Making, GPU | Unverified | 0 |
| APE: A Data-Centric Benchmark for Efficient LLM Adaptation in Text Summarization | May 26, 2025 | GPU, News Summarization | Code Available | 0 |
| TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization | May 26, 2025 | CPU, GPU | Code Available | 1 |
| FinLoRA: Benchmarking LoRA Methods for Fine-Tuning LLMs on Financial Datasets | May 26, 2025 | Benchmarking, GPU | Code Available | 0 |
| Accelerating Flow-Matching-Based Text-to-Speech via Empirically Pruned Step Sampling | May 26, 2025 | GPU, text-to-speech | Unverified | 0 |
| eACGM: Non-instrumented Performance Tracing and Anomaly Detection towards Machine Learning Systems | May 25, 2025 | Anomaly Detection, Fault Diagnosis | Unverified | 0 |
| TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis | May 25, 2025 | CPU, GPU | Unverified | 0 |
| Triangle Splatting for Real-Time Radiance Field Rendering | May 25, 2025 | GPU, NeRF | Unverified | 0 |
| Advancing Video Self-Supervised Learning via Image Foundation Models | May 25, 2025 | GPU, Representation Learning | Code Available | 0 |
| FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization | May 25, 2025 | Computational Efficiency, CPU | Unverified | 0 |