| Generalizable, real-time neural decoding with hybrid state-space models | Jun 5, 2025 | GPU, State Space Models | Unverified | 0 |
| Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos | Jun 5, 2025 | GPU, Semantic Segmentation | Code Available | 2 |
| Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts | Jun 5, 2025 | GPU, Scheduling | Code Available | 1 |
| MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | Jun 5, 2025 | GPU, Relation | Code Available | 9 |
| Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis | Jun 5, 2025 | GPU, Multi-Label Classification | Unverified | 0 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Jun 5, 2025 | 16k, CPU | Code Available | 3 |
| Similarity-based fuzzy clustering scientific articles: potentials and challenges from mathematical and computational perspectives | Jun 4, 2025 | Articles, GPU | Unverified | 0 |
| High-Speed Ultra-Energy-Efficient Memristor-Based Massive MIMO SIC Detector Circuit with Hybrid Analog-Digital Computing Architecture | Jun 4, 2025 | GPU | Unverified | 0 |
| FALO: Fast and Accurate LiDAR 3D Object Detection on Resource-Constrained Devices | Jun 4, 2025 | 3D Object Detection, GPU | Unverified | 0 |
| Diffusion Buffer: Online Diffusion-based Speech Enhancement with Sub-Second Latency | Jun 3, 2025 | GPU, Speech Enhancement | Unverified | 0 |
| VTGaussian-SLAM: RGBD SLAM for Large Scale Scenes with Splatting View-Tied 3D Gaussians | Jun 3, 2025 | GPU, Simultaneous Localization and Mapping | Unverified | 0 |
| Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem | Jun 3, 2025 | GPU, Math | Unverified | 0 |
| SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Jun 2, 2025 | Action Generation, GPU | Code Available | 11 |
| COALESCE: Economic and Security Dynamics of Skill-Based Task Outsourcing Among Team of Autonomous LLM Agents | Jun 2, 2025 | GPU, Large Language Model | Unverified | 0 |
| Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization | May 30, 2025 | GPU, Knowledge Distillation | Unverified | 0 |
| Recipes for Pre-training LLMs with MXFP8 | May 30, 2025 | GPU | Unverified | 0 |
| Pushing the Limits of Beam Search Decoding for Transducer-based ASR models | May 30, 2025 | GPU | Unverified | 0 |
| NUC-Net: Non-uniform Cylindrical Partition Network for Efficient LiDAR Semantic Segmentation | May 30, 2025 | Autonomous Driving, GPU | Code Available | 0 |
| AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | May 30, 2025 | GPU, Math | Code Available | 7 |
| TSENOR: Highly-Efficient Algorithm for Finding Transposable N:M Sparse Masks | May 29, 2025 | GPU, Network Pruning | Unverified | 0 |
| LlamaRL: A Distributed Asynchronous Reinforcement Learning Framework for Efficient Large-scale LLM Trainin | May 29, 2025 | GPU, Reinforcement Learning (RL) | Unverified | 0 |
| LoLA: Low-Rank Linear Attention With Sparse Caching | May 29, 2025 | 4k, 8k | Unverified | 0 |
| Accelerating AllReduce with a Persistent Straggler | May 29, 2025 | GPU | Code Available | 1 |
| LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics | May 29, 2025 | GPU | Unverified | 0 |
| CF-DETR: Coarse-to-Fine Transformer for Real-Time Object Detection | May 29, 2025 | GPU, object-detection | Unverified | 0 |