| BaryIR: Learning Multi-Source Unified Representation in Continuous Barycenter Space for Generalizable All-in-One Image Restoration | May 27, 2025 | All, Image Restoration | Code Available | 2 |
| R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing | May 27, 2025 | Math | Code Available | 2 |
| Improved Representation Steering for Language Models | May 27, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Reinforcing General Reasoning without Verifiers | May 27, 2025 | Math, Mathematical Reasoning | Code Available | 2 |
| DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction | May 27, 2025 | Image Generation | Code Available | 2 |
| Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models | May 27, 2025 | Concept Alignment, object-detection | Code Available | 2 |
| UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents | May 27, 2025 | 16k | Code Available | 2 |
| TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state | May 27, 2025 | Mamba, Time Series | Code Available | 2 |
| The Missing Point in Vision Transformers for Universal Image Segmentation | May 26, 2025 | Image Segmentation, Instance Segmentation | Code Available | 2 |
| WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference | May 26, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue | May 26, 2025 | Diagnostic, Question Answering | Code Available | 2 |
| Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression | May 26, 2025 | Zero-shot Generalization | Code Available | 2 |
| Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects | May 26, 2025 | Autonomous Driving, Logical Reasoning | Code Available | 2 |
| CSTrack: Enhancing RGB-X Tracking via Compact Spatiotemporal Features | May 26, 2025 | | Code Available | 2 |
| Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities | May 26, 2025 | Knowledge Graphs, Natural Language Understanding | Code Available | 2 |
| MFA-KWS: Effective Keyword Spotting with Multi-head Frame-asynchronous Decoding | May 26, 2025 | Keyword Spotting | Code Available | 2 |
| Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment | May 26, 2025 | text-to-speech, Text to Speech | Code Available | 2 |
| AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models | May 26, 2025 | | Code Available | 2 |
| SAEs Are Good for Steering -- If You Select the Right Features | May 26, 2025 | | Code Available | 2 |
| FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching | May 26, 2025 | Quantization, Speech Enhancement | Code Available | 2 |
| The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project | May 26, 2025 | | Code Available | 2 |
| A Lightweight Hybrid Dual Channel Speech Enhancement System under Low-SNR Conditions | May 26, 2025 | Speech Enhancement | Code Available | 2 |
| EmoSphere-SER: Enhancing Speech Emotion Recognition Through Spherical Representation with Auxiliary Classification | May 26, 2025 | Emotion Recognition, regression | Code Available | 2 |
| Training-Free Multi-Step Audio Source Separation | May 26, 2025 | Audio Source Separation, Denoising | Code Available | 2 |
| Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning | May 26, 2025 | Decision Making, Hierarchical Reinforcement Learning | Code Available | 2 |