| Normalizing Flows are Capable Models for RL | May 29, 2025 | Imitation Learning, Reinforcement Learning (RL) | Code Available | 1 |
| Proximal Algorithm Unrolling: Flexible and Efficient Reconstruction Networks for Single-Pixel Imaging | May 29, 2025 | | Code Available | 1 |
| 3DGEER: Exact and Efficient Volumetric Rendering with 3D Gaussians | May 29, 2025 | 3DGS, Neural Rendering | Code Available | 1 |
| ProDiff: Prototype-Guided Diffusion for Minimal Information Trajectory Imputation | May 29, 2025 | Denoising, Imputation | Code Available | 1 |
| VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation | May 29, 2025 | Caption Generation, Language Modeling | Code Available | 1 |
| TimePoint: Accelerated Time Series Alignment via Self-Supervised Keypoint and Descriptor Learning | May 29, 2025 | Dynamic Time Warping, Keypoint Detection | Code Available | 1 |
| Context Robust Knowledge Editing for Language Models | May 29, 2025 | knowledge editing | Code Available | 1 |
| Toward Memory-Aided World Models: Benchmarking via Spatial Consistency | May 29, 2025 | Benchmarking, Minecraft | Code Available | 1 |
| DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | May 29, 2025 | Automated Theorem Proving, Mathematical Reasoning | Code Available | 1 |
| SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents | May 29, 2025 | Adversarial Attack, Large Language Model | Code Available | 1 |
| Data-to-Dashboard: Multi-Agent LLM Framework for Insightful Visualization in Enterprise Analytics | May 29, 2025 | | Code Available | 1 |
| Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition | May 29, 2025 | Handwritten Mathematical Expression Recognition, Language Modeling | Code Available | 1 |
| ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind | May 29, 2025 | | Code Available | 1 |
| Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles | May 29, 2025 | Reinforcement Learning (RL) | Code Available | 1 |
| URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration | May 29, 2025 | Deblurring, Image Enhancement | Code Available | 1 |
| DenoiseRotator: Enhance Pruning Robustness for LLMs via Importance Concentration | May 29, 2025 | | Code Available | 1 |
| Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting | May 29, 2025 | 3D Scene Reconstruction, GPU | Code Available | 1 |
| SAMamba: Adaptive State Space Modeling with Hierarchical Vision for Infrared Small Target Detection | May 29, 2025 | Domain Adaptation, feature selection | Code Available | 1 |
| To Trust Or Not To Trust Your Vision-Language Model's Prediction | May 29, 2025 | Transfer Learning | Code Available | 1 |
| FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing | May 29, 2025 | | Code Available | 1 |
| Model Immunization from a Condition Number Perspective | May 29, 2025 | model | Code Available | 1 |
| The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning | May 29, 2025 | Federated Learning | Code Available | 1 |
| PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling | May 29, 2025 | Video Understanding | Code Available | 1 |
| Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing | May 29, 2025 | Optical Flow Estimation, Video Editing | Code Available | 1 |
| Sentinel: Attention Probing of Proxy Models for LLM Context Compression with an Understanding Perspective | May 29, 2025 | Decoder, RAG | Code Available | 1 |
| Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models | May 29, 2025 | 2k, 4k | Code Available | 1 |
| Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation | May 29, 2025 | Motion Generation | Code Available | 1 |
| AnchorAttention: Difference-Aware Sparse Attention with Stripe Granularity | May 29, 2025 | | Code Available | 1 |
| Neural Interpretable PDEs: Harmonizing Fourier Insights with Attention for Scalable and Interpretable Physics Discovery | May 29, 2025 | Computational Efficiency | Code Available | 1 |
| MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation | May 29, 2025 | Motion Generation, Video Generation | Code Available | 1 |
| K^2VAE: A Koopman-Kalman Enhanced Variational AutoEncoder for Probabilistic Time Series Forecasting | May 29, 2025 | Decision Making, Probabilistic Time Series Forecasting | Code Available | 1 |
| Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering | May 29, 2025 | Reinforcement Learning (RL) | Code Available | 1 |
| DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers | May 29, 2025 | Metric Learning, parameter-efficient fine-tuning | Code Available | 1 |
| VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? | May 29, 2025 | Video Understanding | Code Available | 1 |
| Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning | May 29, 2025 | Diagnostic, Question Answering | Code Available | 1 |
| Improving the Effective Receptive Field of Message-Passing Neural Networks | May 29, 2025 | Graph Classification, Graph Regression | Code Available | 1 |
| Table-R1: Inference-Time Scaling for Table Reasoning | May 29, 2025 | Fact Verification | Code Available | 1 |
| Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start | May 28, 2025 | Math, Multimodal Reasoning | Code Available | 1 |
| Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation | May 28, 2025 | image-classification, Image Classification | Code Available | 1 |
| Neuromorphic Sequential Arena: A Benchmark for Neuromorphic Temporal Processing | May 28, 2025 | | Code Available | 1 |
| RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments | May 28, 2025 | Benchmarking, Red Teaming | Code Available | 1 |
| Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning | May 28, 2025 | Federated Learning | Code Available | 1 |
| LoKI: Low-damage Knowledge Implanting of Large Language Models | May 28, 2025 | parameter-efficient fine-tuning | Code Available | 1 |
| VidText: Towards Comprehensive Evaluation for Video Text Understanding | May 28, 2025 | Multimodal Reasoning, Optical Character Recognition (OCR) | Code Available | 1 |
| Analysis and Evaluation of Synthetic Data Generation in Speech Dysfluency Detection | May 28, 2025 | Diversity, Synthetic Data Generation | Code Available | 1 |
| Fast Isotropic Median Filtering | May 28, 2025 | All, image smoothing | Code Available | 1 |
| UniTalk: Towards Universal Active Speaker Detection in Real World Scenarios | May 28, 2025 | Active Speaker Detection | Code Available | 1 |
| Measuring Sycophancy of Language Models in Multi-turn Dialogues | May 28, 2025 | | Code Available | 1 |
| Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design | May 28, 2025 | GPU, Quantization | Code Available | 1 |
| CSI-Bench: A Large-Scale In-the-Wild Dataset for Multi-task WiFi Sensing | May 28, 2025 | Multi-Task Learning, Privacy Preserving | Code Available | 1 |