Rendering-Aware Reinforcement Learning for Vector Graphics Generation | May 27, 2025 | Code Generation, reinforcement-learning
— | Unverified | 0 | Breaking the Performance Ceiling in Complex Reinforcement Learning requires Inference Strategies | May 27, 2025 | Protein Design, Reinforcement Learning (RL)
— | Unverified | 0 | Interactive OT Gym: A Reinforcement Learning-Based Interactive Optical tweezer (OT)-Driven Microrobotics Simulation Platform | May 27, 2025 | Reinforcement Learning (RL)
— | Unverified | 0 | Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning | May 26, 2025 | Denoising, reinforcement-learning
Code | Code Available | 0 | VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning | May 26, 2025 | Large Language Model, Reinforcement Learning (RL)
— | Unverified | 0 | MedDreamer: Model-Based Reinforcement Learning with Latent Imagination on Complex EHRs for Clinical Decision Support | May 26, 2025 | Imputation, Model-based Reinforcement Learning
— | Unverified | 0 | Fox in the Henhouse: Supply-Chain Backdoor Attacks Against Reinforcement Learning | May 26, 2025 | Reinforcement Learning (RL)
— | Unverified | 0 | Interleaved Reasoning for Large Language Models via Reinforcement Learning | May 26, 2025 | Logical Reasoning, Math
— | Unverified | 0 | Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition | May 26, 2025 | Math, Reinforcement Learning (RL)
— | Unverified | 0 | Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback | May 26, 2025 | reinforcement-learning, Reinforcement Learning
— | Unverified | 0 | Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model | May 26, 2025 | Diagnostic, Reinforcement Learning (RL)
Code | Code Available | 0 | Incentivizing Reasoning from Weak Supervision | May 26, 2025 | reinforcement-learning, Reinforcement Learning
Code | Code Available | 0 | TeViR: Text-to-Video Reward with Diffusion Models for Efficient Reinforcement Learning | May 26, 2025 | reinforcement-learning, Reinforcement Learning
— | Unverified | 0 | Surrogate-Assisted Evolutionary Reinforcement Learning Based on Autoencoder and Hyperbolic Neural Network | May 26, 2025 | Evolutionary Algorithms, MuJoCo
— | Unverified | 0 | What Can RL Bring to VLA Generalization? An Empirical Study | May 26, 2025 | Reinforcement Learning (RL), Vision-Language-Action
— | Unverified | 0 | DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning | May 26, 2025 | Efficient Exploration, reinforcement-learning
Code | Code Available | 0 | MT^3: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning | May 26, 2025 | document understanding, Machine Translation
— | Unverified | 0 | Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL | May 26, 2025 | D4RL, Offline RL
Code | Code Available | 0 | VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Gudied Iterative Policy Optimization | May 25, 2025 | Reinforcement Learning (RL)
Code | Code Available | 0 | Semi-pessimistic Reinforcement Learning | May 25, 2025 | reinforcement-learning, Reinforcement Learning
— | Unverified | 0 | FedORA: Resource Allocation for Federated Learning in ORAN using Radio Intelligent Controllers | May 25, 2025 | Federated Learning, Reinforcement Learning (RL)
— | Unverified | 0 | A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning | May 25, 2025 | Reinforcement Learning (RL)
Code | Code Available | 0 | Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning | May 25, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL)
— | Unverified | 0 | Reinforced Latent Reasoning for LLM-based Recommendation | May 25, 2025 | Recommendation Systems, Reinforcement Learning (RL)
— | Unverified | 0 | TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis | May 25, 2025 | CPU, GPU
— | Unverified | 0 | The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training | May 25, 2025 | Reinforcement Learning (RL), Token Reduction
— | Unverified | 0 | Hybrid Latent Reasoning via Reinforcement Learning | May 24, 2025 | reinforcement-learning, Reinforcement Learning
Code | Code Available | 0 | GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning | May 24, 2025 | GPU, Offline RL
— | Unverified | 0 | Steering LLM Reasoning Through Bias-Only Adaptation | May 24, 2025 | GSM8K, Math
— | Unverified | 0 | G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning | May 24, 2025 | Link Prediction, Node Classification
— | Unverified | 0 | AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting | May 24, 2025 | GSM8K, Reinforcement Learning (RL)
Code | Code Available | 0 | On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization | May 24, 2025 | Math, Reinforcement Learning (RL)
— | Unverified | 0 | Guided by Guardrails: Control Barrier Functions as Safety Instructors for Robotic Learning | May 24, 2025 | Reinforcement Learning (RL)
— | Unverified | 0 | One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion | May 24, 2025 | Humanoid Control, Motion Synthesis
— | Unverified | 0 | Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models | May 24, 2025 | Reinforcement Learning (RL)
Code | Code Available | 0 | WiNGPT-3.0 Technical Report | May 23, 2025 | Diagnostic, MedQA
Code | Code Available | 0 | Alignment and Safety of Diffusion Models via Reinforcement Learning and Reward Modeling: A Survey | May 23, 2025 | Active Learning, Reinforcement Learning (RL)
— | Unverified | 0 | Diffusion Self-Weighted Guidance for Offline Reinforcement Learning | May 23, 2025 | Offline RL, reinforcement-learning
— | Unverified | 0 | One RL to See Them All: Visual Triple Unified Reinforcement Learning | May 23, 2025 | All, Math
— | Unverified | 0 | Reinforcement Speculative Decoding for Fast Ranking | May 23, 2025 | Information Retrieval, Recommendation Systems
— | Unverified | 0 | Thinking Fast and Right: Balancing Accuracy and Reasoning Length with Adaptive Rewards | May 23, 2025 | Reinforcement Learning (RL)
Code | Code Available | 0 | Divide-Fuse-Conquer: Eliciting "Aha Moments" in Multi-Scenario Games | May 22, 2025 | Reinforcement Learning (RL)
— | Unverified | 0 | Backdoors in DRL: Four Environments Focusing on In-distribution Triggers | May 22, 2025 | Backdoor Attack, Data Poisoning
— | Unverified | 0 | VL-SAFE: Vision-Language Guided Safety-Aware Reinforcement Learning with World Models for Autonomous Driving | May 22, 2025 | Autonomous Driving, Reinforcement Learning (RL)
— | Unverified | 0 | Reward-Aware Proto-Representations in Reinforcement Learning | May 22, 2025 | reinforcement-learning, Reinforcement Learning
— | Unverified | 0 | RAP: Runtime-Adaptive Pruning for LLM Inference | May 22, 2025 | Reinforcement Learning (RL)
— | Unverified | 0 | Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning | May 22, 2025 | Reinforcement Learning (RL)
— | Unverified | 0 | Find the Fruit: Designing a Zero-Shot Sim2Real Deep RL Planner for Occlusion Aware Plant Manipulation | May 22, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL)
— | Unverified | 0 | Control of Renewable Energy Communities using AI and Real-World Data | May 22, 2025 | Data Integration, energy management
— | Unverified | 0 | AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning | May 22, 2025 | Math, reinforcement-learning
— | Unverified | 0