R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning | May 5, 2025 | Reinforcement Learning (RL) | Code | Code Available | 3
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning | May 5, 2025 | Ensemble Learning, Large Language Model | Code | Code Available | 0
Automated Hybrid Reward Scheduling via Large Language Models for Robotic Skill Learning | May 5, 2025 | Reinforcement Learning (RL), Scheduling | - | Unverified | 0
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study | May 4, 2025 | Offline RL, Reinforcement Learning (RL) | - | Unverified | 0
Prompt-responsive Object Retrieval with Memory-augmented Student-Teacher Learning | May 4, 2025 | Reinforcement Learning (RL), Retrieval | - | Unverified | 0
A Generalised and Adaptable Reinforcement Learning Stopping Method | May 3, 2025 | reinforcement-learning, Reinforcement Learning | Code | Code Available | 0
World Model-Based Learning for Long-Term Age of Information Minimization in Vehicular Networks | May 3, 2025 | Reinforcement Learning (RL), Scheduling | - | Unverified | 0
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning | May 3, 2025 | D4RL, Offline RL | - | Unverified | 0
Stabilizing Temporal Difference Learning via Implicit Stochastic Recursion | May 2, 2025 | Computational Efficiency, Off-policy evaluation | - | Unverified | 0
Directly Forecasting Belief for Reinforcement Learning with Delays | May 1, 2025 | D4RL, MuJoCo | Code | Code Available | 0
A General Approach of Automated Environment Design for Learning the Optimal Power Flow | May 1, 2025 | Hyperparameter Optimization, Reinforcement Learning (RL) | - | Unverified | 0
Implicit Neural-Representation Learning for Elastic Deformable-Object Manipulations | May 1, 2025 | Deformable Object Manipulation, Reinforcement Learning (RL) | - | Unverified | 0
MULE: Multi-terrain and Unknown Load Adaptation for Effective Quadrupedal Locomotion | May 1, 2025 | Model Predictive Control, Reinforcement Learning (RL) | - | Unverified | 0
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation | May 1, 2025 | Hallucination, Navigate | Code | Code Available | 0
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | May 1, 2025 | Image Generation, Reinforcement Learning (RL) | Code | Code Available | 4
Leveraging Partial SMILES Validation Scheme for Enhanced Drug Design in Reinforcement Learning Frameworks | May 1, 2025 | Deep Reinforcement Learning, Drug Design | - | Unverified | 0
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models | Apr 30, 2025 | Multimodal Reasoning, Reinforcement Learning (RL) | - | Unverified | 0
Enhancing New-item Fairness in Dynamic Recommender Systems | Apr 30, 2025 | Fairness, Knowledge Distillation | Code | Code Available | 0
Phi-4-reasoning Technical Report | Apr 30, 2025 | Math, Reinforcement Learning (RL) | - | Unverified | 0
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math | Apr 30, 2025 | Math, Reinforcement Learning (RL) | - | Unverified | 0
Adaptive 3D UI Placement in Mixed Reality Using Deep Reinforcement Learning | Apr 30, 2025 | Deep Reinforcement Learning, Mixed Reality | - | Unverified | 0
A Survey on GUI Agents with Foundation Models Enhanced by Reinforcement Learning | Apr 29, 2025 | Action Generation, Prompt Engineering | - | Unverified | 0
PRISM: Projection-based Reward Integration for Scene-Aware Real-to-Sim-to-Real Transfer with Few Demonstrations | Apr 29, 2025 | Imitation Learning, Reinforcement Learning (RL) | - | Unverified | 0
Token-Efficient RL for LLM Reasoning | Apr 29, 2025 | Policy Gradient Methods, Reinforcement Learning (RL) | - | Unverified | 0
Reinforcement Learning-Based Heterogeneous Multi-Task Optimization in Semantic Broadcast Communications | Apr 28, 2025 | image-classification, Image Classification | - | Unverified | 0
AI Recommendation Systems for Lane-Changing Using Adherence-Aware Reinforcement Learning | Apr 28, 2025 | Autonomous Driving, Recommendation Systems | - | Unverified | 0
Rulebook: bringing co-routines to reinforcement learning environments | Apr 28, 2025 | reinforcement-learning, Reinforcement Learning | Code | Code Available | 2
Interactive Double Deep Q-network: Integrating Human Interventions and Evaluative Predictions in Reinforcement Learning of Autonomous Driving | Apr 28, 2025 | Autonomous Driving, Q-Learning | - | Unverified | 0
An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination | Apr 28, 2025 | Code Generation, Hallucination | - | Unverified | 0
LLMs for Engineering: Teaching Models to Design High Powered Rockets | Apr 27, 2025 | Reinforcement Learning (RL) | - | Unverified | 0
BQSched: A Non-intrusive Scheduler for Batch Concurrent Queries via Reinforcement Learning | Apr 27, 2025 | Reinforcement Learning (RL), Scheduling | Code | Code Available | 0
Neurophysiologically Realistic Environment for Comparing Adaptive Deep Brain Stimulation Algorithms in Parkinson Disease | Apr 26, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Code | Code Available | 1
Explainable AI for UAV Mobility Management: A Deep Q-Network Approach for Handover Minimization | Apr 25, 2025 | Management, Reinforcement Learning (RL) | - | Unverified | 0
Depth-Constrained ASV Navigation with Deep RL and Limited Sensing | Apr 25, 2025 | Decision Making, Reinforcement Learning (RL) | - | Unverified | 0
LLM-hRIC: LLM-empowered Hierarchical RAN Intelligent Control for O-RAN | Apr 25, 2025 | Management, Reinforcement Learning (RL) | - | Unverified | 0
CaRL: Learning Scalable Planning Policies with Simple Rewards | Apr 24, 2025 | Autonomous Driving, CARLA longest6 | Code | Code Available | 2
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning | Apr 24, 2025 | Decision Making, Reinforcement Learning (RL) | Code | Code Available | 7
Training Large Language Models to Reason via EM Policy Gradient | Apr 24, 2025 | GSM8K, Math | - | Unverified | 0
SAPO-RL: Sequential Actuator Placement Optimization for Fuselage Assembly via Reinforcement Learning | Apr 24, 2025 | Decision Making, Q-Learning | - | Unverified | 0
Integrating Learning-Based Manipulation and Physics-Based Locomotion for Whole-Body Badminton Robot Control | Apr 24, 2025 | Imitation Learning, Reinforcement Learning (RL) | - | Unverified | 0
Monte Carlo Planning with Large Language Model for Text-Based Game Agents | Apr 23, 2025 | Language Modeling, Language Modelling | - | Unverified | 0
Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator | Apr 23, 2025 | Offline RL, Reinforcement Learning (RL) | - | Unverified | 0
Data-Assimilated Model-Based Reinforcement Learning for Partially Observed Chaotic Flows | Apr 23, 2025 | Model-based Reinforcement Learning, Reinforcement Learning (RL) | - | Unverified | 0
Natural Policy Gradient for Average Reward Non-Stationary RL | Apr 23, 2025 | Reinforcement Learning (RL) | - | Unverified | 0
Hybrid Reinforcement Learning and Model Predictive Control for Adaptive Control of Hydrogen-Diesel Dual-Fuel Combustion | Apr 23, 2025 | Model Predictive Control, Reinforcement Learning (RL) | - | Unverified | 0
Reinforcement learning framework for the mechanical design of microelectronic components under multiphysics constraints | Apr 23, 2025 | global-optimization, reinforcement-learning | - | Unverified | 0
TTRL: Test-Time Reinforcement Learning | Apr 22, 2025 | Math, reinforcement-learning | Code | Code Available | 7
Tina: Tiny Reasoning Models via LoRA | Apr 22, 2025 | Reinforcement Learning (RL) | Code | Code Available | 3
SLiM-Gym: Reinforcement Learning for Population Genetics | Apr 22, 2025 | reinforcement-learning, Reinforcement Learning | - | Unverified | 0
Policy-Based Radiative Transfer: Solving the 2-Level Atom Non-LTE Problem using Soft Actor-Critic Reinforcement Learning | Apr 22, 2025 | Reinforcement Learning (RL) | - | Unverified | 0