Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling | Jan 20, 2025 | Imitation Learning, Language Modeling | Code Available 2
Improving thermal state preparation of Sachdev-Ye-Kitaev model with reinforcement learning on quantum hardware | Jan 20, 2025 | Reinforcement Learning (RL) | Code Available 0
RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems? | Jan 20, 2025 | Math, Reinforcement Learning (RL) | Unverified 0
GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation | Jan 19, 2025 | Bug fixing, Code Completion | Code Available 0
Solving Finite-Horizon MDPs via Low-Rank Tensors | Jan 17, 2025 | Reinforcement Learning (RL) | Unverified 0
PixelBrax: Learning Continuous Control from Pixels End-to-End on the GPU | Jan 16, 2025 | Benchmarking, continuous-control | Code Available 0
From Explainability to Interpretability: Interpretable Policies in Reinforcement Learning Via Model Explanation | Jan 16, 2025 | Decision Making, Deep Reinforcement Learning | Unverified 0
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models | Jan 16, 2025 | Reinforcement Learning (RL) | Unverified 0
RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection | Jan 16, 2025 | Autonomous Driving, Object | Unverified 0
Average-Reward Reinforcement Learning with Entropy Regularization | Jan 15, 2025 | reinforcement-learning, Reinforcement Learning | Unverified 0
Reinforcement Learning-Enhanced Procedural Generation for Dynamic Narrative-Driven AR Experiences | Jan 15, 2025 | reinforcement-learning, Reinforcement Learning | Unverified 0
Projection Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning | Jan 15, 2025 | D4RL, Q-Learning | Unverified 0
Decision Transformers for RIS-Assisted Systems with Diffusion Model-Based Channel Acquisition | Jan 14, 2025 | Denoising, Imputation | Unverified 0
CHEQ-ing the Box: Safe Variable Impedance Learning for Robotic Polishing | Jan 14, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Code Available 0
FDPP: Fine-tune Diffusion Policy with Human Preference | Jan 14, 2025 | Imitation Learning, Reinforcement Learning (RL) | Unverified 0
Hybrid Action Based Reinforcement Learning for Multi-Objective Compatible Autonomous Driving | Jan 14, 2025 | Attribute, Autonomous Driving | Unverified 0
Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data | Jan 13, 2025 | Imitation Learning, MuJoCo | Code Available 0
Future-Conditioned Recommendations with Multi-Objective Controllable Decision Transformer | Jan 13, 2025 | Recommendation Systems, Reinforcement Learning (RL) | Unverified 0
Combining LLM decision and RL action selection to improve RL policy for adaptive interventions | Jan 13, 2025 | Reinforcement Learning (RL) | Unverified 0
RbRL2.0: Integrated Reward and Policy Learning for Rating-based Reinforcement Learning | Jan 13, 2025 | Decision Making, Reinforcement Learning (RL) | Unverified 0
Average Reward Reinforcement Learning for Wireless Radio Resource Management | Jan 12, 2025 | Management, reinforcement-learning | Unverified 0
DRDT3: Diffusion-Refined Decision Test-Time Training Model | Jan 12, 2025 | D4RL, Offline RL | Unverified 0
Pareto Set Learning for Multi-Objective Reinforcement Learning | Jan 12, 2025 | Decision Making, Multi-Objective Reinforcement Learning | Unverified 0
An Empirical Study of Deep Reinforcement Learning in Continuing Tasks | Jan 12, 2025 | Deep Reinforcement Learning, MuJoCo | Code Available 0
Hierarchical Reinforcement Learning for Optimal Agent Grouping in Cooperative Systems | Jan 11, 2025 | Decision Making, Hierarchical Reinforcement Learning | Unverified 0
AlgoPilot: Fully Autonomous Program Synthesis Without Human-Written Programs | Jan 11, 2025 | Language Modeling, Language Modelling | Unverified 0
A Hybrid Framework for Reinsurance Optimization: Integrating Generative Models and Reinforcement Learning | Jan 11, 2025 | Computational Efficiency, reinforcement-learning | Code Available 0
Smart Imitator: Learning from Imperfect Clinical Decisions | Jan 10, 2025 | Imitation Learning, Reinforcement Learning (RL) | Code Available 0
Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing | Jan 10, 2025 | Causal Inference, counterfactual | Unverified 0
From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training | Jan 10, 2025 | Reinforcement Learning (RL) | Code Available 1
Real-Time Integrated Dispatching and Idle Fleet Steering with Deep Reinforcement Learning for A Meal Delivery Platform | Jan 10, 2025 | Deep Reinforcement Learning, Fairness | Unverified 0
Investigating the Impact of Observation Space Design Choices On Training Reinforcement Learning Solutions for Spacecraft Problems | Jan 10, 2025 | Reinforcement Learning (RL) | Unverified 0
Diffusion Models for Smarter UAVs: Decision-Making and Modeling | Jan 10, 2025 | Decision Making, Reinforcement Learning (RL) | Unverified 0
LearningFlow: Automated Policy Learning Workflow for Urban Driving with Large Language Models | Jan 9, 2025 | Autonomous Driving, Large Language Model | Unverified 0
Deep Transfer Q-Learning for Offline Non-Stationary Reinforcement Learning | Jan 8, 2025 | Decision Making, Inductive Learning | Unverified 0
Safe Reinforcement Learning with Minimal Supervision | Jan 8, 2025 | reinforcement-learning, Reinforcement Learning | Unverified 0
Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning | Jan 8, 2025 | Policy Gradient Methods, Reinforcement Learning (RL) | Code Available 0
Risk-averse policies for natural gas futures trading using distributional reinforcement learning | Jan 8, 2025 | Distributional Reinforcement Learning, energy trading | Unverified 0
Run-and-tumble chemotaxis using reinforcement learning | Jan 7, 2025 | reinforcement-learning, Reinforcement Learning | Unverified 0
Explainable Reinforcement Learning via Temporal Policy Decomposition | Jan 7, 2025 | reinforcement-learning, Reinforcement Learning | Unverified 0
Digital Twin Aided Channel Estimation: Zone-Specific Subspace Prediction and Calibration | Jan 6, 2025 | Reinforcement Learning (RL) | Code Available 0
Learn A Flexible Exploration Model for Parameterized Action Markov Decision Processes | Jan 6, 2025 | Reinforcement Learning (RL) | Unverified 0
Interpretable Recognition of Fused Magnesium Furnace Working Conditions with Deep Convolutional Stochastic Configuration Networks | Jan 6, 2025 | Reinforcement Learning (RL) | Unverified 0
Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies | Jan 6, 2025 | Decision Making, Deep Reinforcement Learning | Code Available 1
Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots | Jan 6, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Code Available 2
AMM: Adaptive Modularized Reinforcement Model for Multi-city Traffic Signal Control | Jan 5, 2025 | Domain Adaptation, Meta-Learning | Unverified 0
Representation Convergence: Mutual Distillation is Secretly a Form of Regularization | Jan 5, 2025 | Deep Reinforcement Learning, Form | Code Available 0
A New Interpretation of the Certainty-Equivalence Approach for PAC Reinforcement Learning with a Generative Model | Jan 5, 2025 | Reinforcement Learning (RL) | Unverified 0
SR-Reward: Taking The Path More Traveled | Jan 4, 2025 | D4RL, Imitation Learning | Unverified 0
On the Statistical Complexity for Offline and Low-Adaptive Reinforcement Learning with Structures | Jan 3, 2025 | Offline RL, Reinforcement Learning (RL) | Unverified 0