A Generalised and Adaptable Reinforcement Learning Stopping Method | May 3, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 0
Stabilizing Temporal Difference Learning via Implicit Stochastic Recursion | May 2, 2025 | Computational Efficiency, Off-policy evaluation | Unverified | 0
A General Approach of Automated Environment Design for Learning the Optimal Power Flow | May 1, 2025 | Hyperparameter Optimization, Reinforcement Learning (RL) | Unverified | 0
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation | May 1, 2025 | Hallucination, Navigate | Code Available | 0
MULE: Multi-terrain and Unknown Load Adaptation for Effective Quadrupedal Locomotion | May 1, 2025 | Model Predictive Control, Reinforcement Learning (RL) | Unverified | 0
Leveraging Partial SMILES Validation Scheme for Enhanced Drug Design in Reinforcement Learning Frameworks | May 1, 2025 | Deep Reinforcement Learning, Drug Design | Unverified | 0
Directly Forecasting Belief for Reinforcement Learning with Delays | May 1, 2025 | D4RL, MuJoCo | Code Available | 0
Implicit Neural-Representation Learning for Elastic Deformable-Object Manipulations | May 1, 2025 | Deformable Object Manipulation, Reinforcement Learning (RL) | Unverified | 0
Adaptive 3D UI Placement in Mixed Reality Using Deep Reinforcement Learning | Apr 30, 2025 | Deep Reinforcement Learning, Mixed Reality | Unverified | 0
Phi-4-reasoning Technical Report | Apr 30, 2025 | Math, Reinforcement Learning (RL) | Unverified | 0
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models | Apr 30, 2025 | Multimodal Reasoning, Reinforcement Learning (RL) | Unverified | 0
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math | Apr 30, 2025 | Math, Reinforcement Learning (RL) | Unverified | 0
Enhancing New-item Fairness in Dynamic Recommender Systems | Apr 30, 2025 | Fairness, Knowledge Distillation | Code Available | 0
PRISM: Projection-based Reward Integration for Scene-Aware Real-to-Sim-to-Real Transfer with Few Demonstrations | Apr 29, 2025 | Imitation Learning, Reinforcement Learning (RL) | Unverified | 0
A Survey on GUI Agents with Foundation Models Enhanced by Reinforcement Learning | Apr 29, 2025 | Action Generation, Prompt Engineering | Unverified | 0
Token-Efficient RL for LLM Reasoning | Apr 29, 2025 | Policy Gradient Methods, Reinforcement Learning (RL) | Unverified | 0
AI Recommendation Systems for Lane-Changing Using Adherence-Aware Reinforcement Learning | Apr 28, 2025 | Autonomous Driving, Recommendation Systems | Unverified | 0
Reinforcement Learning-Based Heterogeneous Multi-Task Optimization in Semantic Broadcast Communications | Apr 28, 2025 | image-classification, Image Classification | Unverified | 0
Interactive Double Deep Q-network: Integrating Human Interventions and Evaluative Predictions in Reinforcement Learning of Autonomous Driving | Apr 28, 2025 | Autonomous Driving, Q-Learning | Unverified | 0
An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination | Apr 28, 2025 | Code Generation, Hallucination | Unverified | 0
LLMs for Engineering: Teaching Models to Design High Powered Rockets | Apr 27, 2025 | Reinforcement Learning (RL) | Unverified | 0
BQSched: A Non-intrusive Scheduler for Batch Concurrent Queries via Reinforcement Learning | Apr 27, 2025 | Reinforcement Learning (RL), Scheduling | Code Available | 0
Depth-Constrained ASV Navigation with Deep RL and Limited Sensing | Apr 25, 2025 | Decision Making, Reinforcement Learning (RL) | Unverified | 0
Explainable AI for UAV Mobility Management: A Deep Q-Network Approach for Handover Minimization | Apr 25, 2025 | Management, Reinforcement Learning (RL) | Unverified | 0
LLM-hRIC: LLM-empowered Hierarchical RAN Intelligent Control for O-RAN | Apr 25, 2025 | Management, Reinforcement Learning (RL) | Unverified | 0
SAPO-RL: Sequential Actuator Placement Optimization for Fuselage Assembly via Reinforcement Learning | Apr 24, 2025 | Decision Making, Q-Learning | Unverified | 0
Integrating Learning-Based Manipulation and Physics-Based Locomotion for Whole-Body Badminton Robot Control | Apr 24, 2025 | Imitation Learning, Reinforcement Learning (RL) | Unverified | 0
Training Large Language Models to Reason via EM Policy Gradient | Apr 24, 2025 | GSM8K, Math | Unverified | 0
Reinforcement learning framework for the mechanical design of microelectronic components under multiphysics constraints | Apr 23, 2025 | global-optimization, reinforcement-learning | Unverified | 0
Data-Assimilated Model-Based Reinforcement Learning for Partially Observed Chaotic Flows | Apr 23, 2025 | Model-based Reinforcement Learning, Reinforcement Learning (RL) | Unverified | 0
Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator | Apr 23, 2025 | Offline RL, Reinforcement Learning (RL) | Unverified | 0
Natural Policy Gradient for Average Reward Non-Stationary RL | Apr 23, 2025 | Reinforcement Learning (RL) | Unverified | 0
Monte Carlo Planning with Large Language Model for Text-Based Game Agents | Apr 23, 2025 | Language Modeling, Language Modelling | Unverified | 0
Hybrid Reinforcement Learning and Model Predictive Control for Adaptive Control of Hydrogen-Diesel Dual-Fuel Combustion | Apr 23, 2025 | Model Predictive Control, Reinforcement Learning (RL) | Unverified | 0
Insights from Verification: Training a Verilog Generation LLM with Reinforcement Learning with Testbench Feedback | Apr 22, 2025 | Code Generation, Hallucination | Unverified | 0
Real-Time Optimal Design of Experiment for Parameter Identification of Li-Ion Cell Electrochemical Model | Apr 22, 2025 | Experimental Design, Reinforcement Learning (RL) | Unverified | 0
StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation | Apr 22, 2025 | Reinforcement Learning (RL), Scheduling | Unverified | 0
SARI: Structured Audio Reasoning via Curriculum-Guided Reinforcement Learning | Apr 22, 2025 | Multiple-choice, reinforcement-learning | Unverified | 0
SLiM-Gym: Reinforcement Learning for Population Genetics | Apr 22, 2025 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Policy-Based Radiative Transfer: Solving the 2-Level Atom Non-LTE Problem using Soft Actor-Critic Reinforcement Learning | Apr 22, 2025 | Reinforcement Learning (RL) | Unverified | 0
LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0
Dynamic Contrastive Skill Learning with State-Transition Based Skill Clustering and Dynamic Length Adjustment | Apr 21, 2025 | Contrastive Learning, Decision Making | Unverified | 0
Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL | Apr 21, 2025 | Reinforcement Learning (RL), Zero-Shot Learning | Unverified | 0
OTC: Optimal Tool Calls via Reinforcement Learning | Apr 21, 2025 | Math, reinforcement-learning | Unverified | 0
Relation-R1: Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relational Comprehension | Apr 20, 2025 | Graph Generation, Reinforcement Learning (RL) | Unverified | 0
Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning | Apr 19, 2025 | Computational Efficiency, Q-Learning | Unverified | 0
Quantum-Enhanced Reinforcement Learning for Power Grid Security Assessment | Apr 19, 2025 | Computational Efficiency, Navigate | Unverified | 0
Unlearning Works Better Than You Think: Local Reinforcement-Based Selection of Auxiliary Objectives | Apr 19, 2025 | Reinforcement Learning (RL) | Unverified | 0
Improving RL Exploration for LLM Reasoning through Retrospective Replay | Apr 19, 2025 | Code Generation, Mathematical Reasoning | Unverified | 0
SwitchMT: An Adaptive Context Switching Methodology for Scalable Multi-Task Learning in Intelligent Autonomous Agents | Apr 18, 2025 | Atari Games, Multi-Task Learning | Unverified | 0