CTSAC: Curriculum-Based Transformer Soft Actor-Critic for Goal-Oriented Robot Exploration | Mar 18, 2025 | reinforcement-learning Reinforcement Learning | Unverified | 0
Pauli Network Circuit Synthesis with Reinforcement Learning | Mar 18, 2025 | reinforcement-learning Reinforcement Learning | Unverified | 0
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation | Mar 17, 2025 | Mathematical Reasoning Reinforcement Learning (RL) | Code Available | 1
Synchronous vs Asynchronous Reinforcement Learning in a Real World Robot | Mar 17, 2025 | Decision Making Reinforcement Learning (RL) | Unverified | 0
APF+: Boosting adaptive-potential function reinforcement learning methods with a W-shaped network for high-dimensional games | Mar 17, 2025 | Atari Games Q-Learning | Unverified | 0
A Reinforcement Learning-Driven Transformer GAN for Molecular Generation | Mar 17, 2025 | Drug Discovery reinforcement-learning | Unverified | 0
FLEX: A Framework for Learning Robot-Agnostic Force-based Skills Involving Sustained Contact Object Manipulation | Mar 17, 2025 | Imitation Learning Object | Unverified | 0
Dynamic Angle Selection in X-Ray CT: A Reinforcement Learning Approach to Optimal Stopping | Mar 16, 2025 | Computed Tomography (CT) Experimental Design | Unverified | 0
TERL: Large-Scale Multi-Target Encirclement Using Transformer-Enhanced Reinforcement Learning | Mar 16, 2025 | reinforcement-learning Reinforcement Learning | Code Available | 1
Evaluation-Time Policy Switching for Offline Reinforcement Learning | Mar 15, 2025 | Behavioural cloning Offline RL | Unverified | 0
Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards | Mar 14, 2025 | Denoising Image Generation | Code Available | 2
Sketch-to-Skill: Bootstrapping Robot Learning with Human Drawn Trajectory Sketches | Mar 14, 2025 | Imitation Learning reinforcement-learning | Unverified | 0
Adaptive Torque Control of Exoskeletons under Spasticity Conditions via Reinforcement Learning | Mar 14, 2025 | Deep Reinforcement Learning Reinforcement Learning (RL) | Unverified | 0
Learning to reset in target search problems | Mar 14, 2025 | reinforcement-learning Reinforcement Learning | Code Available | 0
Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering | Mar 14, 2025 | Audio Question Answering Question Answering | Code Available | 3
Dynamic Obstacle Avoidance with Bounded Rationality Adversarial Reinforcement Learning | Mar 14, 2025 | Benchmarking Navigate | Unverified | 0
Exploring Competitive and Collusive Behaviors in Algorithmic Pricing with Deep Reinforcement Learning | Mar 14, 2025 | Deep Reinforcement Learning Q-Learning | Unverified | 0
Reinforcement Learning-Based Controlled Switching Approach for Inrush Current Minimization in Power Transformers | Mar 14, 2025 | Reinforcement Learning (RL) | Unverified | 0
Scalable Evaluation of Online Facilitation Strategies via Synthetic Simulation of Discussions | Mar 13, 2025 | Reinforcement Learning (RL) | Code Available | 0
Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model | Mar 13, 2025 | Language Modeling Language Modelling | Unverified | 0
SortingEnv: An Extendable RL-Environment for an Industrial Sorting Process | Mar 13, 2025 | Reinforcement Learning (RL) | Unverified | 0
Safe Continual Domain Adaptation after Sim2Real Transfer of Reinforcement Learning Policies in Robotics | Mar 13, 2025 | Continual Learning Domain Adaptation | Unverified | 0
SySLLM: Generating Synthesized Policy Summaries for Reinforcement Learning Agents Using Large Language Models | Mar 13, 2025 | Reinforcement Learning (RL) World Knowledge | Unverified | 0
H2-MARL: Multi-Agent Reinforcement Learning for Pareto Optimality in Hospital Capacity Strain and Human Mobility during Epidemic | Mar 13, 2025 | Multi-agent Reinforcement Learning Reinforcement Learning (RL) | Unverified | 0
DeepSeek-Inspired Exploration of RL-based LLMs and Synergy with Wireless Networks: A Survey | Mar 13, 2025 | Edge-computing Intelligent Communication | Unverified | 0
NIL: No-data Imitation Learning by Leveraging Pre-trained Video Diffusion Models | Mar 13, 2025 | Imitation Learning Reinforcement Learning (RL) | Unverified | 0
Solving Bayesian inverse problems with diffusion priors and off-policy RL | Mar 12, 2025 | Reinforcement Learning (RL) | Unverified | 0
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning | Mar 12, 2025 | Question Answering RAG | Code Available | 7
Optimisation of the Accelerator Control by Reinforcement Learning: A Simulation-Based Approach | Mar 12, 2025 | Reinforcement Learning (RL) | Unverified | 0
Local Look-Ahead Guidance via Verifier-in-the-Loop for Automated Theorem Proving | Mar 12, 2025 | Automated Theorem Proving Reinforcement Learning (RL) | Unverified | 0
Edge AI-Powered Real-Time Decision-Making for Autonomous Vehicles in Adverse Weather Conditions | Mar 12, 2025 | Autonomous Navigation Autonomous Vehicles | Unverified | 0
Large-scale Regional Traffic Signal Control Based on Single-Agent Reinforcement Learning | Mar 12, 2025 | Reinforcement Learning (RL) Traffic Signal Control | Unverified | 0
Unified Locomotion Transformer with Simultaneous Sim-to-Real Transfer for Quadrupeds | Mar 12, 2025 | Deep Reinforcement Learning Knowledge Distillation | Unverified | 0
MarineGym: A High-Performance Reinforcement Learning Platform for Underwater Robotics | Mar 12, 2025 | Benchmarking GPU | Unverified | 0
Evaluating Reinforcement Learning Safety and Trustworthiness in Cyber-Physical Systems | Mar 12, 2025 | reinforcement-learning Reinforcement Learning | Unverified | 0
Balancing SoC in Battery Cells using Safe Action Perturbations | Mar 11, 2025 | Deep Reinforcement Learning Reinforcement Learning (RL) | Unverified | 0
In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents | Mar 11, 2025 | Management Reinforcement Learning (RL) | Unverified | 0
Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning | Mar 11, 2025 | Disentanglement Reinforcement Learning (RL) | Unverified | 0
Near-Optimal Sample Complexity for Iterated CVaR Reinforcement Learning with a Generative Model | Mar 11, 2025 | Reinforcement Learning (RL) | Unverified | 0
A Cascading Cooperative Multi-agent Framework for On-ramp Merging Control Integrating Large Language Models | Mar 11, 2025 | Decision Making global-optimization | Unverified | 0
Zero-Shot Action Generalization with Limited Observations | Mar 11, 2025 | Decision Making Reinforcement Learning (RL) | Unverified | 0
HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents | Mar 11, 2025 | Navigate Reinforcement Learning (RL) | Unverified | 0
V-Max: A Reinforcement Learning Framework for Autonomous Driving | Mar 11, 2025 | Autonomous Driving Decision Making | Code Available | 2
Regulatory DNA sequence Design with Reinforcement Learning | Mar 11, 2025 | reinforcement-learning Reinforcement Learning | Code Available | 1
MoRE: Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models | Mar 11, 2025 | Large Language Model Mixture-of-Experts | Unverified | 0
Adaptive routing protocols for determining optimal paths in AI multi-agent systems: a priority- and learning-enhanced approach | Mar 10, 2025 | Reinforcement Learning (RL) | Unverified | 0
Efficient Neural Clause-Selection Reinforcement | Mar 10, 2025 | Automated Theorem Proving CPU | Unverified | 0
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL | Mar 10, 2025 | Logical Reasoning Multimodal Reasoning | Code Available | 4
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning | Mar 10, 2025 | Math Meta Reinforcement Learning | Unverified | 0
VisRL: Intention-Driven Visual Perception via Reinforced Reasoning | Mar 10, 2025 | Reinforcement Learning (RL) Visual Reasoning | Code Available | 1