Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning (Apr 18, 2025) [tags: All; GSM8K]
Improving Generalization in Intent Detection: GRPO with Reward-Based Curriculum Sampling (Apr 18, 2025) [tags: Intent Detection; Reinforcement Learning (RL)]
SwitchMT: An Adaptive Context Switching Methodology for Scalable Multi-Task Learning in Intelligent Autonomous Agents (Apr 18, 2025) [tags: Atari Games; Multi-Task Learning]
Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration (Apr 17, 2025) [tags: Data Augmentation; Human-Object Interaction Detection]
RL-PINNs: Reinforcement Learning-Driven Adaptive Sampling for Efficient Training of PINNs (Apr 17, 2025) [tags: Reinforcement Learning (RL)]
TraCeS: Trajectory Based Credit Assignment From Sparse Safety Feedback (Apr 17, 2025) [tags: continuous-control; Continuous Control]
LLMs Meet Finance: Fine-Tuning Foundation Models for the Open FinLLM Leaderboard (Apr 17, 2025) [tags: Reinforcement Learning (RL)]
Evolutionary Policy Optimization (Apr 17, 2025) [tags: Policy Gradient Methods; Reinforcement Learning (RL)]
Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime (Apr 16, 2025) [tags: Reinforcement Learning (RL)]
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning (Apr 16, 2025) [tags: Language Modeling; Language Modelling] [code available]
pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild (Apr 16, 2025) [tags: Benchmarking; object-detection]
ToolRL: Reward is All Tool Learning Needs (Apr 16, 2025) [tags: All; Reinforcement Learning (RL)]
Evolutionary Reinforcement Learning for Interpretable Decision-Making in Supply Chain Management (Apr 16, 2025) [tags: Decision Making; Management] [code available]
VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning (Apr 16, 2025) [tags: D4RL; Offline RL]
Hallucination-Aware Generative Pretrained Transformer for Cooperative Aerial Mobility Control (Apr 15, 2025) [tags: Hallucination; Reinforcement Learning (RL)]
Achieving Tighter Finite-Time Rates for Heterogeneous Federated Stochastic Approximation under Markovian Sampling (Apr 15, 2025) [tags: Reinforcement Learning (RL)]
Revealing Covert Attention by Analyzing Human and Reinforcement Learning Agent Gameplay (Apr 15, 2025) [tags: Reinforcement Learning (RL)]
Position Paper: Rethinking Privacy in RL for Sequential Decision-making in the Age of LLMs (Apr 15, 2025) [tags: Autonomous Vehicles; Decision Making]
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs (Apr 15, 2025) [tags: Math; Mathematical Reasoning]
Next-Future: Sample-Efficient Policy Learning for Robotic-Arm Tasks (Apr 15, 2025) [tags: Multi-Goal Reinforcement Learning; Reinforcement Learning (RL)]
Data driven approach towards more efficient Newton-Raphson power flow calculation for distribution grids (Apr 15, 2025) [tags: Reinforcement Learning (RL)]
ReZero: Enhancing LLM search ability by trying one-more-time (Apr 15, 2025) [tags: Language Modeling; Language Modelling] [code available]
Adaptive Insurance Reserving with CVaR-Constrained Reinforcement Learning under Macroeconomic Regimes (Apr 13, 2025) [tags: Reinforcement Learning (RL)]
CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent (Apr 13, 2025) [tags: Large Language Model; Recommendation Systems]
Development of a PPO-Reinforcement Learned Walking Tripedal Soft-Legged Robot using SOFA (Apr 12, 2025) [tags: Reinforcement Learning (RL); Robot Navigation]
Efficient Implementation of Reinforcement Learning over Homomorphic Encryption (Apr 12, 2025) [tags: Privacy Preserving; reinforcement-learning] [code available]
Towards More Efficient, Robust, Instance-adaptive, and Generalizable Sequential Decision making (Apr 12, 2025) [tags: Decision Making; Decision Making Under Uncertainty]
Towards Optimal Differentially Private Regret Bounds in Linear MDPs (Apr 12, 2025) [tags: Offline RL; Reinforcement Learning (RL)]
Spectral Normalization for Lipschitz-Constrained Policies on Learning Humanoid Locomotion (Apr 11, 2025) [tags: GPU; Reinforcement Learning (RL)]
Optimizing Power Grid Topologies with Reinforcement Learning: A Survey of Methods and Challenges (Apr 11, 2025) [tags: Decision Making; Reinforcement Learning (RL)]
Deep Distributional Learning with Non-crossing Quantile Network (Apr 11, 2025) [tags: Distributional Reinforcement Learning; quantile regression] [code available]
Deep Reinforcement Learning for Day-to-day Dynamic Tolling in Tradable Credit Schemes (Apr 10, 2025) [tags: Bayesian Optimization; Computational Efficiency]
Genetic Programming with Reinforcement Learning Trained Transformer for Real-World Dynamic Scheduling Problems (Apr 10, 2025) [tags: Reinforcement Learning (RL); Scheduling]
RL-based Control of UAS Subject to Significant Disturbance (Apr 10, 2025) [tags: Position; Reinforcement Learning (RL)]
Fast Adaptation with Behavioral Foundation Models (Apr 10, 2025) [tags: Reinforcement Learning (RL)]
Boosting Universal LLM Reward Design through the Heuristic Reward Observation Space Evolution (Apr 10, 2025) [tags: Code Generation; Reinforcement Learning (RL)]
Better Decisions through the Right Causal World Model (Apr 9, 2025) [tags: Causal Inference; Model extraction]
Trust-Region Twisted Policy Improvement (Apr 8, 2025) [tags: Deep Reinforcement Learning; Reinforcement Learning (RL)]
xMTF: A Formula-Free Model for Reinforcement-Learning-Based Multi-Task Fusion in Recommender Systems (Apr 8, 2025) [tags: Multi-Task Learning; Recommendation Systems] [code available]
Smart Exploration in Reinforcement Learning using Bounded Uncertainty Models (Apr 8, 2025) [tags: reinforcement-learning; Reinforcement Learning]
Stratified Expert Cloning with Adaptive Selection for User Retention in Large-Scale Recommender Systems (Apr 8, 2025) [tags: Imitation Learning; Recommendation Systems]
TW-CRL: Time-Weighted Contrastive Reward Learning for Efficient Inverse Reinforcement Learning (Apr 8, 2025) [tags: reinforcement-learning; Reinforcement Learning]
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning (Apr 7, 2025) [tags: Combinatorial Optimization; reinforcement-learning]
Physics-informed Modularized Neural Network for Advanced Building Control by Deep Reinforcement Learning (Apr 7, 2025) [tags: Deep Reinforcement Learning; Physics-informed machine learning]
The Role of Environment Access in Agnostic Reinforcement Learning (Apr 7, 2025) [tags: reinforcement-learning; Reinforcement Learning]
Impact of Price Inflation on Algorithmic Collusion Through Reinforcement Learning Agents (Apr 5, 2025) [tags: Reinforcement Learning (RL)]
OrbitZoo: Multi-Agent Reinforcement Learning Environment for Orbital Dynamics (Apr 5, 2025) [tags: Collision Avoidance; Multi-agent Reinforcement Learning]
Decision SpikeFormer: Spike-Driven Transformer for Decision Making (Apr 4, 2025) [tags: D4RL; Decision Making]
Learning Dual-Arm Coordination for Grasping Large Flat Objects (Apr 4, 2025) [tags: Deep Reinforcement Learning; reinforcement-learning]
Improving Mixed-Criticality Scheduling with Reinforcement Learning (Apr 4, 2025) [tags: reinforcement-learning; Reinforcement Learning]