DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training | Apr 13, 2025 | Reinforcement Learning (RL) | Code Available | 1
Adaptive Insurance Reserving with CVaR-Constrained Reinforcement Learning under Macroeconomic Regimes | Apr 13, 2025 | Reinforcement Learning (RL) | Unverified | 0
Towards More Efficient, Robust, Instance-adaptive, and Generalizable Sequential Decision making | Apr 12, 2025 | Decision Making, Decision Making Under Uncertainty | Unverified | 0
Development of a PPO-Reinforcement Learned Walking Tripedal Soft-Legged Robot using SOFA | Apr 12, 2025 | Reinforcement Learning (RL), Robot Navigation | Code Available | 0
Efficient Implementation of Reinforcement Learning over Homomorphic Encryption | Apr 12, 2025 | Privacy Preserving, reinforcement-learning | Unverified | 0
Towards Optimal Differentially Private Regret Bounds in Linear MDPs | Apr 12, 2025 | Offline RL, Reinforcement Learning (RL) | Unverified | 0
Spectral Normalization for Lipschitz-Constrained Policies on Learning Humanoid Locomotion | Apr 11, 2025 | GPU, Reinforcement Learning (RL) | Unverified | 0
Optimizing Power Grid Topologies with Reinforcement Learning: A Survey of Methods and Challenges | Apr 11, 2025 | Decision Making, Reinforcement Learning (RL) | Code Available | 0
Deep Distributional Learning with Non-crossing Quantile Network | Apr 11, 2025 | Distributional Reinforcement Learning, quantile regression | Unverified | 0
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models | Apr 10, 2025 | Reinforcement Learning (RL), Visual Reasoning | Code Available | 2
RL-based Control of UAS Subject to Significant Disturbance | Apr 10, 2025 | Position, Reinforcement Learning (RL) | Unverified | 0
Deep Reinforcement Learning for Day-to-day Dynamic Tolling in Tradable Credit Schemes | Apr 10, 2025 | Bayesian Optimization, Computational Efficiency | Unverified | 0
Boosting Universal LLM Reward Design through the Heuristic Reward Observation Space Evolution | Apr 10, 2025 | Code Generation, Reinforcement Learning (RL) | Unverified | 0
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining | Apr 10, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | Code Available | 1
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Apr 10, 2025 | Language Modeling, Language Modelling | Code Available | 9
Genetic Programming with Reinforcement Learning Trained Transformer for Real-World Dynamic Scheduling Problems | Apr 10, 2025 | Reinforcement Learning (RL), Scheduling | Unverified | 0
Perception-R1: Pioneering Perception Policy with Reinforcement Learning | Apr 10, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 3
Kimi-VL Technical Report | Apr 10, 2025 | Long-Context Understanding, Mathematical Reasoning | Code Available | 5
Fast Adaptation with Behavioral Foundation Models | Apr 10, 2025 | Reinforcement Learning (RL) | Unverified | 0
Harnessing Equivariance: Modeling Turbulence with Graph Neural Networks | Apr 10, 2025 | Reinforcement Learning (RL) | Code Available | 1
Better Decisions through the Right Causal World Model | Apr 9, 2025 | Causal Inference, Model extraction | Unverified | 0
Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning | Apr 9, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 1
TW-CRL: Time-Weighted Contrastive Reward Learning for Efficient Inverse Reinforcement Learning | Apr 8, 2025 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Trust-Region Twisted Policy Improvement | Apr 8, 2025 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Code Available | 0
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization | Apr 8, 2025 | Math, Mathematical Reasoning | Code Available | 2
xMTF: A Formula-Free Model for Reinforcement-Learning-Based Multi-Task Fusion in Recommender Systems | Apr 8, 2025 | Multi-Task Learning, Recommendation Systems | Unverified | 0
Stratified Expert Cloning with Adaptive Selection for User Retention in Large-Scale Recommender Systems | Apr 8, 2025 | Imitation Learning, Recommendation Systems | Unverified | 0
Smart Exploration in Reinforcement Learning using Bounded Uncertainty Models | Apr 8, 2025 | reinforcement-learning, Reinforcement Learning | Unverified | 0
The Role of Environment Access in Agnostic Reinforcement Learning | Apr 7, 2025 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning | Apr 7, 2025 | Combinatorial Optimization, reinforcement-learning | Unverified | 0
Concise Reasoning via Reinforcement Learning | Apr 7, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 1
Joint Pedestrian and Vehicle Traffic Optimization in Urban Environments using Reinforcement Learning | Apr 7, 2025 | Reinforcement Learning (RL), Traffic Signal Control | Code Available | 1
Physics-informed Modularized Neural Network for Advanced Building Control by Deep Reinforcement Learning | Apr 7, 2025 | Deep Reinforcement Learning, Physics-informed machine learning | Unverified | 0
Impact of Price Inflation on Algorithmic Collusion Through Reinforcement Learning Agents | Apr 5, 2025 | Reinforcement Learning (RL) | Unverified | 0
OrbitZoo: Multi-Agent Reinforcement Learning Environment for Orbital Dynamics | Apr 5, 2025 | Collision Avoidance, Multi-agent Reinforcement Learning | Unverified | 0
Decision SpikeFormer: Spike-Driven Transformer for Decision Making | Apr 4, 2025 | D4RL, Decision Making | Unverified | 0
Algorithmic Prompt Generation for Diverse Human-like Teaming and Communication with Large Language Models | Apr 4, 2025 | Reinforcement Learning (RL) | Unverified | 0
Improving Mixed-Criticality Scheduling with Reinforcement Learning | Apr 4, 2025 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Offline and Distributional Reinforcement Learning for Wireless Communications | Apr 4, 2025 | Distributional Reinforcement Learning, Management | Unverified | 0
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | Apr 4, 2025 | Navigate, Prompt Engineering | Code Available | 4
Enhanced Penalty-based Bidirectional Reinforcement Learning Algorithms | Apr 4, 2025 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Dexterous Manipulation through Imitation Learning: A Survey | Apr 4, 2025 | Imitation Learning, Reinforcement Learning (RL) | Unverified | 0
Learning Dual-Arm Coordination for Grasping Large Flat Objects | Apr 4, 2025 | Deep Reinforcement Learning, reinforcement-learning | Unverified | 0
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme | Apr 3, 2025 | Reinforcement Learning (RL), Visual Reasoning | Code Available | 2
Adapting World Models with Latent-State Dynamics Residuals | Apr 3, 2025 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0
Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models | Apr 3, 2025 | GSM8K, Reinforcement Learning (RL) | Code Available | 0
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving | Apr 3, 2025 | Reinforcement Learning (RL) | Code Available | 3
MAD: A Magnitude And Direction Policy Parametrization for Stability Constrained Reinforcement Learning | Apr 3, 2025 | Reinforcement Learning (RL) | Code Available | 0
Inference-Time Scaling for Generalist Reward Modeling | Apr 3, 2025 | Reinforcement Learning (RL) | Unverified | 0
Integrating Human Knowledge Through Action Masking in Reinforcement Learning for Operations Research | Apr 3, 2025 | Management, Reinforcement Learning (RL) | Unverified | 0