Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Feb 10, 2025 Math Mathematical Reasoning
Code Code Available 25 DRLE: Decentralized Reinforcement Learning at the Edge for Traffic Light Control in the IoV Sep 3, 2020 Edge-computing Management
Code Code Available 25 Direct Multi-Turn Preference Optimization for Language Agents Jun 21, 2024 Reinforcement Learning (RL)
Code Code Available 25 Digi-Q: Learning Q-Value Functions for Training Device-Control Agents Feb 13, 2025 Q-Learning Reinforcement Learning (RL)
Code Code Available 25 A Critical Evaluation of AI Feedback for Aligning Large Language Models Feb 19, 2024 Instruction Following reinforcement-learning
Code Code Available 25 FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation Apr 19, 2024 Decoder Network Embedding
Code Code Available 25 Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning Aug 12, 2022 D4RL Offline RL
Code Code Available 25 Distributional Soft Actor-Critic with Three Refinements Oct 9, 2023 Decision Making Reinforcement Learning (RL)
Code Code Available 25 Foundation Policies with Hilbert Representations Feb 23, 2024 Reinforcement Learning (RL) Unsupervised Pre-training
Code Code Available 25 FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation May 22, 2023 Imitation Learning Motion Planning
Code Code Available 25 Assessment of Reinforcement Learning for Macro Placement Feb 21, 2023 Deep Reinforcement Learning reinforcement-learning
Code Code Available 25 Diffusion Actor-Critic with Entropy Regulator May 24, 2024 Decision Making MuJoCo
Code Code Available 25 A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges Nov 12, 2022 reinforcement-learning Reinforcement Learning
Code Code Available 25 A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning Aug 16, 2022 Deep Reinforcement Learning reinforcement-learning
Code Code Available 25 RLtools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control Jun 6, 2023 continuous-control Continuous Control
Code Code Available 25 Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory May 25, 2023 Common Sense Reasoning CPU
Code Code Available 25 A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data Jul 23, 2024 Autonomous Driving Autonomous Racing
Code Code Available 25 Gradient Boosting Reinforcement Learning Jul 11, 2024 GPU reinforcement-learning
Code Code Available 25 DiffMimic: Efficient Motion Mimicking with Differentiable Physics Apr 6, 2023 reinforcement-learning Reinforcement Learning (RL)
Code Code Available 25 Habitat 2.0: Training Home Assistants to Rearrange their Habitat Jun 28, 2021 Deep Reinforcement Learning GPU
Code Code Available 25 Heterogeneous Multi-Robot Reinforcement Learning Jan 17, 2023 Graph Neural Network Multi-agent Reinforcement Learning
Code Code Available 25 Aligning AI With Shared Human Values Aug 5, 2020 Ethics reinforcement-learning
Code Code Available 25 Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization May 25, 2024 continuous-control Continuous Control
Code Code Available 25 Developing A Multi-Agent and Self-Adaptive Framework with Deep Reinforcement Learning for Dynamic Portfolio Risk Management Feb 1, 2024 Deep Reinforcement Learning Management
Code Code Available 25 Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving May 12, 2025 Math Mathematical Problem-Solving
Code Code Available 25 Agent models: Internalizing Chain-of-Action Generation into Reasoning models Mar 9, 2025 Action Generation Reinforcement Learning (RL)
Code Code Available 25 iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement Jul 8, 2024 Language Modeling Language Modelling
Code Code Available 25 Dialogue Learning With Human-In-The-Loop Nov 29, 2016 Question Answering reinforcement-learning
Code Code Available 25 In-Hand Object Rotation via Rapid Motor Adaptation Oct 10, 2022 Object Reinforcement Learning (RL)
Code Code Available 25 Integrating Reinforcement Learning with Foundation Models for Autonomous Robotics: Methods and Perspectives Oct 21, 2024 Reinforcement Learning (RL)
Code Code Available 25 A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning Aug 5, 2022 reinforcement-learning Reinforcement Learning
Code Code Available 25 AGILE: A Novel Reinforcement Learning Framework of LLM Agents May 23, 2024 Question Answering reinforcement-learning
Code Code Available 25 DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems May 30, 2022 Diversity reinforcement-learning
Code Code Available 25 Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Feb 15, 2024 All Decision Making
Code Code Available 25 DIAMBRA Arena: a New Reinforcement Learning Platform for Research and Experimentation Oct 19, 2022 Deep Reinforcement Learning Imitation Learning
Code Code Available 25 Diffusion Models for Reinforcement Learning: A Survey Nov 2, 2023 reinforcement-learning Reinforcement Learning
Code Code Available 25 DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation May 12, 2025 Language Modeling Language Modelling
Code Code Available 25 Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models May 15, 2025 Math reinforcement-learning
Code Code Available 25 Language Models can Solve Computer Tasks Mar 30, 2023 Language Modelling Large Language Model
Code Code Available 25 LLMLight: Large Language Models as Traffic Signal Control Agents Dec 26, 2023 Decision Making Management
Code Code Available 25 A Review of Safe Reinforcement Learning: Methods, Theory and Applications May 20, 2022 Autonomous Driving Decision Making
Code Code Available 25 Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching Dec 16, 2020 Combinatorial Optimization Decision Making
Code Code Available 25 Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models Feb 24, 2025 GSM8K Math
Code Code Available 25 Learning to Fly -- a Gym Environment with PyBullet Physics for Reinforcement Learning of Multi-agent Quadcopter Control Mar 3, 2021 Benchmarking Multi-agent Reinforcement Learning
Code Code Available 25 Learning to Predict Without Looking Ahead: World Models Without Forward Prediction Oct 29, 2019 Model-based Reinforcement Learning reinforcement-learning
Code Code Available 25 ARPO:End-to-End Policy Optimization for GUI Agents with Experience Replay May 22, 2025 reinforcement-learning Reinforcement Learning
Code Code Available 25 Deep Reinforcement Learning for Multi-Agent Interaction Aug 2, 2022 BIG-bench Machine Learning Causal Inference
Code Code Available 25 DayDreamer: World Models for Physical Robot Learning Jun 28, 2022 Deep Reinforcement Learning Navigate
Code Code Available 25 D4RL: Datasets for Deep Data-Driven Reinforcement Learning Apr 15, 2020 D4RL Offline RL
Code Code Available 25 Decoupling Representation Learning from Reinforcement Learning Sep 14, 2020 Data Augmentation Deep Reinforcement Learning