What Makes a Good Diffusion Planner for Decision Making? (Mar 1, 2025). Tags: Action Generation, Decision Making. Code available.
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models (Feb 24, 2025). Tags: GSM8K, Math. Code available.
Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning (Feb 14, 2025). Tags: Reinforcement Learning (RL), Skills Assessment. Code available.
Digi-Q: Learning Q-Value Functions for Training Device-Control Agents (Feb 13, 2025). Tags: Q-Learning, Reinforcement Learning (RL). Code available.
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning (Feb 10, 2025). Tags: Math, Mathematical Reasoning. Code available.
Training Language Models to Reason Efficiently (Feb 6, 2025). Tags: Reinforcement Learning (RL). Code available.
CTR-Driven Advertising Image Generation with Multimodal Large Language Models (Feb 5, 2025). Tags: Image Generation, Reinforcement Learning (RL). Code available.
Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs (Feb 4, 2025). Tags: Code Generation, Language Modeling. Code available.
Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning (Jan 25, 2025). Tags: Answer Generation, Multi-agent Reinforcement Learning. Code available.
Reasoning Language Models: A Blueprint (Jan 20, 2025). Tags: Reinforcement Learning (RL), Retrieval-augmented Generation. Code available.
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling (Jan 20, 2025). Tags: Imitation Learning, Language Modeling. Code available.
Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots (Jan 6, 2025). Tags: Deep Reinforcement Learning, Reinforcement Learning (RL). Code available.
Offline Reinforcement Learning for LLM Multi-Step Reasoning (Dec 20, 2024). Tags: GSM8K, Math. Code available.
Guiding Generative Protein Language Models with Reinforcement Learning (Dec 17, 2024). Tags: Diversity, reinforcement-learning. Code available.
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data (Dec 10, 2024). Tags: Offline RL, Reinforcement Learning (RL). Code available.
ManiSkill-HAB: A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks (Dec 9, 2024). Tags: GPU, Imitation Learning. Code available.
Conformal Symplectic Optimization for Stable Reinforcement Learning (Dec 3, 2024). Tags: Atari Games, Deep Reinforcement Learning. Code available.
Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective (Dec 2, 2024). Tags: Density Estimation, Offline RL. Code available.
Pretrained LLM Adapted with LoRA as a Decision Transformer for Offline RL in Quantitative Trading (Nov 26, 2024). Tags: Offline RL, parameter-efficient fine-tuning. Code available.
Natural Language Reinforcement Learning (Nov 21, 2024). Tags: Decision Making, reinforcement-learning. Code available.
AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers (Nov 17, 2024). Tags: In-Context Learning, Meta-Learning. Code available.
TIPO: Text to Image with Text Presampling for Prompt Optimization (Nov 12, 2024). Tags: Image Generation, Language Modeling. Code available.
Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks (Oct 30, 2024). Tags: General Reinforcement Learning, Reinforcement Learning (RL). Code available.
PC-Gym: Benchmark Environments For Process Control Problems (Oct 29, 2024). Tags: Benchmarking, Chemical Process. Code available.
ODRL: A Benchmark for Off-Dynamics Reinforcement Learning (Oct 28, 2024). Tags: Benchmarking, reinforcement-learning. Code available.
LongReward: Improving Long-context Large Language Models with AI Feedback (Oct 28, 2024). Tags: Offline RL, Reinforcement Learning (RL). Code available.
Integrating Reinforcement Learning with Foundation Models for Autonomous Robotics: Methods and Perspectives (Oct 21, 2024). Tags: Reinforcement Learning (RL). Code available.
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning (Oct 19, 2024). Tags: Benchmarking, Multi-agent Reinforcement Learning. Code available.
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design (Oct 17, 2024). Tags: Protein Design, Reinforcement Learning (RL). Code available.
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement (Oct 15, 2024). Tags: Disentanglement, Inductive Bias. Code available.
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization (Oct 11, 2024). Tags: GSM8K, Language Modeling. Code available.
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment (Oct 2, 2024). Tags: GSM8K, Math. Code available.
Stage-Wise Reward Shaping for Acrobatic Robots: A Constrained Multi-Objective Reinforcement Learning Approach (Sep 24, 2024). Tags: Multi-Objective Reinforcement Learning, Reinforcement Learning (RL). Code available.
Training Language Models to Self-Correct via Reinforcement Learning (Sep 19, 2024). Tags: HumanEval, Math. Code available.
Enhancing Sample Efficiency and Exploration in Reinforcement Learning through the Integration of Diffusion Models and Proximal Policy Optimization (Sep 2, 2024). Tags: Diversity, Offline RL. Code available.
Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks (Aug 20, 2024). Tags: Multi-agent Reinforcement Learning, Multi-Task Learning. Code available.
NAVIX: Scaling MiniGrid Environments with JAX (Jul 28, 2024). Tags: CPU, Deep Reinforcement Learning. Code available.
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data (Jul 23, 2024). Tags: Autonomous Driving, Autonomous Racing. Code available.
MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning (Jul 23, 2024). Tags: Benchmarking, Decision Making. Code available.
Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review (Jul 18, 2024). Tags: Reinforcement Learning (RL). Code available.
Gradient Boosting Reinforcement Learning (Jul 11, 2024). Tags: GPU, reinforcement-learning. Code available.
iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement (Jul 8, 2024). Tags: Language Modeling, Language Modelling. Code available.
Craftium: An Extensible Framework for Creating Reinforcement Learning Environments (Jul 4, 2024). Tags: Benchmarking, Minecraft. Code available.
Efficient World Models with Context-Aware Tokenization (Jun 27, 2024). Tags: Deep Reinforcement Learning, Reinforcement Learning (RL). Code available.
GenRL: Multimodal-foundation world models for generalization in embodied agents (Jun 26, 2024). Tags: Benchmarking, Reinforcement Learning (RL). Code available.
Direct Multi-Turn Preference Optimization for Language Agents (Jun 21, 2024). Tags: Reinforcement Learning (RL). Code available.
MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High Frequency Trading (Jun 20, 2024). Tags: Algorithmic Trading, Decision Making. Code available.
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control (May 25, 2024). Tags: continuous-control, Continuous Control. Code available.
Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization (May 25, 2024). Tags: continuous-control, Continuous Control. Code available.
Diffusion Actor-Critic with Entropy Regulator (May 24, 2024). Tags: Decision Making, MuJoCo. Code available.