Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling Jan 20, 2025 Imitation Learning Language Modeling
Feedback Efficient Online Fine-Tuning of Diffusion Models Feb 26, 2024 reinforcement-learning Reinforcement Learning
Direct Multi-Turn Preference Optimization for Language Agents Jun 21, 2024 Reinforcement Learning (RL)
A Critical Evaluation of AI Feedback for Aligning Large Language Models Feb 19, 2024 Instruction Following reinforcement-learning
Digi-Q: Learning Q-Value Functions for Training Device-Control Agents Feb 13, 2025 Q-Learning Reinforcement Learning (RL)
FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation Apr 19, 2024 Decoder Network Embedding
Distributional Soft Actor-Critic with Three Refinements Oct 9, 2023 Decision Making Reinforcement Learning (RL)
Diffusion Actor-Critic with Entropy Regulator May 24, 2024 Decision Making MuJoCo
Foundation Policies with Hilbert Representations Feb 23, 2024 Reinforcement Learning (RL) Unsupervised Pre-training
FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation May 22, 2023 Imitation Learning Motion Planning
Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization May 25, 2024 continuous-control Continuous Control
DIAMBRA Arena: a New Reinforcement Learning Platform for Research and Experimentation Oct 19, 2022 Deep Reinforcement Learning Imitation Learning
GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction Feb 25, 2024 3D Reconstruction Active 3D Reconstruction
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory May 25, 2023 Common Sense Reasoning CPU
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data Jul 23, 2024 Autonomous Driving Autonomous Racing
Gradient Boosting Reinforcement Learning Jul 11, 2024 GPU reinforcement-learning
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs Jul 8, 2025 GPU reinforcement-learning
Graphs Meet AI Agents: Taxonomy, Progress, and Future Opportunities Jun 22, 2025 Reinforcement Learning (RL)
DiffMimic: Efficient Motion Mimicking with Differentiable Physics Apr 6, 2023 reinforcement-learning Reinforcement Learning (RL)
Heterogeneous Multi-Robot Reinforcement Learning Jan 17, 2023 Graph Neural Network Multi-agent Reinforcement Learning
Diffusion Models for Reinforcement Learning: A Survey Nov 2, 2023 reinforcement-learning Reinforcement Learning
Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning Sep 18, 2022 reinforcement-learning Reinforcement Learning
Developing A Multi-Agent and Self-Adaptive Framework with Deep Reinforcement Learning for Dynamic Portfolio Risk Management Feb 1, 2024 Deep Reinforcement Learning Management
DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems May 30, 2022 Diversity reinforcement-learning
A Review of Safe Reinforcement Learning: Methods, Theory and Applications May 20, 2022 Autonomous Driving Decision Making
iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement Jul 8, 2024 Language Modeling Language Modelling
A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning Aug 5, 2022 reinforcement-learning Reinforcement Learning
Interactive Differentiable Simulation May 26, 2019 Model Predictive Control parameter estimation
Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving May 12, 2025 Math Mathematical Problem-Solving
Agent models: Internalizing Chain-of-Action Generation into Reasoning models Mar 9, 2025 Action Generation Reinforcement Learning (RL)
ARPO:End-to-End Policy Optimization for GUI Agents with Experience Replay May 22, 2025 reinforcement-learning Reinforcement Learning
Dialogue Learning With Human-In-The-Loop Nov 29, 2016 Question Answering reinforcement-learning
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning Aug 12, 2022 D4RL Offline RL
Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX Jun 16, 2023 Decision Making reinforcement-learning
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation May 12, 2025 Language Modeling Language Modelling
Language Models can Solve Computer Tasks Mar 30, 2023 Language Modelling Large Language Model
Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching Dec 16, 2020 Combinatorial Optimization Decision Making
AGILE: A Novel Reinforcement Learning Framework of LLM Agents May 23, 2024 Question Answering reinforcement-learning
Benchmarking Deep Reinforcement Learning for Continuous Control Apr 22, 2016 Action Triplet Recognition Atari Games
Benchmarking Potential Based Rewards for Learning Humanoid Locomotion Jul 19, 2023 Benchmarking Reinforcement Learning (RL)
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models May 15, 2025 Math reinforcement-learning
Learning Physically Realizable Skills for Online Packing of General 3D Shapes Dec 5, 2022 3D geometry Action Generation
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models May 30, 2018 Deep Reinforcement Learning Model-based Reinforcement Learning
Decoupling Representation Learning from Reinforcement Learning Sep 14, 2020 Data Augmentation Deep Reinforcement Learning
DayDreamer: World Models for Physical Robot Learning Jun 28, 2022 Deep Reinforcement Learning Navigate
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping May 21, 2025 Reinforcement Learning (RL)
Deep Reinforcement Learning for Multi-Agent Interaction Aug 2, 2022 BIG-bench Machine Learning Causal Inference
Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot Feb 20, 2023 Efficient Exploration reinforcement-learning
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control May 25, 2024 continuous-control Continuous Control
Curiosity-driven Red-teaming for Large Language Models Feb 29, 2024 Red Teaming Reinforcement Learning (RL)