| Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models | May 18, 2023 | MuJoCoOffline RL | —Unverified | 0 | 0 |
| Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning | Feb 6, 2025 | Dataset GenerationMuJoCo | —Unverified | 0 | 0 |
| Benchmarking the Sim-to-Real Gap in Cloth Manipulation | Oct 14, 2023 | BenchmarkingMuJoCo | —Unverified | 0 | 0 |
| Beyond Conservatism: Diffusion Policies in Offline Multi-agent Reinforcement Learning | Jul 4, 2023 | Data AugmentationDiversity | —Unverified | 0 | 0 |
| Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration | Jun 25, 2025 | Imitation LearningMuJoCo | —Unverified | 0 | 0 |
| Beyond Non-Expert Demonstrations: Outcome-Driven Action Constraint for Offline Reinforcement Learning | Apr 2, 2025 | MuJoCoUncertainty Quantification | —Unverified | 0 | 0 |
| Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis | Jun 17, 2022 | MuJoCoStarcraft | —Unverified | 0 | 0 |
| Biased Estimates of Advantages over Path Ensembles | Sep 15, 2019 | Atari Gamescontinuous-control | —Unverified | 0 | 0 |
| BlockPuzzle - A Challenge in Physical Reasoning and Generalization for Robot Learning | Nov 30, 2018 | Imitation LearningMuJoCo | —Unverified | 0 | 0 |
| Boosting Exploration in Actor-Critic Algorithms by Incentivizing Plausible Novel States | Oct 1, 2022 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Bridging Physics-Informed Neural Networks with Reinforcement Learning: Hamilton-Jacobi-Bellman Proximal Policy Optimization (HJBPPO) | Feb 1, 2023 | MuJoCoreinforcement-learning | —Unverified | 0 | 0 |
| Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | May 12, 2025 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning | Mar 23, 2025 | Deep Reinforcement LearningEfficient Exploration | —Unverified | 0 | 0 |
| CAMEL: Continuous Action Masking Enabled by Large Language Models for Reinforcement Learning | Feb 17, 2025 | MuJoCoReinforcement Learning (RL) | —Unverified | 0 | 0 |
| Can Reinforcement Learning for Continuous Control Generalize Across Physics Engines? | Oct 27, 2020 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Careful at Estimation and Bold at Exploration | Aug 22, 2023 | MuJoCo | —Unverified | 0 | 0 |
| CasIL: Cognizing and Imitating Skills via a Dual Cognition-Action Architecture | Sep 28, 2023 | Imitation LearningMuJoCo | —Unverified | 0 | 0 |
| CAT-SAC: Soft Actor-Critic with Curiosity-Aware Entropy Temperature | Jan 1, 2021 | MuJoCoReinforcement Learning (RL) | —Unverified | 0 | 0 |
| CEIL: Generalized Contextual Imitation Learning | Jun 26, 2023 | D4RLImitation Learning | —Unverified | 0 | 0 |
| C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory | Feb 26, 2024 | Imitation LearningMuJoCo | —Unverified | 0 | 0 |
| CIM-PPO:Proximal Policy Optimization with Liu-Correntropy Induced Metric | Oct 20, 2021 | Deep Reinforcement LearningMuJoCo | —Unverified | 0 | 0 |
| CKNet: A Convolutional Neural Network Based on Koopman Operator for Modeling Latent Dynamics from Pixels | Feb 19, 2021 | MuJoCo | —Unverified | 0 | 0 |
| CLARE: Conservative Model-Based Reward Learning for Offline Inverse Reinforcement Learning | Feb 9, 2023 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Closed Loop Interactive Embodied Reasoning for Robot Manipulation | Apr 23, 2024 | MuJoCoRobot Manipulation | —Unverified | 0 | 0 |
| Coagent Networks: Generalized and Scaled | May 16, 2023 | MuJoCoReinforcement Learning (RL) | —Unverified | 0 | 0 |
| Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning | Sep 25, 2019 | Decision MakingKnowledge Distillation | —Unverified | 0 | 0 |
| Collision- and Reachability-Aware Multi-Robot Control with Grounded LLM Planners | May 26, 2025 | MuJoCovalid | —Unverified | 0 | 0 |
| ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization | Oct 2, 2024 | MuJoCoMulti-agent Reinforcement Learning | —Unverified | 0 | 0 |
| Combination of Supervised and Reinforcement Learning For Vision-Based Autonomous Control | Jan 1, 2018 | MuJoCoreinforcement-learning | —Unverified | 0 | 0 |
| Combining Model-based and Model-free RL via Multi-step Control Variates | Jan 1, 2018 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning | Jan 14, 2022 | modelMuJoCo | —Unverified | 0 | 0 |
| Composing Complex Skills by Learning Transition Policies with Proximity Reward Induction | May 1, 2019 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Compressing Deep Reinforcement Learning Networks with a Dynamic Structured Pruning Method for Autonomous Driving | Feb 7, 2024 | Autonomous DrivingDeep Reinforcement Learning | —Unverified | 0 | 0 |
| CoNES: Convex Natural Evolutionary Strategies | Jul 16, 2020 | BenchmarkingMuJoCo | —Unverified | 0 | 0 |
| Conservative DDPG -- Pessimistic RL without Ensemble | Mar 8, 2024 | MuJoCo | —Unverified | 0 | 0 |
| Constrained Markov Decision Processes via Backward Value Functions | Aug 26, 2020 | MuJoCoReinforcement Learning (RL) | —Unverified | 0 | 0 |
| Context is Everything: Implicit Identification for Dynamics Adaptation | Mar 10, 2022 | MuJoCo | —Unverified | 0 | 0 |
| Contextual Conservative Q-Learning for Offline Reinforcement Learning | Jan 3, 2023 | MuJoCoQ-Learning | —Unverified | 0 | 0 |
| Contextual Transformer for Offline Meta Reinforcement Learning | Nov 15, 2022 | D4RLMeta Reinforcement Learning | —Unverified | 0 | 0 |
| Continuous Control for Searching and Planning with a Learned Model | Jun 12, 2020 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization | Apr 4, 2022 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Continuous Mean-Zero Disagreement-Regularized Imitation Learning (CMZ-DRIL) | Mar 2, 2024 | Imitation LearningMuJoCo | —Unverified | 0 | 0 |
| Continuous Neural Algorithmic Planners | Nov 29, 2022 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling | Nov 11, 2022 | MuJoCoNavigate | —Unverified | 0 | 0 |
| Cooperative Heterogeneous Deep Reinforcement Learning | Nov 2, 2020 | continuous-controlContinuous Control | —Unverified | 0 | 0 |
| Cooperative Multi-Agent Deep Reinforcement Learning in Content Ranking Optimization | Aug 8, 2024 | Deep Reinforcement LearningInformation Retrieval | —Unverified | 0 | 0 |
| Cross-Domain Imitation Learning with a Dual Structure | Jun 2, 2020 | Imitation LearningMuJoCo | —Unverified | 0 | 0 |
| CrossNorm: On Normalization for Off-Policy Reinforcement Learning | Sep 25, 2019 | MuJoCoreinforcement-learning | —Unverified | 0 | 0 |
| Data Valuation for Offline Reinforcement Learning | May 19, 2022 | Data ValuationDeep Reinforcement Learning | —Unverified | 0 | 0 |
| DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning | Jun 26, 2020 | continuous-controlContinuous Control | —Unverified | 0 | 0 |