| Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models | May 18, 2023 | MuJoCo, Offline RL | —Unverified | 0 | 0 |
| Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning | Feb 6, 2025 | Dataset Generation, MuJoCo | —Unverified | 0 | 0 |
| Benchmarking the Sim-to-Real Gap in Cloth Manipulation | Oct 14, 2023 | Benchmarking, MuJoCo | —Unverified | 0 | 0 |
| Beyond Conservatism: Diffusion Policies in Offline Multi-agent Reinforcement Learning | Jul 4, 2023 | Data Augmentation, Diversity | —Unverified | 0 | 0 |
| Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration | Jun 25, 2025 | Imitation Learning, MuJoCo | —Unverified | 0 | 0 |
| Beyond Non-Expert Demonstrations: Outcome-Driven Action Constraint for Offline Reinforcement Learning | Apr 2, 2025 | MuJoCo, Uncertainty Quantification | —Unverified | 0 | 0 |
| Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis | Jun 17, 2022 | MuJoCo, Starcraft | —Unverified | 0 | 0 |
| Biased Estimates of Advantages over Path Ensembles | Sep 15, 2019 | Atari Games, continuous-control | —Unverified | 0 | 0 |
| BlockPuzzle - A Challenge in Physical Reasoning and Generalization for Robot Learning | Nov 30, 2018 | Imitation Learning, MuJoCo | —Unverified | 0 | 0 |
| Boosting Exploration in Actor-Critic Algorithms by Incentivizing Plausible Novel States | Oct 1, 2022 | continuous-control, Continuous Control | —Unverified | 0 | 0 |