| Simple Emergent Action Representations from Multi-Task Policy Training | Oct 18, 2022 | Deep Reinforcement Learning, MuJoCo | —Unverified | 0 |
| Policy Gradient With Serial Markov Chain Reasoning | Oct 13, 2022 | Decision Making, MuJoCo | —Unverified | 0 |
| Mind's Eye: Grounded Language Model Reasoning through Simulation | Oct 11, 2022 | Language Modeling | —Unverified | 0 |
| Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees | Oct 4, 2022 | Counterfactual, Imitation Learning | —Unverified | 0 |
| Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees | Oct 4, 2022 | Imitation Learning, MuJoCo | —Unverified | 0 |
| Boosting Exploration in Actor-Critic Algorithms by Incentivizing Plausible Novel States | Oct 1, 2022 | Continuous Control | —Unverified | 0 |
| On the Convergence Theory of Meta Reinforcement Learning with Personalized Policies | Sep 21, 2022 | Continuous Control | —Unverified | 0 |
| A Computational Model of Learning Flexible Navigation in a Maze by Layout-Conforming Replay of Place Cells | Sep 18, 2022 | MuJoCo | —Unverified | 0 |
| Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations | Sep 16, 2022 | Decision Making, Imitation Learning | —Unverified | 0 |
| Value Summation: A Novel Scoring Function for MPC-based Model-based Reinforcement Learning | Sep 16, 2022 | Model-based Reinforcement Learning, MuJoCo | —Unverified | 0 |