Nonparametric Bayesian Policy Priors for Reinforcement Learning

2010-12-01 · NeurIPS 2010

Finale Doshi-Velez, David Wingate, Nicholas Roy, Joshua B. Tenenbaum

Abstract

We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and simple policies, resulting in improved policy and model learning.

Tasks

Reinforcement Learning (RL)
