Reinforcement Learning with Parameterized Actions

2015-09-05

Warwick Masson, Pravesh Ranchod, George Konidaris


Code

github.com/cycraig/MP-DQN (PyTorch)
github.com/cycraig/gym-goal
github.com/cycraig/gym-platform

Abstract

We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions, that is, discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goal-scoring and Platform domains.
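To make the two-part choice concrete, the following is a minimal sketch of selecting a parameterized action: first a discrete action, then the continuous parameters for that action. It is not the Q-PAMDP algorithm itself. The names `ParameterizedAction` and `select_action`, the action labels "kick" and "dash", the epsilon-greedy rule, and the fixed per-action parameter policies are all assumptions made for illustration and do not come from the paper or its repositories.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

State = Tuple[float, ...]


@dataclass(frozen=True)
class ParameterizedAction:
    """A discrete action label paired with its continuous parameters."""
    name: str
    params: Tuple[float, ...]


def select_action(
    state: State,
    q_values: Dict[str, float],
    param_policies: Dict[str, Callable[[State], Tuple[float, ...]]],
    epsilon: float,
    rng: random.Random,
) -> ParameterizedAction:
    """Pick a discrete action epsilon-greedily, then fill in its parameters.

    q_values holds one (hypothetical) value estimate per discrete action
    for the current state; param_policies maps each discrete action to a
    function that proposes continuous parameters for that action.
    """
    names = sorted(q_values)
    if rng.random() < epsilon:
        name = rng.choice(names)                         # explore
    else:
        name = max(names, key=lambda n: q_values[n])     # exploit
    return ParameterizedAction(name, param_policies[name](state))


# Hypothetical example: two discrete actions, each with its own parameters.
example_state: State = (0.0, 1.0)
example_q = {"kick": 1.0, "dash": 0.2}
example_policies = {
    "kick": lambda s: (s[0] + 0.5, s[1]),   # e.g. a target position
    "dash": lambda s: (0.3,),                # e.g. a power level
}

chosen = select_action(
    example_state, example_q, example_policies, epsilon=0.0, rng=random.Random(0)
)
print(chosen)
```

With `epsilon=0.0` the selection is purely greedy, so the discrete action with the highest value estimate is chosen and only that action's parameter policy is evaluated. Note that different discrete actions may carry parameter vectors of different lengths, which is what distinguishes this setting from a purely discrete or purely continuous action space.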

Tasks

Reinforcement Learning (RL)
