Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines

2017-06-20 · NeurIPS

Philip S. Thomas, Emma Brunskill

Abstract

We show how an action-dependent baseline can be used in the policy gradient theorem with function approximation, which was originally presented with action-independent baselines by Sutton et al. (2000).
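To see why the distinction between action-independent and action-dependent baselines matters, the sketch below checks the standard baseline identity numerically for a single state. It is an illustration under stated assumptions, not the construction used in the paper: the softmax policy, the parameter values, and the helper names `softmax` and `baseline_bias` are all hypothetical. The quantity computed is the term that subtracting a baseline b adds to the policy gradient, sum over a of (d pi(a) / d theta_k) * b(a). It vanishes when b does not depend on the action, and is generally nonzero when it does, which is why an action-dependent baseline needs extra care.

```python
import math


def softmax(theta):
    """Softmax policy over actions for one state (assumed parameterization)."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def baseline_bias(theta, baseline):
    """Return [sum_a d pi(a)/d theta_k * baseline[a] for each k].

    This is the contribution a baseline makes to the expected gradient.
    A zero vector means the baseline leaves the gradient unbiased.
    """
    pi = softmax(theta)
    n = len(theta)
    bias = []
    for k in range(n):
        # For a softmax policy: d pi(a) / d theta_k = pi(a) * (1[a == k] - pi(k))
        total = sum(
            pi[a] * ((1.0 if a == k else 0.0) - pi[k]) * baseline[a]
            for a in range(n)
        )
        bias.append(total)
    return bias


# Hypothetical values chosen only for illustration.
theta = [0.2, -0.5, 1.0]
constant_b = [3.0, 3.0, 3.0]   # same value for every action
action_b = [1.0, 0.0, -2.0]    # varies with the action

bias_constant = baseline_bias(theta, constant_b)
bias_action = baseline_bias(theta, action_b)

print("action-independent baseline:", bias_constant)
print("action-dependent baseline:  ", bias_action)
```

For the action-independent case every component is zero up to floating-point error, because the action probabilities sum to one and so their gradients sum to zero. For the action-dependent case the components are nonzero, so an uncorrected action-dependent baseline would bias the gradient estimate in this toy setting.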
