Distributional Reinforcement Learning
The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.
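To make the risk-aware use case concrete, here is a minimal sketch, assuming the return distribution of each action is available as a handful of sampled returns. The action names, the sample values, and the choice of conditional value at risk (CVaR) as the risk measure are all illustrative assumptions, not something prescribed by the description above.

```python
import numpy as np

def cvar(returns, alpha):
    """Mean of the worst alpha-fraction of sampled returns (lower tail)."""
    sorted_returns = np.sort(np.asarray(returns, dtype=float))
    # Number of samples in the lower tail; always keep at least one.
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return float(sorted_returns[:k].mean())

# Hypothetical sampled returns for two actions (made-up numbers).
returns = {
    "safe": [1.0, 1.0, 1.0, 1.0],
    "risky": [-4.0, 3.0, 3.0, 3.0],
}

# A risk-neutral agent only compares mean returns.
greedy = max(returns, key=lambda a: np.mean(returns[a]))

# A risk-averse agent compares the lower tail of each distribution instead.
risk_averse = max(returns, key=lambda a: cvar(returns[a], alpha=0.25))

print(greedy, risk_averse)
```

With these made-up samples the two criteria disagree: the mean favours the action with the higher average return, while the tail-based criterion avoids the action with the occasional large loss. Such a distinction is only possible when the agent keeps more than the mean of the return.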
Let Z denote the random return, whose expectation is the value Q. Like Q, this random return satisfies a recursive equation, but one of a distributional nature.
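For reference, the relationship between Z and Q and the recursion for Z are commonly written as follows. The symbols x (state), a (action), R (random reward), \gamma (discount factor), and (X', A') (random next state and action) are notational assumptions introduced here; only Z and Q appear in the text above.

```latex
% Q is the expectation of the random return Z.
Q(x, a) = \mathbb{E}\left[ Z(x, a) \right]

% Distributional Bellman equation: equality holds in distribution,
% not merely in expectation.
Z(x, a) \overset{D}{=} R(x, a) + \gamma \, Z(X', A')
```

Taking expectations on both sides of the second equation gives back the usual Bellman equation for Q.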