Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
2018-01-04 · ICML 2018 · Code Available
Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
Code
- github.com/haarnoja/sac (official, in paper, tf, ★ 0)
- github.com/Rafael1s/Deep-Reinforcement-Learning-Udacity (pytorch, ★ 992)
- github.com/trackmania-rl/tmrl (pytorch, ★ 687)
- github.com/BY571/Soft-Actor-Critic-and-Extensions (pytorch, ★ 295)
- github.com/dasgringuen/assetto_corsa_gym (pytorch, ★ 179)
- github.com/learn-to-race/l2r (none, ★ 174)
- github.com/polixir/NeoRL (none, ★ 133)
- github.com/FOCAL-ICLR/FOCAL-ICLR (pytorch, ★ 55)
- github.com/toshikwa/discor.pytorch (pytorch, ★ 38)
- github.com/ku2482/discor.pytorch (pytorch, ★ 38)
Abstract
A platform for Applied Reinforcement Learning (Applied RL)
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Ant-v4 | SAC | Average Return | 5,208.09 | — | Unverified |
| HalfCheetah-v4 | SAC | Average Return | 15,836.04 | — | Unverified |
| Hopper-v4 | SAC | Average Return | 2,882.56 | — | Unverified |
| Humanoid-v4 | SAC | Average Return | 6,211.5 | — | Unverified |
| Walker2d-v4 | SAC | Average Return | 5,745.27 | — | Unverified |