Self-Imitation Learning

2018-06-14 · ICML 2018 · Code Available

Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee

Abstract

This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent's past good decisions. This algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration. Our empirical results show that SIL significantly improves advantage actor-critic (A2C) on several hard exploration Atari games and is competitive with state-of-the-art count-based exploration methods. We also show that SIL improves proximal policy optimization (PPO) on MuJoCo tasks.
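
To make "reproducing past good decisions" concrete, the following is a minimal sketch of a self-imitation loss over a batch of stored transitions. It assumes the clipped-advantage form, in which a stored transition contributes only when its observed return exceeds the current value estimate. The function name `sil_losses`, its argument names, and the default `value_coef` are illustrative assumptions, not taken from this page or from the authors' released code.

```python
def sil_losses(log_probs, values, returns, value_coef=0.01):
    """Sketch of self-imitation losses for a batch of replayed transitions.

    log_probs:  log pi(a|s) of the stored actions under the current policy
    values:     current value estimates V(s) for the stored states
    returns:    discounted returns R observed when the transitions were collected
    value_coef: hypothetical weight on the value term (an assumption)
    """
    # Clipped advantage: max(R - V(s), 0). Transitions that did no better
    # than the current value estimate contribute nothing.
    clipped = [max(r - v, 0.0) for r, v in zip(returns, values)]
    n = len(clipped)

    # Policy term: raise the log-probability of actions that led to
    # better-than-expected returns. In an autodiff framework the clipped
    # advantage would be treated as a constant here.
    policy_loss = sum(-lp * a for lp, a in zip(log_probs, clipped)) / n

    # Value term: pull V(s) up toward the better past return.
    value_loss = sum(0.5 * a * a for a in clipped) / n

    total = policy_loss + value_coef * value_loss
    return total, policy_loss, value_loss


if __name__ == "__main__":
    # First transition beat its value estimate, second did not.
    print(sil_losses(log_probs=[-1.0, -2.0], values=[0.5, 2.0], returns=[1.5, 1.0]))
```

In this sketch the second transition is ignored entirely because its return (1.0) is below the value estimate (2.0), which is what restricts imitation to "good" past decisions. A full agent would interleave updates like this with the usual on-policy A2C or PPO updates and sample transitions from a replay buffer, neither of which is shown here.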

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| Atari 2600 Alien | A2C + SIL | Score | 2,242.2 | | Unverified |
| Atari 2600 Amidar | A2C + SIL | Score | 1,362 | | Unverified |
| Atari 2600 Assault | A2C + SIL | Score | 1,812 | | Unverified |
| Atari 2600 Asterix | A2C + SIL | Score | 17,984.2 | | Unverified |
| Atari 2600 Asteroids | A2C + SIL | Score | 2,259.4 | | Unverified |
| Atari 2600 Atlantis | A2C + SIL | Score | 3,084,781.7 | | Unverified |
| Atari 2600 Bank Heist | A2C + SIL | Score | 1,137.8 | | Unverified |
| Atari 2600 Battle Zone | A2C + SIL | Score | 25,075 | | Unverified |
| Atari 2600 Beam Rider | A2C + SIL | Score | 2,366.2 | | Unverified |
| Atari 2600 Bowling | A2C + SIL | Score | 31.1 | | Unverified |
| Atari 2600 Boxing | A2C + SIL | Score | 99.6 | | Unverified |
| Atari 2600 Breakout | A2C + SIL | Score | 452 | | Unverified |
| Atari 2600 Centipede | A2C + SIL | Score | 7,559.5 | | Unverified |
| Atari 2600 Chopper Command | A2C + SIL | Score | 6,710 | | Unverified |
| Atari 2600 Crazy Climber | A2C + SIL | Score | 130,185.8 | | Unverified |
| Atari 2600 Demon Attack | A2C + SIL | Score | 10,140.5 | | Unverified |
| Atari 2600 Double Dunk | A2C + SIL | Score | 21.5 | | Unverified |
| Atari 2600 Enduro | A2C + SIL | Score | 1,205.1 | | Unverified |
| Atari 2600 Fishing Derby | A2C + SIL | Score | 55.8 | | Unverified |
| Atari 2600 Freeway | A2C + SIL | Score | 32.2 | | Unverified |
| Atari 2600 Frostbite | A2C + SIL | Score | 6,289.8 | | Unverified |
| Atari 2600 Gopher | A2C + SIL | Score | 23,304.2 | | Unverified |
| Atari 2600 Gravitar | A2C + SIL | Score | 1,874.2 | | Unverified |
| Atari 2600 HERO | A2C + SIL | Score | 33,156.7 | | Unverified |
| Atari 2600 Ice Hockey | A2C + SIL | Score | -2.4 | | Unverified |
| Atari 2600 James Bond | A2C + SIL | Score | 310.8 | | Unverified |
| Atari 2600 Kangaroo | A2C + SIL | Score | 2,888.3 | | Unverified |
| Atari 2600 Krull | A2C + SIL | Score | 10,614.6 | | Unverified |
| Atari 2600 Kung-Fu Master | A2C + SIL | Score | 34,449.2 | | Unverified |
| Atari 2600 Montezuma's Revenge | A2C + SIL | Score | 1,100 | | Unverified |
| Atari 2600 Ms. Pacman | A2C + SIL | Score | 4,025.1 | | Unverified |
| Atari 2600 Name This Game | A2C + SIL | Score | 14,958.2 | | Unverified |
| Atari 2600 Pong | A2C + SIL | Score | 20.9 | | Unverified |
| Atari 2600 Private Eye | A2C + SIL | Score | 661.2 | | Unverified |
| Atari 2600 Q*Bert | A2C + SIL | Score | 104,975.6 | | Unverified |
| Atari 2600 River Raid | A2C + SIL | Score | 14,306.1 | | Unverified |
| Atari 2600 Road Runner | A2C + SIL | Score | 57,071.7 | | Unverified |
| Atari 2600 Robotank | A2C + SIL | Score | 10.5 | | Unverified |
| Atari 2600 Seaquest | A2C + SIL | Score | 2,456.5 | | Unverified |
| Atari 2600 Space Invaders | A2C + SIL | Score | 2,951.7 | | Unverified |
| Atari 2600 Star Gunner | A2C + SIL | Score | 31,309.2 | | Unverified |
| Atari 2600 Tennis | A2C + SIL | Score | -17.3 | | Unverified |
| Atari 2600 Time Pilot | A2C + SIL | Score | 10,811.7 | | Unverified |
| Atari 2600 Tutankham | A2C + SIL | Score | 340.5 | | Unverified |
| Atari 2600 Up and Down | A2C + SIL | Score | 53,314.6 | | Unverified |
| Atari 2600 Venture | A2C + SIL | Score | 0 | | Unverified |
| Atari 2600 Video Pinball | A2C + SIL | Score | 461,522.4 | | Unverified |
| Atari 2600 Wizard of Wor | A2C + SIL | Score | 7,088.3 | | Unverified |
| Atari 2600 Zaxxon | A2C + SIL | Score | 9,164.2 | | Unverified |