SOTAVerified

Atari Games

The Atari 2600 Games task (and dataset) involves training an agent to achieve high game scores.

(Image credit: Playing Atari with Deep Reinforcement Learning)

Papers

Showing 501-550 of 625 papers

Title | Status | Hype
Maximizing Ensemble Diversity in Deep Reinforcement Learning | | 0
Maximum Entropy Monte-Carlo Planning | | 0
Measuring Progress in Deep Reinforcement Learning Sample Efficiency | | 0
Learning to Constrain Policy Optimization with Virtual Trust Region | | 0
Metaoptimization on a Distributed System for Deep Reinforcement Learning | | 0
Metatrace Actor-Critic: Online Step-size Tuning by Meta-gradient Descent for Reinforcement Learning Control | | 0
Methodical Advice Collection and Reuse in Deep Reinforcement Learning | | 0
Micro-Objective Learning : Accelerating Deep Reinforcement Learning through the Discovery of Continuous Subgoals | | 0
Mimicking actions is a good strategy for beginners: Fast Reinforcement Learning with Expert Action Sequences | | 0
Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy | | 0
Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play | | 0
Model Based Reinforcement Learning for Atari | | 0
Model-Free Episodic Control with State Aggregation | | 0
Modularization of End-to-End Learning: Case Study in Arcade Games | | 0
Momentum in Reinforcement Learning | | 0
Monte Carlo Tree Search with Scalable Simulation Periods for Continuously Running Tasks | | 0
MTSpark: Enabling Multi-Task Learning with Spiking Neural Networks for Generalist Agents | | 0
Multi-compartment Neuron and Population Encoding Powered Spiking Neural Network for Deep Distributional Reinforcement Learning | | 0
Multiplayer Support for the Arcade Learning Environment | | 0
Natural Value Approximators: Learning when to Trust Past Estimates | | 0
Parallel Exploration via Negatively Correlated Search | | 0
Neural Policy Style Transfer | | 0
Neurohex: A Deep Q-learning Hex Agent | | 0
Noisy Agents: Self-supervised Exploration by Predicting Auditory Events | | 0
Non-Crossing Quantile Regression for Distributional Reinforcement Learning | | 0
Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning | | 0
Non-Robust Feature Mapping in Deep Reinforcement Learning | | 0
No-Regret Exploration in Goal-Oriented Reinforcement Learning | | 0
Normalization and effective learning rates in reinforcement learning | | 0
Nuclear Norm Maximization Based Curiosity-Driven Learning | | 0
Object-sensitive Deep Reinforcement Learning | | 0
Objects matter: object-centric world models improve reinforcement learning in visually complex environments | | 0
Observe and Look Further: Achieving Consistent Performance on Atari | | 0
Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints | | 0
Off-Policy Actor-Critic with Shared Experience Replay | | 0
On Bonus Based Exploration Methods In The Arcade Learning Environment | | 0
On Bonus-Based Exploration Methods in the Arcade Learning Environment | | 0
On Effective Parallelization of Monte Carlo Tree Search | | 0
On Improving Deep Reinforcement Learning for POMDPs | | 0
Online Meta-learning by Parallel Algorithm Competition | | 0
On the Role of Weight Sharing During Deep Option Learning | | 0
Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning | | 0
P4O: Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization | | 0
Perturbation-based exploration methods in deep reinforcement learning | | 0
PG-Rainbow: Using Distributional Reinforcement Learning in Policy Gradient Methods | | 0
Playing Atari Ball Games with Hierarchical Reinforcement Learning | | 0
Policy Gradient For Multidimensional Action Spaces: Action Sampling and Entropy Bonus | | 0
Policy Optimization with Model-based Explorations | | 0
Population-Guided Imitation Learning | | 0
Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GDI-I3(200M frames) | Score | 864 | | Unverified
2 | GDI-H3(200M frames) | Score | 864 | | Unverified
3 | GDI-H3 | Score | 864 | | Unverified
4 | GDI-I3 | Score | 864 | | Unverified
5 | Bootstrapped DQN | Score | 855 | | Unverified
6 | FQF | Score | 854.2 | | Unverified
7 | R2D2 | Score | 837.7 | | Unverified
8 | Ape-X | Score | 800.9 | | Unverified
9 | Agent57 | Score | 790.4 | | Unverified
10 | IMPALA (deep) | Score | 787.34 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | IQN | Score | 34 | | Unverified
2 | QR-DQN-1 | Score | 34 | | Unverified
3 | TRPO-hash | Score | 34 | | Unverified
4 | NoisyNet-Dueling | Score | 34 | | Unverified
5 | GDI-H3 | Score | 34 | | Unverified
6 | GDI-H3(200M frames) | Score | 34 | | Unverified
7 | GDI-I3 | Score | 34 | | Unverified
8 | Go-Explore | Score | 34 | | Unverified
9 | Bootstrapped DQN | Score | 33.9 | | Unverified
10 | C51 noop | Score | 33.9 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Agent57 | Score | 580,328.14 | | Unverified
2 | QR-DQN-1 | Score | 572,510 | | Unverified
3 | R2D2 | Score | 408,850 | | Unverified
4 | IMPALA (deep) | Score | 351,200.12 | | Unverified
5 | Ape-X | Score | 302,391.3 | | Unverified
6 | A2C + SIL | Score | 104,975.6 | | Unverified
7 | MuZero (Res2 Adam) | Score | 94,906.25 | | Unverified
8 | DreamerV2 | Score | 94,688 | | Unverified
9 | MuZero | Score | 72,276 | | Unverified
10 | DNA | Score | 52,398 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GDI-H3 | Score | 1,000,000 | | Unverified
2 | GDI-H3(200M frames) | Score | 1,000,000 | | Unverified
3 | Agent57 | Score | 999,997.63 | | Unverified
4 | R2D2 | Score | 999,996.7 | | Unverified
5 | MuZero | Score | 999,976.52 | | Unverified
6 | MuZero (Res2 Adam) | Score | 999,659.18 | | Unverified
7 | GDI-I3 | Score | 943,910 | | Unverified
8 | Ape-X | Score | 392,952.3 | | Unverified
9 | C51 noop | Score | 266,434 | | Unverified
10 | Duel noop | Score | 50,254.2 | | Unverified