SOTAVerified

Atari Games

The Atari 2600 Games task (and dataset) involves training an agent to achieve high game scores.

Papers

Showing 551–600 of 625 papers

| Title | Status | Hype |
| --- | --- | --- |
| Is Policy Learning Overrated?: Width-Based Planning and Active Learning for Atari | Code | 0 |
| Utilizing Evolution Strategies to Train Transformers in Reinforcement Learning | Code | 0 |
| When Waiting is not an Option: Learning Options with a Deliberation Cost | Code | 0 |
| Lucid Dreaming for Experience Replay: Refreshing Past States with the Current Policy | Code | 0 |
| Active inference: demystified and compared | Code | 0 |
| Deep Reinforcement Learning with Swin Transformers | Code | 0 |
| Reusing Convolutional Activations from Frame to Frame to Speed up Training and Inference | Code | 0 |
| Deep Reinforcement Learning that Matters | Code | 0 |
| Massively Parallel Methods for Deep Reinforcement Learning | Code | 0 |
| Deep reinforcement learning from human preferences | Code | 0 |
| Deep Reinforcement Learning framework for Autonomous Driving | Code | 0 |
| Deep Reinforcement Learning for General Video Game AI | Code | 0 |
| Revisiting Bellman Errors for Offline Model Selection | Code | 0 |
| Benchmarking Perturbation-based Saliency Maps for Explaining Atari Agents | Code | 0 |
| Mean Actor Critic | Code | 0 |
| Revisiting Prioritized Experience Replay: A Value Perspective | Code | 0 |
| Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning | Code | 0 |
| Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means | Code | 0 |
| Meta-learning how to Share Credit among Macro-Actions | Code | 0 |
| Beating Atari with Natural Language Guided Reinforcement Learning | Code | 0 |
| Revisiting the Softmax Bellman Operator: New Benefits and New Perspective | Code | 0 |
| TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning | Code | 0 |
| MICo: Improved representations via sampling-based state similarity for Markov decision processes | Code | 0 |
| Reward learning from human preferences and demonstrations in Atari | Code | 0 |
| Trial without Error: Towards Safe Reinforcement Learning via Human Intervention | Code | 0 |
| MinAtar: An Atari-Inspired Testbed for Thorough and Reproducible Reinforcement Learning Experiments | Code | 0 |
| Deep Quality-Value (DQV) Learning | Code | 0 |
| RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning | Code | 0 |
| Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari | Code | 0 |
| Deep Policies for Width-Based Planning in Pixel Domains | Code | 0 |
| Model-Based Reinforcement Learning for Atari | Code | 0 |
| RUDDER: Return Decomposition for Delayed Rewards | Code | 0 |
| Safe and Efficient Off-Policy Reinforcement Learning | Code | 0 |
| Assumed Density Filtering Q-learning | Code | 0 |
| Momentum-based Accelerated Q-learning | Code | 0 |
| Safe Option-Critic: Learning Safety in the Option-Critic Architecture | Code | 0 |
| Deep Exploration via Bootstrapped DQN | Code | 0 |
| Temporal Alignment for History Representation in Reinforcement Learning | Code | 0 |
| Temporal Regularization for Markov Decision Process | Code | 0 |
| Deep Attention Recurrent Q-Network | Code | 0 |
| Decision Transformer vs. Decision Mamba: Analysing the Complexity of Sequential Decision Making in Atari Games | Code | 0 |
| Temporal Regularization in Markov Decision Process | Code | 0 |
| Multi-Game Decision Transformers | Code | 0 |
| Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control | Code | 0 |
| Multi-task Deep Reinforcement Learning with PopArt | Code | 0 |
| Crossmodal Attentive Skill Learner | Code | 0 |
| A Robust Quantile Huber Loss With Interpretable Parameter Adjustment In Distributional Reinforcement Learning | Code | 0 |
| CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing | Code | 0 |
| Scalable agent alignment via reward modeling: a research direction | Code | 0 |
| A quantum-classical reinforcement learning model to play Atari games | Code | 0 |

Benchmark Results

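| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GDI-I3(200M frames) | Score | 864 | | Unverified |
| 2 | GDI-H3(200M frames) | Score | 864 | | Unverified |
| 3 | GDI-H3 | Score | 864 | | Unverified |
| 4 | GDI-I3 | Score | 864 | | Unverified |
| 5 | Bootstrapped DQN | Score | 855 | | Unverified |
| 6 | FQF | Score | 854.2 | | Unverified |
| 7 | R2D2 | Score | 837.7 | | Unverified |
| 8 | Ape-X | Score | 800.9 | | Unverified |
| 9 | Agent57 | Score | 790.4 | | Unverified |
| 10 | IMPALA (deep) | Score | 787.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IQN | Score | 34 | | Unverified |
| 2 | QR-DQN-1 | Score | 34 | | Unverified |
| 3 | TRPO-hash | Score | 34 | | Unverified |
| 4 | NoisyNet-Dueling | Score | 34 | | Unverified |
| 5 | GDI-H3 | Score | 34 | | Unverified |
| 6 | GDI-H3(200M frames) | Score | 34 | | Unverified |
| 7 | GDI-I3 | Score | 34 | | Unverified |
| 8 | Go-Explore | Score | 34 | | Unverified |
| 9 | Bootstrapped DQN | Score | 33.9 | | Unverified |
| 10 | C51 noop | Score | 33.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Agent57 | Score | 580,328.14 | | Unverified |
| 2 | QR-DQN-1 | Score | 572,510 | | Unverified |
| 3 | R2D2 | Score | 408,850 | | Unverified |
| 4 | IMPALA (deep) | Score | 351,200.12 | | Unverified |
| 5 | Ape-X | Score | 302,391.3 | | Unverified |
| 6 | A2C + SIL | Score | 104,975.6 | | Unverified |
| 7 | MuZero (Res2 Adam) | Score | 94,906.25 | | Unverified |
| 8 | DreamerV2 | Score | 94,688 | | Unverified |
| 9 | MuZero | Score | 72,276 | | Unverified |
| 10 | DNA | Score | 52,398 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GDI-H3 | Score | 1,000,000 | | Unverified |
| 2 | GDI-H3(200M frames) | Score | 1,000,000 | | Unverified |
| 3 | Agent57 | Score | 999,997.63 | | Unverified |
| 4 | R2D2 | Score | 999,996.7 | | Unverified |
| 5 | MuZero | Score | 999,976.52 | | Unverified |
| 6 | MuZero (Res2 Adam) | Score | 999,659.18 | | Unverified |
| 7 | GDI-I3 | Score | 943,910 | | Unverified |
| 8 | Ape-X | Score | 392,952.3 | | Unverified |
| 9 | C51 noop | Score | 266,434 | | Unverified |
| 10 | Duel noop | Score | 50,254.2 | | Unverified |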