
NetHack

Mean in-game score over 1000 episodes with random seeds not seen during training. See https://arxiv.org/abs/2006.13760 (Section 2.4 Evaluation Protocol) for details.
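The metric above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (draw seeds that were not used in training, run one episode per seed, average the final scores), not the evaluation code from the referenced paper. The names `held_out_seeds`, `mean_score`, `run_episode`, and `dummy_episode` are hypothetical. In practice `run_episode` would wrap an agent and the environment, which is assumed here rather than shown.

```python
import random
from statistics import mean
from typing import Callable, Iterable, List, Set


def held_out_seeds(n: int, training_seeds: Iterable[int], rng_seed: int = 0) -> List[int]:
    """Draw n distinct random seeds, none of which appear in training_seeds."""
    seen: Set[int] = set(training_seeds)
    rng = random.Random(rng_seed)
    seeds: List[int] = []
    while len(seeds) < n:
        candidate = rng.randrange(2**31)
        if candidate not in seen:
            seen.add(candidate)  # also prevents duplicates among evaluation seeds
            seeds.append(candidate)
    return seeds


def mean_score(run_episode: Callable[[int], float], seeds: Iterable[int]) -> float:
    """Average the final in-game score returned by run_episode over all seeds."""
    return mean(run_episode(seed) for seed in seeds)


if __name__ == "__main__":
    # Hypothetical stand-in for "play one episode with this seed, return the score".
    def dummy_episode(seed: int) -> float:
        return float(seed % 100)

    eval_seeds = held_out_seeds(1000, training_seeds=range(100))
    print(mean_score(dummy_episode, eval_seeds))
```

Passing the episode runner in as a callable keeps the averaging logic independent of any particular agent or environment API.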

Papers


Showing 11–20 of 28 papers

Title	Date	Tasks	Status	Hype
LuckyMera: a Modular AI Framework for Building Hybrid NetHack Agents	Jul 17, 2023	NetHack	Code Available	1
Katakomba: Tools and Benchmarks for Data-Driven NetHack	Jun 14, 2023	D4RL, NetHack	Code Available	1
Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning	Jul 23, 2022	Inductive Bias, NetHack	Code Available	1
NovelD: A Simple yet Effective Exploration Criterion	Dec 1, 2021	Atari Games, Deep Reinforcement Learning	Code Available	1
SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark	Oct 20, 2021	Grounded language learning, NetHack	Code Available	1
CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents	Oct 19, 2021	NetHack, reinforcement-learning	Code Available	1
BeBold: Exploration Beyond the Boundary of Explored Regions	Dec 15, 2020	Deep Reinforcement Learning, Efficient Exploration	Code Available	1
The NetHack Learning Environment	Jun 24, 2020	NetHack, NetHack Score	Code Available	1
MaestroMotif: Skill Design from Artificial Intelligence Feedback	Dec 11, 2024	Code Generation, Decision Making	Unverified	0
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games	Nov 20, 2024	Benchmarking, NetHack	Unverified	0


No leaderboard results yet.