| Temporally Consistent Transformers for Video Generation | Oct 5, 2022 | Minecraft, Video Generation | Code Available | 2 |
| Toward Memory-Aided World Models: Benchmarking via Spatial Consistency | May 29, 2025 | Benchmarking, Minecraft | Code Available | 1 |
| MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents | May 26, 2025 | Benchmarking, Minecraft | Code Available | 1 |
| Plancraft: an evaluation dataset for planning with LLM agents | Dec 30, 2024 | Decision Making, Minecraft | Code Available | 1 |
| TeamCraft: A Benchmark for Multi-Modal Multi-Agent Systems in Minecraft | Dec 6, 2024 | Imitation Learning, Minecraft | Code Available | 1 |
| ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting | Oct 23, 2024 | Decision Making, Minecraft | Code Available | 1 |
| IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents | Jul 12, 2024 | Minecraft | Code Available | 1 |
| Playing NetHack with LLMs: Potential & Limitations as Zero-Shot Agents | Mar 1, 2024 | Decision Making, Minecraft | Code Available | 1 |
| S-Agents: Self-organizing Agents in Open-ended Environments | Feb 7, 2024 | Minecraft | Code Available | 1 |
| ReGAL: Refactoring Programs to Discover Generalizable Abstractions | Jan 29, 2024 | Date Understanding, Math | Code Available | 1 |