| Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception | Jun 10, 2023 | 3D Object Detection, Benchmarking | Code Available | 2 |
| AiTLAS: Artificial Intelligence Toolbox for Earth Observation | Jan 21, 2022 | Benchmarking, Earth Observation | Code Available | 2 |
| Datasets and Benchmarks for Offline Safe Reinforcement Learning | Jun 15, 2023 | Autonomous Driving, Benchmarking | Code Available | 2 |
| Immersive Neural Graphics Primitives | Nov 24, 2022 | Benchmarking, NeRF | Code Available | 2 |
| InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | Mar 5, 2024 | Benchmarking, Language Modeling | Code Available | 2 |
| InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models | Oct 30, 2024 | Benchmarking | Code Available | 2 |
| Deep Visual Geo-localization Benchmark | Apr 7, 2022 | Benchmarking, Data Augmentation | Code Available | 2 |
| A large annotated medical image dataset for the development and evaluation of segmentation algorithms | Feb 25, 2019 | Benchmarking, Segmentation | Code Available | 2 |
| InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback | Jun 26, 2023 | Benchmarking, Code Generation | Code Available | 2 |
| Craftium: An Extensible Framework for Creating Reinforcement Learning Environments | Jul 4, 2024 | Benchmarking, Minecraft | Code Available | 2 |