| Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation | Apr 29, 2025 | Benchmarking, Fairness | Code Available | 0 | 5 |
| Adaptive Power System Emergency Control using Deep Reinforcement Learning | Mar 9, 2019 | Benchmarking, Deep Reinforcement Learning | Code Available | 0 | 5 |
| BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception | Feb 7, 2024 | Benchmarking | Code Available | 0 | 5 |
| Benchmarking Abstract and Reasoning Abilities Through A Theoretical Perspective | May 28, 2025 | Benchmarking, Memorization | Code Available | 0 | 5 |
| InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion | May 28, 2023 | Benchmarking, Decision Making | Code Available | 0 | 5 |
| Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions | Jul 28, 2017 | Autonomous Vehicles, Benchmarking | Code Available | 0 | 5 |
| IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context | Mar 29, 2024 | Benchmarking, Sentence | Code Available | 0 | 5 |
| BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery | Jan 2, 2025 | Benchmarking, Experimental Design | Code Available | 0 | 5 |
| AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies | Feb 19, 2024 | Benchmarking | Code Available | 0 | 5 |
| Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture | Jun 10, 2024 | Benchmarking, Decoder | Code Available | 0 | 5 |