| MiLiC-Eval: Benchmarking Multilingual LLMs for China's Minority Languages | Mar 3, 2025 | Benchmarking | Code Available | 0 |
| The LOCATA Challenge: Acoustic Source Localization and Tracking | Sep 3, 2019 | Benchmarking, Sound Source Localization | Code Available | 0 |
| Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider | Apr 26, 2025 | Benchmarking, GPU | Code Available | 0 |
| A Meta-Analysis of the Anomaly Detection Problem | Mar 3, 2015 | Anomaly Detection, Benchmarking | Code Available | 0 |
| On the Measure of Intelligence | Nov 5, 2019 | ARC, Benchmarking | Code Available | 0 |
| Generalization and Regularization in DQN | Sep 29, 2018 | Atari Games, Benchmarking | Code Available | 0 |
| Automatic Resolution of Domain Name Disputes | Nov 1, 2021 | Benchmarking | Code Available | 0 |
| Mind the XAI Gap: A Human-Centered LLM Framework for Democratizing Explainable AI | Jun 13, 2025 | Benchmarking, In-Context Learning | Code Available | 0 |
| Automatic benchmarking of large multimodal models via iterative experiment programming | Jun 18, 2024 | Benchmarking, Language Modeling | Code Available | 0 |
| GenderBench: Evaluation Suite for Gender Biases in LLMs | May 17, 2025 | Benchmarking | Code Available | 0 |