| Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions? | May 7, 2025 | BenchmarkingSemantic Segmentation | CodeCode Available | 0 | 5 |
| DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations | Jan 24, 2022 | BenchmarkingDrug Discovery | CodeCode Available | 0 | 5 |
| Benchmarking Learning Efficiency in Deep Reservoir Computing | Sep 29, 2022 | Benchmarking | CodeCode Available | 0 | 5 |
| Flexible Generation of Preference Data for Recommendation Analysis | Jul 23, 2024 | BenchmarkingRecommendation Systems | CodeCode Available | 0 | 5 |
| Benchmarking Neural Machine Translation for Southern African Languages | Jun 17, 2019 | BenchmarkingMachine Translation | CodeCode Available | 0 | 5 |
| IOLBENCH: Benchmarking LLMs on Linguistic Reasoning | Jan 8, 2025 | Benchmarking | CodeCode Available | 0 | 5 |
| Geological Inference from Textual Data using Word Embeddings | Apr 10, 2025 | BenchmarkingWord Embeddings | CodeCode Available | 0 | 5 |
| Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider | Apr 26, 2025 | BenchmarkingGPU | CodeCode Available | 0 | 5 |
| DQI: Measuring Data Quality in NLP | May 2, 2020 | Active LearningBenchmarking | CodeCode Available | 0 | 5 |
| Benchmarking Large Vision-Language Models on Fine-Grained Image Tasks: A Comprehensive Evaluation | Apr 21, 2025 | Benchmarking | CodeCode Available | 0 | 5 |