| Knowledge Enhanced Conditional Imputation for Healthcare Time-series | Dec 27, 2023 | Benchmarking, Imputation | Code Available | 0 | 5 |
| SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios | Mar 8, 2025 | Benchmarking, Diagnostic | Code Available | 0 | 5 |
| Towards Enhancing Fault Tolerance in Neural Networks | Jul 6, 2019 | Benchmarking | Code Available | 0 | 5 |
| KhabarChin: Automatic Detection of Important News in the Persian Language | Dec 6, 2023 | Articles, Benchmarking | Code Available | 0 | 5 |
| AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge | Dec 18, 2024 | Benchmarking, World Knowledge | Code Available | 0 | 5 |
| Ants can orienteer a thief in their robbery | Apr 15, 2020 | Benchmarking, Combinatorial Optimization | Code Available | 0 | 5 |
| Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals | Jun 7, 2023 | Benchmarking, Machine Reading Comprehension | Code Available | 0 | 5 |
| Benchmarking Educational Program Repair | May 8, 2024 | Benchmarking, Program Repair | Code Available | 0 | 5 |
| ANTHROPOS-V: benchmarking the novel task of Crowd Volume Estimation | Jan 3, 2025 | Benchmarking, Crowd Counting | Code Available | 0 | 5 |
| Adversarial Environment Generation for Learning to Navigate the Web | Mar 2, 2021 | Benchmarking, Decision Making | Code Available | 0 | 5 |