| Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Oct 25, 2021 | Benchmarking | Code Available | 0 |
| Multitask learning and benchmarking with clinical time series data | Jun 17, 2019 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 |
| IdeaBench: Benchmarking Large Language Models for Research Idea Generation | Oct 31, 2024 | Benchmarking, scientific discovery | Code Available | 0 |
| IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification | Mar 22, 2025 | Benchmarking, Classification | Code Available | 0 |
| BioFors: A Large Biomedical Image Forensics Dataset | Aug 30, 2021 | Benchmarking, Image Forensics | Code Available | 0 |
| Benchmarking Attribution Methods with Relative Feature Importance | Jul 23, 2019 | Benchmarking, Feature Importance | Code Available | 0 |
| HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Feb 25, 2024 | Benchmarking, Chatbot | Code Available | 0 |
| Hyperspectral Image Dataset for Benchmarking on Salient Object Detection | Jun 29, 2018 | Benchmarking, Object | Code Available | 0 |
| Long-Term Visitation Value for Deep Exploration in Sparse Reward Reinforcement Learning | Jan 1, 2020 | Benchmarking, reinforcement-learning | Code Available | 0 |
| Look Across Elapse: Disentangled Representation Learning and Photorealistic Cross-Age Face Synthesis for Age-Invariant Face Recognition | Sep 2, 2018 | Age-Invariant Face Recognition, Benchmarking | Code Available | 0 |