| Geometric Deep Learning for Structure-Based Drug Design: A Survey | Jun 20, 2023 | BenchmarkingDeep Learning | CodeCode Available | 1 |
| A Comprehensive Study of the Robustness for LiDAR-based 3D Object Detectors against Adversarial Attacks | Dec 20, 2022 | 3D Object DetectionBenchmarking | CodeCode Available | 1 |
| A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Aug 2, 2022 | BenchmarkingSynthetic Data Generation | CodeCode Available | 1 |
| BeHonest: Benchmarking Honesty in Large Language Models | Jun 19, 2024 | BenchmarkingMisinformation | CodeCode Available | 1 |
| Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms | Sep 21, 2022 | 3D human pose and shape estimationBenchmarking | CodeCode Available | 1 |
| AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Nov 1, 2021 | Benchmarkingobject-detection | CodeCode Available | 1 |
| EgoToM: Benchmarking Theory of Mind Reasoning from Egocentric Videos | Mar 28, 2025 | BenchmarkingQuestion Answering | CodeCode Available | 1 |
| CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods | Aug 2, 2022 | BenchmarkingCausal Discovery | CodeCode Available | 1 |
| Bench4KE: Benchmarking Automated Competency Question Generation | May 30, 2025 | BenchmarkingQuestion Generation | CodeCode Available | 1 |
| ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models | Nov 29, 2021 | BenchmarkingPhysical Simulations | CodeCode Available | 1 |