| CoDEx: A Comprehensive Knowledge Graph Completion Benchmark | Sep 16, 2020 | BenchmarkingKnowledge Graph Completion | CodeCode Available | 1 |
| Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents | Feb 27, 2025 | Benchmarking | CodeCode Available | 1 |
| Benchmarking Reinforcement Learning Techniques for Autonomous Navigation | Oct 10, 2022 | Autonomous NavigationBenchmarking | CodeCode Available | 1 |
| Benchmarking emergency department triage prediction models with machine learning and large public electronic health records | Nov 22, 2021 | Benchmarking | CodeCode Available | 1 |
| 4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs | Apr 28, 2024 | Benchmarking | CodeCode Available | 1 |
| Benchmarking the Performance of Bayesian Optimization across Multiple Experimental Materials Science Domains | May 23, 2021 | Active LearningBayesian Optimisation | CodeCode Available | 1 |
| Graph Neural Network-Based Anomaly Detection for River Network Systems | Apr 19, 2023 | Anomaly DetectionBenchmarking | CodeCode Available | 1 |
| Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning | Nov 8, 2021 | Adversarial RobustnessBenchmarking | CodeCode Available | 1 |
| A framework for benchmarking clustering algorithms | Sep 20, 2022 | BenchmarkingClustering | CodeCode Available | 1 |
| CodeUpdateArena: Benchmarking Knowledge Editing on API Updates | Jul 8, 2024 | Benchmarkingknowledge editing | CodeCode Available | 1 |