| CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification | Jun 18, 2023 | BenchmarkingRetrieval | CodeCode Available | 1 |
| Contemporary Symbolic Regression Methods and their Relative Performance | Jul 29, 2021 | Benchmarkingparameter estimation | CodeCode Available | 1 |
| COVID-19 event extraction from Twitter via extractive question answering with continuous prompts | Mar 19, 2023 | BenchmarkingEvent Extraction | CodeCode Available | 1 |
| Benchmarking Robustness of Machine Reading Comprehension Models | Apr 29, 2020 | BenchmarkingMachine Reading Comprehension | CodeCode Available | 1 |
| Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets | Apr 11, 2022 | Action Triplet RecognitionBenchmarking | CodeCode Available | 1 |
| Benchmarking Robustness to Adversarial Image Obfuscations | Jan 30, 2023 | Benchmarking | CodeCode Available | 1 |
| Benchmarks for Deep Off-Policy Evaluation | Mar 30, 2021 | Benchmarkingcontinuous-control | CodeCode Available | 1 |
| CoDEx: A Comprehensive Knowledge Graph Completion Benchmark | Sep 16, 2020 | BenchmarkingKnowledge Graph Completion | CodeCode Available | 1 |
| Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents | Feb 27, 2025 | Benchmarking | CodeCode Available | 1 |
| CodeS: Natural Language to Code Repository via Multi-Layer Sketch | Mar 25, 2024 | Benchmarking | CodeCode Available | 1 |