| Evaluating AI Recruitment Sourcing Tools by Human Preference | Apr 3, 2025 | Benchmarking | Code Available | 0 |
| EvalAI: Towards Better Evaluation Systems for AI Agents | Feb 10, 2019 | Benchmarking, BIG-bench Machine Learning | Code Available | 0 |
| Essential guidelines for computational method benchmarking | Dec 3, 2018 | Benchmarking | Code Available | 0 |
| Benchmarking of LSTM Networks | Aug 11, 2015 | Benchmarking | Code Available | 0 |
| NerveNet: Learning Structured Policy with Graph Neural Networks | Jan 1, 2018 | Benchmarking, continuous-control | Code Available | 0 |
| How Fragile is Relation Extraction under Entity Replacements? | May 22, 2023 | Benchmarking, Causal Inference | Code Available | 0 |
| Benchmarking Network Embedding Models for Link Prediction: Are We Making Progress? | Feb 25, 2020 | Benchmarking, Link Prediction | Code Available | 0 |
| Sequence-Aware Recommender Systems | Feb 23, 2018 | Benchmarking, Matrix Completion | Code Available | 0 |
| WCEbleedGen: A wireless capsule endoscopy dataset and its benchmarking for automatic bleeding classification, detection, and segmentation | Aug 22, 2024 | Benchmarking, Classification | Code Available | 0 |
| Enterprise Benchmarks for Large Language Model Evaluation | Oct 11, 2024 | Benchmarking, Language Model Evaluation | Code Available | 0 |