| Benchmarking Vision-Language Models on Optical Character Recognition in Dynamic Video Environments | Feb 10, 2025 | BenchmarkingOptical Character Recognition | CodeCode Available | 1 | 5 |
| A Critical Assessment of State-of-the-Art in Entity Alignment | Oct 30, 2020 | BenchmarkingEntity Alignment | CodeCode Available | 1 | 5 |
| BEND: Benchmarking DNA Language Models on biologically meaningful tasks | Nov 21, 2023 | BenchmarkingLanguage Modeling | CodeCode Available | 1 | 5 |
| EntQA: Entity Linking as Question Answering | Oct 5, 2021 | BenchmarkingEntity Linking | CodeCode Available | 1 | 5 |
| ClinicRealm: Re-evaluating Large Language Models with Conventional Machine Learning for Non-Generative Clinical Prediction Tasks | Jul 26, 2024 | BenchmarkingModel Selection | CodeCode Available | 1 | 5 |
| Is Multi-Hop Reasoning Really Explainable? Towards Benchmarking Reasoning Interpretability | Apr 14, 2021 | BenchmarkingLink Prediction | CodeCode Available | 1 | 5 |
| ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems | Mar 19, 2024 | Benchmarkingfeature selection | CodeCode Available | 1 | 5 |
| ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition | Oct 24, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 1 | 5 |
| BenchML: an extensible pipelining framework for benchmarking representations of materials and molecules at scale | Dec 4, 2021 | BenchmarkingHyperparameter Optimization | CodeCode Available | 1 | 5 |
| AQuA: A Benchmarking Tool for Label Quality Assessment | Jun 15, 2023 | BenchmarkingLabel Error Detection | CodeCode Available | 1 | 5 |