| BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models | Jun 2, 2023 | BenchmarkingLanguage Acquisition | CodeCode Available | 1 | 5 |
| EMGBench: Benchmarking Out-of-Distribution Generalization and Adaptation for Electromyography | Oct 31, 2024 | BenchmarkingElectromyography (EMG) | CodeCode Available | 1 | 5 |
| Delving into Out-of-Distribution Detection with Medical Vision-Language Models | Mar 2, 2025 | Benchmarkingimage-classification | CodeCode Available | 1 | 5 |
| EMPOT: partial alignment of density maps and rigid body fitting using unbalanced Gromov-Wasserstein divergence | Nov 1, 2023 | BenchmarkingCryogenic Electron Microscopy (cryo-EM) | CodeCode Available | 1 | 5 |
| End-to-end Emotion-Cause Pair Extraction via Learning to Link | Feb 25, 2020 | BenchmarkingEmotion Cause Extraction | CodeCode Available | 1 | 5 |
| DependEval: Benchmarking LLMs for Repository Dependency Understanding | Mar 9, 2025 | BenchmarkingCode Generation | CodeCode Available | 1 | 5 |
| Bag of Tricks for Adversarial Training | Oct 1, 2020 | Adversarial RobustnessBenchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Language Model Creativity: A Case Study on Code Generation | Jul 12, 2024 | BenchmarkingCode Generation | CodeCode Available | 1 | 5 |
| A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain | Jun 1, 2022 | BenchmarkingEmotion Recognition | CodeCode Available | 1 | 5 |
| Benchmarking Language Models for Code Syntax Understanding | Oct 26, 2022 | Benchmarking | CodeCode Available | 1 | 5 |