| The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests | Sep 22, 2024 | Benchmarking | —Unverified | 0 |
| The ACL RD-TEC: A Dataset for Benchmarking Terminology Extraction and Classification in Computational Linguistics | Aug 1, 2014 | Benchmarking, General Classification | —Unverified | 0 |
| The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking | Apr 22, 2024 | Benchmarking, Misinformation | —Unverified | 0 |
| The Algonauts Project: A Platform for Communication between the Sciences of Biological and Artificial Intelligence | May 14, 2019 | Benchmarking, speech-recognition | —Unverified | 0 |
| Language Models as a Service: Overview of a New Paradigm and its Challenges | Sep 28, 2023 | Benchmarking | —Unverified | 0 |
| The Benchmark Lottery | Jul 14, 2021 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |
| The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI | Jun 1, 2023 | Benchmarking, Brain Tumor Segmentation | —Unverified | 0 |
| The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal | Sep 12, 2024 | Benchmarking, Language Modeling | —Unverified | 0 |
| The Convergent Ethics of AI? Analyzing Moral Foundation Priorities in Large Language Models with a Multi-Framework Approach | Apr 27, 2025 | Benchmarking, Decision Making | —Unverified | 0 |
| The Curious Case of Integrator Reach Sets, Part I: Basic Theory | Feb 23, 2021 | Benchmarking | —Unverified | 0 |