| EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding | May 26, 2025 | Benchmarking | —Unverified | 0 |
| Europarl-ASR: A Large Corpus of Parliamentary Debates for Streaming ASR Benchmarking and Speech Data Filtering/Verbatimization | Aug 30, 2021 | Benchmarking, Data Augmentation | —Unverified | 0 |
| Evalita-LLM: Benchmarking Large Language Models on Italian | Feb 4, 2025 | Benchmarking, Multiple-choice | —Unverified | 0 |
| Evaluating and Benchmarking Foundation Models for Earth Observation and Geospatial AI | Jun 26, 2024 | Benchmarking, Crop Type Mapping | —Unverified | 0 |
| Evaluating Cultural and Social Awareness of LLM Web Agents | Oct 30, 2024 | Benchmarking, Navigate | —Unverified | 0 |
| Evaluating Deep Clustering Algorithms on Non-Categorical 3D CAD Models | Apr 29, 2024 | Benchmarking, Clustering | —Unverified | 0 |
| Evaluating Financial Sentiment Analysis with Annotators Instruction Assisted Prompting: Enhancing Contextual Interpretation and Stock Prediction Accuracy | May 9, 2025 | Benchmarking, Sentiment Analysis | —Unverified | 0 |
| Evaluating Generative AI-Enhanced Content: A Conceptual Framework Using Qualitative, Quantitative, and Mixed-Methods Approaches | Nov 26, 2024 | Benchmarking | —Unverified | 0 |
| Evaluating Generative Models for Tabular Data: Novel Metrics and Benchmarking | Apr 29, 2025 | Benchmarking, Intrusion Detection | —Unverified | 0 |
| Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study | Aug 26, 2024 | 8k, Benchmarking | —Unverified | 0 |