| DACSA: A large-scale Dataset for Automatic summarization of Catalan and Spanish newspaper Articles | Jul 1, 2022 | Abstractive Text Summarization, Articles | —Unverified | 0 |
| DailyQA: A Benchmark to Evaluate Web Retrieval Augmented LLMs Based on Capturing Real-World Changes | May 22, 2025 | Benchmarking, RAG | —Unverified | 0 |
| Benchmarking and Improving Generator-Validator Consistency of Language Models | Oct 3, 2023 | Benchmarking, Instruction Following | —Unverified | 0 |
| Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place Recognition and Localization | Feb 3, 2022 | 3D Reconstruction, Benchmarking | —Unverified | 0 |
| DarkBench: Benchmarking Dark Patterns in Large Language Models | Mar 13, 2025 | Benchmarking | —Unverified | 0 |
| DASB -- Discrete Audio and Speech Benchmark | Jun 20, 2024 | Benchmarking, Emotion Recognition | —Unverified | 0 |
| Data Analysis in the Era of Generative AI | Sep 27, 2024 | Benchmarking | —Unverified | 0 |
| Data and its (dis)contents: A survey of dataset development and use in machine learning research | Dec 9, 2020 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |
| Data Augmentation for Continual RL via Adversarial Gradient Episodic Memory | Aug 24, 2024 | Benchmarking, Data Augmentation | —Unverified | 0 |
| Certifying almost all quantum states with few single-qubit measurements | Apr 10, 2024 | All, Benchmarking | —Unverified | 0 |