| Benchmarking Clinical Decision Support Search | Jan 29, 2018 | Articles, Benchmarking | —Unverified | 0 |
| Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models | Feb 17, 2025 | Benchmarking | —Unverified | 0 |
| Benchmarking Classical, Deep, and Generative Models for Human Activity Recognition | Jan 14, 2025 | Activity Recognition, Benchmarking | —Unverified | 0 |
| An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis | Dec 8, 2023 | Benchmarking, Quantization | —Unverified | 0 |
| Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies | Mar 10, 2025 | Benchmarking, Ethics | —Unverified | 0 |
| A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection | Jun 5, 2024 | Anomaly Detection, Benchmarking | —Unverified | 0 |
| ABOUT ML: Annotation and Benchmarking on Understanding and Transparency of Machine Learning Lifecycles | Dec 12, 2019 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |
| Demographic Parity: Mitigating Biases in Real-World Data | Sep 27, 2023 | Benchmarking | —Unverified | 0 |
| CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs | Sep 9, 2024 | Benchmarking, knowledge editing | —Unverified | 0 |
| A New Stereo Benchmarking Dataset for Satellite Images | Jul 9, 2019 | Benchmarking | —Unverified | 0 |