| Quality Assured: Rethinking Annotation Strategies in Imaging AI | Jul 24, 2024 | Benchmarking | Unverified | 0 |
| Building a Domain-specific Guardrail Model in Production | Jul 24, 2024 | Benchmarking, Language Modelling | Unverified | 0 |
| Flexible Generation of Preference Data for Recommendation Analysis | Jul 23, 2024 | Benchmarking, Recommendation Systems | Code Available | 0 |
| Can time series forecasting be automated? A benchmark and analysis | Jul 23, 2024 | Benchmarking, Decision Making | Unverified | 0 |
| Aggregated Attributions for Explanatory Analysis of 3D Segmentation Models | Jul 23, 2024 | Benchmarking, Segmentation | Code Available | 0 |
| Hi-EF: Benchmarking Emotion Forecasting in Human-interaction | Jul 23, 2024 | Benchmarking | Code Available | 0 |
| BONES: a Benchmark fOr Neural Estimation of Shapley values | Jul 23, 2024 | Benchmarking | Code Available | 0 |
| StylusAI: Stylistic Adaptation for Robust German Handwritten Text Generation | Jul 22, 2024 | Benchmarking, Text Generation | Unverified | 0 |
| Customized Retrieval Augmented Generation and Benchmarking for EDA Tool Documentation QA | Jul 22, 2024 | Benchmarking, Contrastive Learning | Code Available | 0 |
| Benchmarks as Microscopes: A Call for Model Metrology | Jul 22, 2024 | Benchmarking, model | Unverified | 0 |