| Measuring CLEVRness: Black-box Testing of Visual Reasoning Models | Sep 29, 2021 | Benchmarking, Diagnostic | Unverified | 0 |
| Modelling neuronal behaviour with time series regression: Recurrent Neural Networks on synthetic C. elegans data | Sep 29, 2021 | Benchmarking, regression | Unverified | 0 |
| Benchmarking Algorithms from Machine Learning for Low-Budget Black-Box Optimization | Sep 29, 2021 | Bayesian Optimization, Benchmarking | Unverified | 0 |
| Benchmarking Sample Selection Strategies for Batch Reinforcement Learning | Sep 29, 2021 | Benchmarking, Imitation Learning | Unverified | 0 |
| FastEnsemble: Benchmarking and Accelerating Ensemble-based Uncertainty Estimation for Image-to-Image Translation | Sep 29, 2021 | Benchmarking, Image Generation | Unverified | 0 |
| A Systematic Evaluation of Domain Adaptation Algorithms On Time Series Data | Sep 29, 2021 | Benchmarking, Domain Adaptation | Unverified | 0 |
| Decentralized Learning for Overparameterized Problems: A Multi-Agent Kernel Approximation Approach | Sep 29, 2021 | Benchmarking | Unverified | 0 |
| Benchmarking Machine Learning Robustness in Covid-19 Spike Sequence Classification | Sep 29, 2021 | Benchmarking, BIG-bench Machine Learning | Unverified | 0 |
| MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation | Sep 29, 2021 | Benchmarking, Philosophy | Code Available | 1 |
| "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations | Sep 28, 2021 | Benchmarking, Dialogue State Tracking | Code Available | 1 |