| NAS-HPO-Bench-II: A Benchmark Dataset on Joint Optimization of Convolutional Neural Network Architecture and Training Hyperparameters | Oct 19, 2021 | 4k, Benchmarking | Code Available | 1 |
| GAN-based disentanglement learning for chest X-ray rib suppression | Oct 18, 2021 | Benchmarking, Computed Tomography (CT) | Unverified | 0 |
| MTG: A Benchmarking Suite for Multilingual Text Generation | Oct 16, 2021 | Benchmarking, Question Generation | Unverified | 0 |
| Benchmarking Biomedical Nested NER and Relation Extraction Models | Oct 16, 2021 | Benchmarking, NER | Unverified | 0 |
| Multitask Prompted Training Enables Zero-Shot Task Generalization | Oct 15, 2021 | Benchmarking, Decoder | Code Available | 2 |
| HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media | Oct 14, 2021 | 3D Pose Estimation, Benchmarking | Code Available | 1 |
| OG-SPACE: Optimized Stochastic Simulation of Spatial Models of Cancer Evolution | Oct 13, 2021 | Benchmarking | Code Available | 0 |
| Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions | Oct 13, 2021 | Benchmarking, Computational Efficiency | Code Available | 1 |
| What can 5.17 billion regression fits tell us about artificial models of the human visual system? | Oct 12, 2021 | Benchmarking | Unverified | 0 |
| Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Oct 12, 2021 | Benchmarking | Unverified | 0 |
| Codabench: Flexible, Easy-to-Use and Reproducible Benchmarking Platform | Oct 12, 2021 | Benchmarking | Code Available | 1 |
| NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks | Oct 12, 2021 | Benchmarking, image-classification | Code Available | 1 |
| S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations | Oct 12, 2021 | Benchmarking, Voice Conversion | Code Available | 1 |
| EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset | Oct 11, 2021 | Benchmarking, Face Hallucination | Code Available | 1 |
| Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking | Oct 11, 2021 | Benchmarking, Question Answering | Code Available | 0 |
| The CaLiGraph Ontology as a Challenge for OWL Reasoners | Oct 11, 2021 | Benchmarking, Knowledge Graphs | Code Available | 0 |
| SCEHR: Supervised Contrastive Learning for Clinical Risk Prediction using Electronic Health Records | Oct 11, 2021 | Benchmarking, Binary Classification | Code Available | 0 |
| Performance Evaluation of Deep Transfer Learning on Multiclass Identification of Common Weed Species in Cotton Production Systems | Oct 11, 2021 | Benchmarking, Management | Code Available | 1 |
| Chaos as an interpretable benchmark for forecasting and data-driven modelling | Oct 11, 2021 | Benchmarking, Symbolic Regression | Code Available | 1 |
| Evolving Evolutionary Algorithms with Patterns | Oct 10, 2021 | Benchmarking, Evolutionary Algorithms | Code Available | 0 |
| Hybrid Random Features | Oct 8, 2021 | Benchmarking | Code Available | 0 |
| Process Extraction from Text: Benchmarking the State of the Art and Paving the Way for Future Challenges | Oct 7, 2021 | Benchmarking, Model extraction | Code Available | 0 |
| Explicitly Multi-Modal Benchmarks for Multi-Objective Optimization | Oct 7, 2021 | Benchmarking | Unverified | 0 |
| SERAB: A multi-lingual benchmark for speech emotion recognition | Oct 7, 2021 | Benchmarking, Emotion Recognition | Code Available | 1 |
| EntQA: Entity Linking as Question Answering | Oct 5, 2021 | Benchmarking, Entity Linking | Code Available | 1 |
| Revisiting Self-Training for Few-Shot Learning of Language Model | Oct 4, 2021 | Benchmarking, Few-Shot Learning | Code Available | 1 |
| Benchmarking Safety Monitors for Image Classifiers with Machine Learning | Oct 4, 2021 | Autonomous Vehicles, Benchmarking | Code Available | 0 |
| A New Approach for Image Authentication Framework for Media Forensics Purpose | Oct 3, 2021 | Astronomy, Benchmarking | Unverified | 0 |
| Machine Learning with Knowledge Constraints for Process Optimization of Open-Air Perovskite Solar Cell Manufacturing | Oct 1, 2021 | Bayesian Optimization, Benchmarking | Code Available | 1 |
| Phonetic Word Embeddings | Sep 30, 2021 | Benchmarking, Word Embeddings | Code Available | 1 |
| A Two-Stage Neural-Filter Pareto Front Extractor and the need for Benchmarking | Sep 29, 2021 | Benchmarking, Multi-Task Learning | Unverified | 0 |
| NAS-Bench-Zero: A Large Scale Dataset for Understanding Zero-Shot Neural Architecture Search | Sep 29, 2021 | Benchmarking, Neural Architecture Search | Unverified | 0 |
| Benchmarking person re-identification approaches and training datasets for practical real-world implementations | Sep 29, 2021 | Benchmarking, Pedestrian Detection | Unverified | 0 |
| Deep Learning of Intrinsically Motivated Options in the Arcade Learning Environment | Sep 29, 2021 | Atari Games, Benchmarking | Unverified | 0 |
| Benchmarking Graph Neural Networks on Dynamic Link Prediction | Sep 29, 2021 | Benchmarking, Dynamic Link Prediction | Code Available | 1 |
| Less is more: Selecting the right benchmarking set of data for time series classification | Sep 29, 2021 | Benchmarking, Time Series | Unverified | 0 |
| Imitation Learning from Pixel Observations for Continuous Control | Sep 29, 2021 | Benchmarking, continuous-control | Unverified | 0 |
| Learning to Schedule Learning rate with Graph Neural Networks | Sep 29, 2021 | Benchmarking, image-classification | Unverified | 0 |
| Best Practices in Pool-based Active Learning for Image Classification | Sep 29, 2021 | Active Learning, Benchmarking | Unverified | 0 |
| Stabilized Self-training with Negative Sampling on Few-labeled Graph Data | Sep 29, 2021 | Benchmarking, Node Classification | Unverified | 0 |
| Measuring CLEVRness: Black-box Testing of Visual Reasoning Models | Sep 29, 2021 | Benchmarking, Diagnostic | Unverified | 0 |
| Modelling neuronal behaviour with time series regression: Recurrent Neural Networks on synthetic C. elegans data | Sep 29, 2021 | Benchmarking, regression | Unverified | 0 |
| Benchmarking Algorithms from Machine Learning for Low-Budget Black-Box Optimization | Sep 29, 2021 | Bayesian Optimization, Benchmarking | Unverified | 0 |
| Benchmarking Sample Selection Strategies for Batch Reinforcement Learning | Sep 29, 2021 | Benchmarking, Imitation Learning | Unverified | 0 |
| FastEnsemble: Benchmarking and Accelerating Ensemble-based Uncertainty Estimation for Image-to-Image Translation | Sep 29, 2021 | Benchmarking, Image Generation | Unverified | 0 |
| A Systematic Evaluation of Domain Adaptation Algorithms On Time Series Data | Sep 29, 2021 | Benchmarking, Domain Adaptation | Unverified | 0 |
| Decentralized Learning for Overparameterized Problems: A Multi-Agent Kernel Approximation Approach | Sep 29, 2021 | Benchmarking | Unverified | 0 |
| Benchmarking Machine Learning Robustness in Covid-19 Spike Sequence Classification | Sep 29, 2021 | Benchmarking, BIG-bench Machine Learning | Unverified | 0 |
| MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation | Sep 29, 2021 | Benchmarking, Philosophy | Code Available | 1 |
| "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations | Sep 28, 2021 | Benchmarking, Dialogue State Tracking | Code Available | 1 |