ATM: A distributed, collaborative, scalable system for automated machine learning

2017-12-11 · 2017 IEEE International Conference on Big Data (Big Data) 2017 · Code Available

Thomas Swearingen, Will Drevo, Bennett Cyphers, Alfredo Cuesta-Infante, Arun Ross, Kalyan Veeramachaneni

Code

github.com/HDI-Project/ATM

Abstract

In this paper, we present Auto-Tuned Models, or ATM, a distributed, collaborative, scalable system for automated machine learning. Users of ATM can simply upload a dataset, choose a subset of modeling methods, and choose to use ATM’s hybrid Bayesian and multi-armed bandit optimization system. The distributed system works in a load-balanced fashion to quickly deliver results in the form of ready-to-predict models, confusion matrices, cross-validation results, and training timings. By automating hyperparameter tuning and model selection, ATM returns the emphasis of the machine learning workflow to its most irreducible part: feature engineering. We demonstrate the usefulness of ATM on 420 datasets from OpenML and train over 3 million classifiers. Our initial results show ATM can beat human-generated solutions for 30% of the datasets, and can do so in 1/100th of the time.
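To make the bandit half of the "hybrid Bayesian and multi-armed bandit" idea concrete, here is a minimal sketch of a UCB1 bandit that decides which modeling method to try next. It is an illustration under stated assumptions, not ATM's implementation: the names `ucb1_select`, `run_bandit`, and `evaluate` are hypothetical, and the `evaluate` callback stands in for whatever proposes hyperparameters and returns a cross-validation score (in ATM that role is described as Bayesian; here it is left to the caller).

```python
import math
import random


def ucb1_select(counts, means):
    """Pick the next arm (modeling method) using the UCB1 rule.

    Arms that have never been tried are returned first, so every
    method gets at least one evaluation.
    """
    for arm, n in counts.items():
        if n == 0:
            return arm
    total = sum(counts.values())
    return max(
        counts,
        key=lambda a: means[a] + math.sqrt(2 * math.log(total) / counts[a]),
    )


def run_bandit(evaluate, arms, budget, seed=0):
    """Spend `budget` evaluations across `arms`, favouring high scorers.

    `evaluate(arm, rng)` is a hypothetical callback: it should train one
    model of the given method with some hyperparameters and return a
    score in [0, 1]. How hyperparameters are proposed is up to the caller.
    """
    rng = random.Random(seed)
    counts = {arm: 0 for arm in arms}
    means = {arm: 0.0 for arm in arms}
    for _ in range(budget):
        arm = ucb1_select(counts, means)
        score = evaluate(arm, rng)
        counts[arm] += 1
        # Incremental update of the running mean score for this arm.
        means[arm] += (score - means[arm]) / counts[arm]
    return counts, means
```

Calling `run_bandit` with a list of method names and a scoring callback returns how often each method was tried and its mean score. Methods that score well receive more of the budget, while the exploration bonus still revisits the others occasionally.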

Tasks

AutoML, BIG-bench Machine Learning, Feature Engineering, Hyperparameter Optimization, Model Selection