Efficient Neural Architecture Search via Parameter Sharing
Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean
Code
- github.com/DataCanvasIO/Hypernets (tf, ★ 265)
- github.com/guoyongcs/NAT (pytorch, ★ 58)
- github.com/MengTianjian/enas-pytorch (pytorch, ★ 48)
- github.com/guoyongcs/NATv2 (pytorch, ★ 23)
- github.com/rutgerswiselab/autolossgen (pytorch, ★ 22)
- github.com/Ezereal/enas (tf, ★ 1)
- github.com/kaileymonn/Quantized-ENAS-ConvNets (tf, ★ 0)
- github.com/zbyte64/pytorch-dagsearch (pytorch, ★ 0)
- github.com/yashkant/ENAS-Quantized-Neural-Networks (tf, ★ 0)
- github.com/countif/enas_nni (tf, ★ 0)
Abstract
We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach to automatic model design. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set; meanwhile, the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss. Thanks to parameter sharing between child models, ENAS is fast: it delivers strong empirical performance using far fewer GPU hours than all existing automatic model design approaches, and is notably 1000x less expensive than standard Neural Architecture Search. On the Penn Treebank dataset, ENAS discovers a novel architecture that achieves a test perplexity of 55.8, establishing a new state of the art among methods without post-training processing. On the CIFAR-10 dataset, ENAS designs novel architectures that achieve a test error of 2.89%, on par with NASNet (Zoph et al., 2018), whose test error is 2.65%.
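The alternating training described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it assumes a toy classification task in place of CIFAR-10 or Penn Treebank, a two-layer "supergraph" with three shared candidate operations per layer, and per-decision categorical logits standing in for the paper's LSTM controller. Phase 1 updates the shared child parameters by cross entropy on a sampled subgraph; phase 2 updates the controller with REINFORCE using a validation-style reward and a moving-average baseline.

```python
# Minimal ENAS-style parameter-sharing sketch (illustrative only; names,
# sizes, and the toy data are assumptions, not the paper's setup).
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_LAYERS, NUM_OPS, HIDDEN, NUM_CLASSES = 2, 3, 32, 10

class SharedChild(nn.Module):
    """All candidate ops are created once; every sampled subgraph reuses them."""
    def __init__(self):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.ModuleList([nn.Linear(HIDDEN, HIDDEN) for _ in range(NUM_OPS)])
            for _ in range(NUM_LAYERS)
        ])
        self.head = nn.Linear(HIDDEN, NUM_CLASSES)

    def forward(self, x, arch):
        # arch: one op index per layer, i.e. the sampled subgraph
        for layer, op_idx in enumerate(arch):
            x = F.relu(self.ops[layer][op_idx](x))
        return self.head(x)

class Controller(nn.Module):
    """Independent categorical logits per decision (the paper uses an LSTM)."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(NUM_LAYERS, NUM_OPS))

    def sample(self):
        dist = torch.distributions.Categorical(logits=self.logits)
        arch = dist.sample()
        return arch.tolist(), dist.log_prob(arch).sum()

def batch(n=64):
    # Toy data in place of a real training/validation set.
    x = torch.randn(n, HIDDEN)
    return x, (x.sum(dim=1) > 0).long()

child, controller = SharedChild(), Controller()
child_opt = torch.optim.SGD(child.parameters(), lr=0.05)
ctrl_opt = torch.optim.Adam(controller.parameters(), lr=3e-3)
baseline = 0.0

for step in range(200):
    # Phase 1: train the shared parameters of a sampled subgraph (cross entropy).
    arch, _ = controller.sample()
    x, y = batch()
    loss = F.cross_entropy(child(x, arch), y)
    child_opt.zero_grad(); loss.backward(); child_opt.step()

    # Phase 2: train the controller with REINFORCE on a validation-style reward.
    arch, log_prob = controller.sample()
    xv, yv = batch()
    with torch.no_grad():
        reward = (child(xv, arch).argmax(dim=1) == yv).float().mean()
    baseline = 0.95 * baseline + 0.05 * reward.item()  # moving-average baseline
    ctrl_loss = -(reward.item() - baseline) * log_prob
    ctrl_opt.zero_grad(); ctrl_loss.backward(); ctrl_opt.step()
```

Because the child parameters are shared across all sampled subgraphs, no architecture is ever trained from scratch during the search, which is the source of the speedup claimed in the abstract.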
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Penn Treebank (Word Level) | Efficient NAS | Test perplexity | 58.6 | — | Unverified |