
Hardware-Aware Network Transformation

2021-09-29

Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat


Abstract

In this paper, we tackle the problem of network acceleration by proposing hardware-aware network transformation (HANT), an approach that builds on neural architecture search techniques and teacher-student distillation. HANT consists of two phases: in the first phase, it trains many alternative operations for every layer of the teacher network using layer-wise feature map distillation. In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization approach. In extensive experiments, we show that HANT can successfully accelerate three different families of network architectures (EfficientNetV1, EfficientNetV2, and ResNeSt) on two different target hardware platforms with minimal loss of accuracy. For example, HANT accelerates EfficientNetV1-B6 by 3.6× with a <0.4% drop in top-1 accuracy on ImageNet. At the same latency level, HANT can accelerate EfficientNetV1-B4 to the latency of EfficientNetV1-B1 while achieving 3% higher accuracy. We also show that applying HANT to EfficientNetV1 results in the automated discovery of the same (qualitative) architecture modifications later incorporated in EfficientNetV2. Finally, HANT’s efficient search allows us to examine a large pool of 197 operations per layer, yielding new insights into the accuracy-latency tradeoffs of different operations.
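To make the second phase concrete, here is a minimal, hypothetical sketch of the per-layer operation selection: each layer has several candidate operations, each with a quality-loss proxy and a measured latency, and we pick one op per layer to minimize total loss under a latency budget. The paper solves this as a constrained integer linear program; for illustration, a small exhaustive search over toy numbers stands in for the ILP solver (candidate names and values below are invented, not from the paper).

```python
# Simplified stand-in (assumption: not the authors' code) for HANT's
# phase-2 constrained selection. One op is chosen per layer to minimize
# the summed quality-loss proxy subject to a total latency budget.
from itertools import product

# Hypothetical per-layer candidates: (name, quality_loss, latency_ms)
candidates = [
    [("teacher", 0.0, 5.0), ("depthwise", 0.2, 2.0), ("identity", 1.0, 0.1)],
    [("teacher", 0.0, 8.0), ("fused_conv", 0.1, 4.0), ("identity", 2.0, 0.1)],
]

def select(candidates, latency_budget):
    """Exhaustively pick one op per layer minimizing total loss
    while keeping total latency within the budget."""
    best = None
    for combo in product(*candidates):
        latency = sum(op[2] for op in combo)
        loss = sum(op[1] for op in combo)
        if latency <= latency_budget and (best is None or loss < best[0]):
            best = (loss, latency, [op[0] for op in combo])
    return best

loss, latency, ops = select(candidates, latency_budget=7.0)
print(ops)  # the cheapest feasible combination keeping the teacher quality closest
```

Exhaustive search is exponential in depth, which is why the paper formulates the selection as an integer linear program: the objective and the latency constraint are both sums over independent per-layer choices, so off-the-shelf ILP solvers scale it to deep networks and large candidate pools (197 ops per layer).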
