Is normalization indispensable for training deep neural network?

2020-12-01 · NeurIPS 2020 · Code Available

Jie Shao, Kai Hu, Changhu Wang, Xiangyang Xue, Bhiksha Raj


Code

github.com/hukkai/rescaling
Official · In paper · PyTorch · ★ 34

Abstract

Normalization operations are widely used to train deep neural networks, and they can improve both convergence and generalization in most tasks. Theories explaining the effectiveness of normalization, and new forms of normalization, have long been active research topics. To better understand normalization, one question to ask is whether normalization is indispensable for training deep neural networks. In this paper, we study what happens when normalization layers are removed from the network, and show how to train deep neural networks without normalization layers and without performance degradation. Our proposed method can achieve the same or even slightly better performance in a variety of tasks: image classification on ImageNet, object detection and segmentation on MS-COCO, video classification on Kinetics, and machine translation on WMT English-German. Our study may help better understand the role of normalization layers, and the proposed method can be a competitive alternative to them. Code is available.

Tasks

General Classification, Image Classification, Machine Translation, Object Detection, Translation, Video Classification
