
Model Degradation Hinders Deep Graph Neural Networks

2022-06-09

Wentao Zhang, Zeang Sheng, Ziqi Yin, Yuezihan Jiang, Yikuan Xia, Jun Gao, Zhi Yang, Bin Cui


Abstract

Graph Neural Networks (GNNs) have achieved great success in various graph mining tasks. However, drastic performance degradation is consistently observed when a GNN is stacked with many layers. As a result, most GNNs only have shallow architectures, which limits their expressive power and their exploitation of deep neighborhoods. Most recent studies attribute the performance degradation of deep GNNs to the over-smoothing issue. In this paper, we disentangle the conventional graph convolution operation into two independent operations: Propagation (P) and Transformation (T). The depth of a GNN can then be split into the propagation depth (D_p) and the transformation depth (D_t). Through extensive experiments, we find that the major cause of the performance degradation of deep GNNs is the model degradation issue caused by a large D_t, rather than the over-smoothing issue, which is mainly caused by a large D_p. Further, we present Adaptive Initial Residual (AIR), a plug-and-play module compatible with all kinds of GNN architectures, which alleviates the model degradation and over-smoothing issues simultaneously. Experimental results on six real-world datasets demonstrate that GNNs equipped with AIR outperform most GNNs with shallow architectures, owing to the benefits of both large D_p and large D_t, while the time overhead introduced by AIR is negligible.
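The P/T disentanglement described in the abstract can be illustrated in a few lines of NumPy. The sketch below is an assumption-laden illustration, not the paper's implementation: the function names are made up for this example, and the initial-residual weight `alpha` is held fixed here, whereas AIR learns it adaptively. It shows how propagation depth (`depth_p`, repeated applications of the normalized adjacency) and transformation depth (`depth_t`, the number of weight matrices) become independent knobs once the two operations are separated:

```python
import numpy as np

def normalize_adj(A):
    # Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2,
    # the propagation matrix commonly used in GCN-style models.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def propagate(A_norm, X, depth_p, alpha=0.1):
    # Propagation (P): depth_p smoothing steps, each mixing the initial
    # features back in. NOTE: a fixed scalar alpha is an illustrative
    # simplification; AIR learns this residual weight adaptively.
    X0 = X
    for _ in range(depth_p):
        X = (1.0 - alpha) * (A_norm @ X) + alpha * X0
    return X

def transform(X, weights):
    # Transformation (T): depth_t = len(weights) feature mappings with
    # ReLU nonlinearities, entirely independent of the graph structure.
    for W in weights:
        X = np.maximum(X @ W, 0.0)
    return X
```

In this decoupled view, over-smoothing is governed by `depth_p` alone, while the model degradation the paper identifies is governed by `depth_t` alone, so the two can be studied and remedied independently.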
