γ-FedHT: Stepsize-Aware Hard-Threshold Gradient Compression in Federated Learning
Rongwei Lu, Yutong Jiang, Jinrui Zhang, Chunyang Li, Yifei Zhu, Bin Chen, Zhi Wang
Abstract
Gradient compression can effectively alleviate communication bottlenecks in Federated Learning (FL). Contemporary state-of-the-art sparse compressors, such as Top-k, exhibit high computational complexity, up to O(d log₂ k), where d is the number of model parameters. The hard-threshold compressor, which simply transmits elements with absolute values above a fixed threshold, is thus proposed to reduce the complexity to O(d). However, hard-threshold compression causes accuracy degradation in FL, where the datasets are non-IID and the stepsize is decayed for model convergence. The decaying stepsize shrinks the updates and causes the compression ratio of hard-threshold compression to drop rapidly to an aggressive ratio, at or below which the model accuracy degrades severely. To address this, we propose γ-FedHT, a stepsize-aware, low-cost compressor with Error-Feedback to guarantee convergence. Given that the traditional theoretical framework of FL does not consider Error-Feedback, we introduce the fundamental conversion of Error-Feedback into this framework. We prove that γ-FedHT attains a convergence rate of O(1/T) (with T the total number of training iterations) in μ-strongly convex cases and O(1/√T) in non-convex cases, matching FedAVG. Extensive experiments demonstrate that γ-FedHT improves accuracy by up to 7.42% over Top-k under equal communication traffic on various non-IID image datasets.
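The abstract describes two mechanisms: hard-threshold sparsification with Error-Feedback, and a threshold that tracks the decaying stepsize so the compression ratio does not collapse. Below is a minimal Python sketch of these ideas, not the authors' implementation; the names (base_threshold, gamma_t, gamma_0) and the proportional threshold scaling are illustrative assumptions.

```python
import numpy as np

def hard_threshold(grad: np.ndarray, threshold: float) -> np.ndarray:
    """Keep only entries with absolute value above the threshold (a single O(d) pass)."""
    return np.where(np.abs(grad) > threshold, grad, 0.0)

def compress_with_error_feedback(grad: np.ndarray, error: np.ndarray, threshold: float):
    """Error-Feedback: add the residual left over from the previous round before compressing."""
    corrected = grad + error
    compressed = hard_threshold(corrected, threshold)
    new_error = corrected - compressed  # residual carried into the next round
    return compressed, new_error

def stepsize_aware_threshold(base_threshold: float, gamma_t: float, gamma_0: float) -> float:
    """Illustrative stepsize-aware rule: shrink the threshold with the stepsize gamma_t
    so the fraction of transmitted entries stays roughly stable as gamma_t decays."""
    return base_threshold * (gamma_t / gamma_0)
```

A fixed threshold filters out most coordinates once the stepsize (and hence the update magnitude) has decayed, which is the accuracy-degradation scenario the abstract points to; tying the threshold to the stepsize, as sketched above, is one way to keep the effective compression ratio from becoming too aggressive.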