Mitigating Data Absence in Federated Learning Using Privacy-Controllable Data Digests
Chih-Fan Hsu, Ming-Ching Chang, Wei-Chao Chen
Abstract
In federated learning (FL), absent training data and the resulting changes in data distribution can significantly undermine model performance, especially in cross-silo scenarios. To address this challenge, we introduce the Federated Learning with Data Digest (FedDig) framework. FedDig manages unexpected distribution changes using a novel privacy-controllable data digest representation. The framework lets FL users adjust the protection level of the digest through hyperparameters that control how multiple low-dimensional features are mixed and how much differential privacy perturbation is applied to the mixed features. Evaluated on four diverse public datasets, FedDig consistently outperforms five baseline algorithms by substantial margins in various data absence scenarios. We also thoroughly explore FedDig's hyperparameters, demonstrating its adaptability. Notably, FedDig's plugin design is inherently extensible and compatible with existing FL algorithms.
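The abstract describes a digest as a mix of several low-dimensional features with a privacy perturbation applied on top. The following is a minimal sketch of that general idea, not the authors' implementation. The function name `make_digests` and the parameters `samples_per_digest` and `noise_scale` are hypothetical; mixing by plain averaging and perturbing with Gaussian noise are assumptions made here for illustration. The noise is not calibrated to any specific differential privacy budget, which would additionally require bounding each feature's sensitivity.

```python
import random
from typing import List, Sequence


def make_digests(
    features: Sequence[Sequence[float]],
    samples_per_digest: int,
    noise_scale: float,
    seed: int = 0,
) -> List[List[float]]:
    """Mix groups of feature vectors and perturb each mixture.

    Hypothetical illustration: averaging and Gaussian noise are
    assumptions, not details taken from the paper.
    """
    rng = random.Random(seed)

    # Shuffle indices so each digest mixes a random group of samples.
    order = list(range(len(features)))
    rng.shuffle(order)

    digests: List[List[float]] = []
    # Only full groups are used; leftover samples are dropped.
    last_start = len(order) - samples_per_digest
    for start in range(0, last_start + 1, samples_per_digest):
        group = [features[i] for i in order[start:start + samples_per_digest]]
        dim = len(group[0])

        # Mixing step: element-wise mean over the group.
        mixed = [sum(vec[d] for vec in group) / samples_per_digest
                 for d in range(dim)]

        # Perturbation step: add zero-mean Gaussian noise per dimension.
        noisy = [value + rng.gauss(0.0, noise_scale) for value in mixed]
        digests.append(noisy)
    return digests


if __name__ == "__main__":
    toy_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    for digest in make_digests(toy_features, samples_per_digest=2, noise_scale=0.1):
        print(digest)
```

In this sketch, raising `samples_per_digest` blends more samples into each digest and raising `noise_scale` adds stronger perturbation, so both act as knobs that trade fidelity for protection.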