Freeing Hybrid Distributed AI Training Configuration

2021-08-20 · Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021)

Haoran Wang


Abstract

Deep neural networks (DNNs) have become the leading technology for realizing Artificial Intelligence (AI). As DNN models become larger and more complex, so do their training datasets, and the ability to train DNNs efficiently in parallel has become a crucial need. Data Parallelism (DP) is the most widely used approach to accelerating DNN training today, but it can be inefficient for DNNs with large parameter sizes. Hybrid Parallelism (HP), which applies different parallel strategies to different parts of a DNN, is more efficient but requires advanced configuration. Not all AI researchers are experts in parallel computing, so automating the configuration of HP strategies is highly desirable for all AI frameworks. We propose a parallel semantics analysis method that analyzes the trade-offs among different kinds of parallelism and systematically chooses HP strategies with good training-time performance. We experimentally demonstrate a 260% speedup when applying our method compared to a conventional DP approach. With our proposal, AI researchers can focus on AI algorithm research without being distracted by parallelism analysis and engineering concerns.
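To illustrate the kind of trade-off the abstract describes, the sketch below is a hypothetical, simplified per-layer cost model (not the paper's actual method): it compares an estimate of gradient all-reduce traffic under data parallelism against activation-exchange traffic under model parallelism and picks the cheaper strategy per layer. The `Layer` fields, the `choose_strategy` helper, and the communication estimates are all illustrative assumptions.

```python
# Hypothetical sketch of a per-layer hybrid-parallelism decision rule.
# It is NOT the paper's analysis method; it only illustrates the trade-off
# between parameter-heavy layers (favoring model parallelism) and
# activation-heavy layers (favoring data parallelism).

from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    param_bytes: int        # size of the layer's parameters
    activation_bytes: int   # size of activations crossing the layer boundary


def choose_strategy(layer: Layer, num_devices: int) -> str:
    # Data parallelism replicates parameters, so each step all-reduces the
    # gradients (roughly 2 * param_bytes per device with a ring all-reduce).
    dp_comm = 2 * layer.param_bytes
    # Model parallelism shards parameters but must exchange activations
    # among the participating devices.
    mp_comm = 2 * layer.activation_bytes * (num_devices - 1) // num_devices
    return "data_parallel" if dp_comm <= mp_comm else "model_parallel"


if __name__ == "__main__":
    model = [
        # small params, large activations -> data parallelism wins
        Layer("conv1", param_bytes=1 << 20, activation_bytes=1 << 26),
        # large params, small activations -> model parallelism wins
        Layer("fc1", param_bytes=1 << 28, activation_bytes=1 << 22),
    ]
    for layer in model:
        print(layer.name, "->", choose_strategy(layer, num_devices=8))
```

Running the sketch prints a per-layer strategy choice, mirroring the idea that an HP configuration mixes strategies across a single network rather than applying DP everywhere.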
