Convergence of stochastic gradient descent under a local Łojasiewicz condition for deep neural networks
2023-04-18
Jing An, Jianfeng Lu
Abstract
We study the convergence of stochastic gradient descent (SGD) for non-convex objective functions. We establish local convergence with positive probability under the local Łojasiewicz condition introduced by Chatterjee (2022) and an additional local structural assumption on the loss landscape. A key component of our proof is showing that the entire SGD trajectory stays inside the local region with positive probability. We also provide examples of finite-width neural networks for which our assumptions hold.
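To make the setting concrete, here is a minimal, illustrative sketch (not the paper's construction): it runs SGD on a toy non-convex objective and empirically tracks the Łojasiewicz-type ratio ||∇f(x)||² / f(x) inside a ball B(x₀, r), a quantity of the kind appearing in Chatterjee's local condition. The objective, noise model, and all constants below are hypothetical choices for illustration only.

```python
# Hedged sketch: SGD on a toy non-convex loss, monitoring (i) whether the
# trajectory stays inside the local region B(x0, r) and (ii) the local
# Lojasiewicz ratio ||grad f(x)||^2 / f(x). All choices here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Toy non-convex loss with a zero-loss minimizer at the origin.
    return np.sum(x**2) + 0.5 * np.sum(np.sin(x)**2)

def grad_f(x):
    return 2.0 * x + np.sin(x) * np.cos(x)

x0 = np.full(5, 0.3)   # initialization (hypothetical)
r = 1.0                # radius of the local region B(x0, r)
eta = 0.05             # step size
sigma = 0.01           # scale of the stochastic gradient noise

x = x0.copy()
for t in range(2001):
    g = grad_f(x) + sigma * rng.standard_normal(x.shape)  # stochastic gradient
    x = x - eta * g
    if np.linalg.norm(x - x0) > r:
        print(f"step {t}: trajectory left B(x0, r)")
        break
    if t % 500 == 0:
        ratio = np.linalg.norm(grad_f(x))**2 / max(f(x), 1e-12)
        print(f"step {t}: loss={f(x):.3e}, Lojasiewicz ratio={ratio:.3f}")
```

On this toy loss the ratio stays bounded below near the minimizer, which is the kind of local behavior the condition captures; the paper's actual assumptions and constants for deep networks are more delicate than this sketch.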