Convergence of stochastic gradient descent under a local Łojasiewicz condition for deep neural networks
2023-04-18
Jing An, Jianfeng Lu
Abstract
We study the convergence of stochastic gradient descent (SGD) for non-convex objective functions. We establish local convergence with positive probability under the local Łojasiewicz condition introduced by Chatterjee (2022) and an additional local structural assumption on the loss landscape. A key component of our proof is showing that the entire SGD trajectory stays inside the local region with positive probability. We also provide examples of finite-width neural networks for which our assumptions hold.
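To make the setting concrete, here is a minimal, illustrative sketch (not the paper's construction): it runs SGD on a toy non-convex objective and empirically tracks the Łojasiewicz-type ratio ||∇f(x)||² / f(x) inside a ball B(x₀, r), a quantity of the kind appearing in Chatterjee's local condition. The objective, noise model, and all constants below are hypothetical choices for illustration only.

```python
# Hedged sketch: SGD on a toy non-convex loss, monitoring (i) whether the
# trajectory stays inside the local region B(x0, r) and (ii) the local
# Lojasiewicz ratio ||grad f(x)||^2 / f(x). All choices here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Toy non-convex loss with a zero-loss minimizer at the origin.
    return np.sum(x**2) + 0.5 * np.sum(np.sin(x)**2)

def grad_f(x):
    return 2.0 * x + np.sin(x) * np.cos(x)

x0 = np.full(5, 0.3)   # initialization (hypothetical)
r = 1.0                # radius of the local region B(x0, r)
eta = 0.05             # step size
sigma = 0.01           # scale of the stochastic gradient noise

x = x0.copy()
for t in range(2001):
    g = grad_f(x) + sigma * rng.standard_normal(x.shape)  # stochastic gradient
    x = x - eta * g
    if np.linalg.norm(x - x0) > r:
        print(f"step {t}: trajectory left B(x0, r)")
        break
    if t % 500 == 0:
        ratio = np.linalg.norm(grad_f(x))**2 / max(f(x), 1e-12)
        print(f"step {t}: loss={f(x):.3e}, Lojasiewicz ratio={ratio:.3f}")
```

On this toy loss the ratio stays bounded below near the minimizer, which is the kind of local behavior the condition captures; the paper's actual assumptions and constants for deep networks are more delicate than this sketch.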