Understanding Stochastic Natural Gradient Variational Inference
Kaiwen Wu, Jacob R. Gardner
Abstract
Stochastic natural gradient variational inference (NGVI) is a popular posterior inference method with applications in various probabilistic models. Despite its wide usage, little is known about its non-asymptotic convergence rate in the stochastic setting. We aim to narrow this gap and provide a better understanding. For conjugate likelihoods, we prove the first $\mathcal{O}(1/T)$ non-asymptotic convergence rate of stochastic NGVI. The complexity is no worse than that of stochastic gradient descent (also known as black-box variational inference), and the rate likely has a better constant dependency that leads to faster convergence in practice. For non-conjugate likelihoods, we show that stochastic NGVI with the canonical parameterization implicitly optimizes a non-convex objective. Thus, a global convergence rate of $\mathcal{O}(1/T)$ is unlikely without some significant new understanding of optimizing the ELBO using natural gradients.
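To make the conjugate setting concrete, the following is a minimal sketch of stochastic NGVI on Bayesian linear regression with a Gaussian prior and known noise variance. The model, the function names (`exact_posterior`, `stochastic_ngvi`), the hyperparameters `alpha` and `sigma2`, and the step-size schedule are all illustrative assumptions and are not taken from the paper. In natural parameters, the natural-gradient step for a conjugate model reduces to a convex combination of the current parameters and a minibatch estimate of the posterior's natural parameters, which is the form used here.

```python
import numpy as np


def exact_posterior(X, y, alpha, sigma2):
    """Closed-form Gaussian posterior in (precision-mean h, precision P) form.

    Assumed model: w ~ N(0, I / alpha), y_i ~ N(x_i^T w, sigma2).
    """
    d = X.shape[1]
    P = alpha * np.eye(d) + X.T @ X / sigma2
    h = X.T @ y / sigma2
    return h, P


def stochastic_ngvi(X, y, alpha, sigma2, num_steps, batch_size, step_size, seed=0):
    """Stochastic NGVI sketch for q(w) = N(P^{-1} h, P^{-1}).

    (h, P) are the natural parameters of q up to a constant factor
    (lambda_1 = h, lambda_2 = -P / 2), so the update below is linear in them.
    `step_size` is a hypothetical callable mapping the step index t to rho_t in (0, 1].
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    h, P = np.zeros(d), np.eye(d)  # arbitrary initialization
    for t in range(num_steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        scale = n / batch_size  # rescale the minibatch to the full data set
        # Unbiased minibatch estimate of the posterior natural parameters.
        h_hat = scale * Xb.T @ yb / sigma2
        P_hat = alpha * np.eye(d) + scale * Xb.T @ Xb / sigma2
        # Natural-gradient step: move a fraction rho toward the estimate.
        rho = step_size(t)
        h = (1.0 - rho) * h + rho * h_hat
        P = (1.0 - rho) * P + rho * P_hat
    return h, P


# Synthetic data for illustration only.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * data_rng.normal(size=200)

h_star, P_star = exact_posterior(X, y, alpha=1.0, sigma2=0.01)
h_T, P_T = stochastic_ngvi(
    X, y, alpha=1.0, sigma2=0.01,
    num_steps=500, batch_size=20,
    step_size=lambda t: 1.0 / (t + 1),
)
mean_T = np.linalg.solve(P_T, h_T)        # variational mean after T steps
mean_star = np.linalg.solve(P_star, h_star)  # exact posterior mean
```

With the full data set as the batch and a step size of 1, a single update in this sketch lands exactly on the closed-form posterior, which is one way to sanity-check the update rule. With minibatches and a decaying step size, the iterate is a running average of noisy targets, so `mean_T` only approximates `mean_star`.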