A Simple Baseline for Bayesian Uncertainty in Deep Learning

2019-02-07 · NeurIPS 2019 · Code Available

Wesley Maddox, Timur Garipov, Pavel Izmailov, Dmitry Vetrov, Andrew Gordon Wilson


Code

github.com/wjmaddox/swa_gaussian (official, in paper; PyTorch; ★ 477)
github.com/abdulmajid-murad/deep_probabilistic_forecast (PyTorch; ★ 27)
github.com/SamuelGuilluy/Bayesian_ML_SWAG (PyTorch; ★ 0)
github.com/NajibYavari/DD2412 (TensorFlow; ★ 0)
github.com/YeongHyeon/SWA-Gaussian-TF2 (TensorFlow; ★ 0)
github.com/lamantinushka/StructuredCovariance (PyTorch; ★ 0)

Abstract

We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose approach for uncertainty representation and calibration in deep learning. Stochastic Weight Averaging (SWA), which computes the first moment of stochastic gradient descent (SGD) iterates with a modified learning rate schedule, has recently been shown to improve generalization in deep learning. With SWAG, we fit a Gaussian using the SWA solution as the first moment and a low rank plus diagonal covariance also derived from the SGD iterates, forming an approximate posterior distribution over neural network weights; we then sample from this Gaussian distribution to perform Bayesian model averaging. We empirically find that SWAG approximates the shape of the true posterior, in accordance with results describing the stationary distribution of SGD iterates. Moreover, we demonstrate that SWAG performs well on a wide variety of tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including MC dropout, KFAC Laplace, SGLD, and temperature scaling.
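
The procedure the abstract describes (running moments of the SGD iterates, a diagonal plus low-rank Gaussian, then sampling for Bayesian model averaging) can be sketched roughly as follows. This is a minimal illustration and not the official implementation: the names `SwagSketch`, `collect`, `sample`, `bayesian_model_average`, `max_rank`, and `scale` are hypothetical, weights are treated as one flat list of floats, and plain Python lists are used only to keep the sketch dependency-free. The 1/2 weighting of the two covariance terms and the 1/(K - 1) normalization of the low-rank part are assumed here as the way the diagonal and low-rank pieces are combined.

```python
import math
import random


class SwagSketch:
    """Running SWAG moments over flattened weight vectors (illustrative only)."""

    def __init__(self, dim, max_rank=5):
        self.mean = [0.0] * dim      # running first moment (the SWA solution)
        self.sq_mean = [0.0] * dim   # running second moment, used for the diagonal
        self.deviations = []         # last `max_rank` deviations from the running mean
        self.max_rank = max_rank
        self.n = 0

    def collect(self, weights):
        """Fold one SGD iterate into the running moments."""
        self.n += 1
        for i, w in enumerate(weights):
            self.mean[i] += (w - self.mean[i]) / self.n
            self.sq_mean[i] += (w * w - self.sq_mean[i]) / self.n
        self.deviations.append([w - m for w, m in zip(weights, self.mean)])
        if len(self.deviations) > self.max_rank:
            self.deviations.pop(0)

    def sample(self, rng, scale=1.0):
        """Draw one weight vector from the fitted Gaussian."""
        k = len(self.deviations)
        z2 = [rng.gauss(0.0, 1.0) for _ in range(k)]  # shared low-rank noise
        out = []
        for i, m in enumerate(self.mean):
            # Diagonal variance estimate, clamped at zero against round-off.
            var = max(self.sq_mean[i] - m * m, 0.0)
            diag = math.sqrt(var) * rng.gauss(0.0, 1.0)
            low_rank = 0.0
            if k > 1:
                low_rank = sum(d[i] * z for d, z in zip(self.deviations, z2))
                low_rank /= math.sqrt(k - 1)
            out.append(m + math.sqrt(scale / 2.0) * (diag + low_rank))
        return out


def bayesian_model_average(swag, predict, x, n_samples, rng):
    """Average the outputs of predict(weights, x) over sampled weight vectors."""
    total = None
    for _ in range(n_samples):
        p = predict(swag.sample(rng), x)
        total = p if total is None else [a + b for a, b in zip(total, p)]
    return [t / n_samples for t in total]
```

In use, `collect` would be called on the flattened weights at intervals during the SWA phase of training (for example once per epoch), and `bayesian_model_average` would then be called at test time with a `predict` function that loads the sampled weights into the network. Setting `scale=0.0` collapses every sample onto the SWA mean, which gives a quick sanity check against plain SWA.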

Tasks

Bayesian Inference, Deep Learning, Transfer Learning, Uncertainty Quantification
