Projection-Free Adaptive Gradients for Large-Scale Optimization
Cyrille W. Combettes, Christoph Spiegel, Sebastian Pokutta
- Code: github.com/ZIB-IOL/StochasticFrankWolfe (PyTorch)
Abstract
The complexity in large-scale optimization can lie in both handling the objective function and handling the constraint set. In this respect, stochastic Frank-Wolfe algorithms occupy a unique position as they alleviate both computational burdens, by querying only approximate first-order information from the objective and by maintaining feasibility of the iterates without using projections. In this paper, we improve the quality of their first-order information by blending in adaptive gradients. We derive convergence rates and demonstrate the computational advantage of our method over the state-of-the-art stochastic Frank-Wolfe algorithms on both convex and nonconvex objectives. The experiments further show that our method can improve the performance of adaptive gradient algorithms for constrained optimization.
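The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and is not taken from the paper's pseudocode. It assumes a least-squares objective and an ℓ1-ball constraint, both chosen here only for illustration. It also assumes one plausible way of blending in adaptive gradients: an AdaGrad-style diagonal scaling defines a small quadratic subproblem, which is then solved approximately with a few Frank-Wolfe steps. All function names and parameters (`lmo_l1`, `adaptive_sfw`, `inner`, `eta`, `delta`) are hypothetical.

```python
import random


def lmo_l1(g, radius):
    """Linear minimization oracle: argmin of <g, v> over the l1 ball."""
    i = max(range(len(g)), key=lambda j: abs(g[j]))
    v = [0.0] * len(g)
    if g[i] != 0.0:
        v[i] = -radius if g[i] > 0 else radius
    return v


def objective(x, A, b):
    """Least-squares loss (1/2n) * sum_i (a_i . x - b_i)^2."""
    n = len(A)
    return sum((sum(a * w for a, w in zip(row, x)) - bi) ** 2
               for row, bi in zip(A, b)) / (2 * n)


def minibatch_grad(x, A, b, idx):
    """Stochastic gradient of the loss, estimated from the rows in idx."""
    g = [0.0] * len(x)
    for i in idx:
        r = sum(a * w for a, w in zip(A[i], x)) - b[i]
        for j, a in enumerate(A[i]):
            g[j] += r * a / len(idx)
    return g


def adaptive_sfw(A, b, radius=1.0, steps=200, batch=8,
                 inner=5, eta=0.1, delta=1e-8, seed=0):
    """Sketch of a stochastic Frank-Wolfe loop with AdaGrad-style scaling."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d          # the origin is feasible for any l1 ball
    acc = [0.0] * d        # running sum of squared stochastic gradients
    for t in range(steps):
        idx = [rng.randrange(n) for _ in range(batch)]
        g = minibatch_grad(x, A, b, idx)
        acc = [s + gj * gj for s, gj in zip(acc, g)]
        h = [delta + s ** 0.5 for s in acc]   # diagonal AdaGrad-style metric
        gamma_max = 2.0 / (t + 2)             # classical Frank-Wolfe step cap
        y = x[:]
        for _ in range(inner):
            # Gradient of Q(y) = <g, y> + (1/(2*eta)) * sum_j h_j (y_j - x_j)^2
            q = [gj + hj * (yj - xj) / eta
                 for gj, hj, yj, xj in zip(g, h, y, x)]
            v = lmo_l1(q, radius)
            direction = [vj - yj for vj, yj in zip(v, y)]
            gap = -sum(qj * dj for qj, dj in zip(q, direction))
            curv = sum(hj * dj * dj for hj, dj in zip(h, direction))
            if gap <= 0.0 or curv == 0.0:
                break
            # Exact line search on Q along the direction, capped at gamma_max
            step = min(eta * gap / curv, gamma_max)
            # Convex combination of feasible points: no projection needed
            y = [yj + step * dj for yj, dj in zip(y, direction)]
        x = y
    return x
```

Each iterate is a convex combination of points inside the ball, so feasibility holds without any projection, and only a mini-batch gradient is queried per outer step. The paper's actual algorithms, step-size rules, and convergence guarantees may differ from this sketch.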