Improved Variance Reduction Methods for Riemannian Non-Convex Optimization
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 16.389), Pub Date: 2021-09-13, DOI: 10.1109/tpami.2021.3112139
Andi Han, Junbin Gao

Variance reduction is a popular technique for accelerating gradient descent and stochastic gradient descent on optimization problems defined over both Euclidean space and Riemannian manifolds. This paper improves on existing variance reduction methods for non-convex Riemannian optimization, including R-SVRG and R-SRG/R-SPIDER, by providing a unified framework for batch size adaptation. This framework is more general than existing work because it accounts for retraction, vector transport, and mini-batch stochastic gradients. We show that the adaptive-batch variance reduction methods require lower gradient complexities for both general non-convex and gradient-dominated functions, under both finite-sum and online optimization settings. Moreover, under the new framework, we complete the analysis of R-SVRG and R-SRG, which is currently missing from the literature. We prove convergence of R-SVRG with a much simpler analysis, which leads to curvature-free complexity bounds. We also show improved results for R-SRG under double-loop convergence, matching the optimal complexities of R-SPIDER. In addition, we prove the first online complexity results for R-SVRG and R-SRG. Lastly, we discuss the potential of adapting batch size for non-smooth, constrained, and second-order Riemannian optimizers. Extensive experiments on a variety of applications support the analysis and claims in the paper.
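To make the ingredients named in the abstract concrete, below is a minimal sketch of the R-SVRG recurrence on the unit sphere, combining a retraction, a vector transport, and mini-batch Riemannian gradients. Everything specific here is an illustrative assumption rather than the paper's setup: the sphere manifold, the projection-based retraction and transport, the leading-eigenvector objective, and the fixed step size and batch size (the paper's contribution of an adaptive batch-size schedule is not implemented).

```python
# Minimal R-SVRG sketch on the unit sphere (illustrative assumptions, not the
# paper's experimental setup or its adaptive-batch variant).
import numpy as np

def proj(x, v):
    """Project v onto the tangent space of the sphere at unit vector x."""
    return v - np.dot(x, v) * x

def retract(x, v):
    """Retraction on the sphere: step in the tangent direction, renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def transport(x, y, v):
    """Vector transport from x to y by projection onto the tangent space at y."""
    return proj(y, v)

# Assumed finite-sum objective: leading eigenvector of A^T A via
# f(x) = -(1/n) * sum_i (a_i^T x)^2 over the unit sphere.
rng = np.random.default_rng(0)
n, d = 500, 20
A = rng.standard_normal((n, d))

def rgrad(x, idx):
    """Mini-batch Riemannian gradient: project the Euclidean gradient."""
    Ai = A[idx]
    egrad = -2.0 * Ai.T @ (Ai @ x) / len(idx)
    return proj(x, egrad)

def rsvrg(x, epochs=30, inner=500, batch=10, step=1e-2):
    for _ in range(epochs):
        x_tilde = x.copy()
        full = rgrad(x_tilde, np.arange(n))  # full gradient at the snapshot
        for _ in range(inner):
            idx = rng.choice(n, size=batch, replace=False)
            # Variance-reduced direction: transport the snapshot correction
            # from the tangent space at x_tilde to the tangent space at x.
            v = rgrad(x, idx) - transport(x_tilde, x, rgrad(x_tilde, idx) - full)
            x = retract(x, -step * v)
    return x

x0 = rng.standard_normal(d)
x0 /= np.linalg.norm(x0)
x_hat = rsvrg(x0)

# Sanity check against the top eigenvector of A^T A (agreement up to sign).
w = np.linalg.eigh(A.T @ A)[1][:, -1]
print(abs(np.dot(x_hat, w)))  # close to 1.0
```

In this sketch the transport step is what distinguishes the Riemannian recurrence from Euclidean SVRG: the correction term lives in the tangent space at the snapshot and must be moved to the current iterate before the update. The batch size is held fixed here; the paper's analysis concerns adapting it to lower the gradient complexity.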