Training Structured Neural Networks Through Manifold Identification and Variance Reduction

This paper proposes an algorithm, RMDA, for training neural networks (NNs) with a regularization term for promoting desired structures. RMDA does not incur computation additional to proximal SGD with momentum, and achieves variance reduction without requiring the objective function to be of the finite-sum form. Through the tool of manifold identification from nonlinear optimization, we …

Relatively-Smooth Convex Optimization by First-Order Methods, and Applications

The usual approach to developing and analyzing first-order methods for smooth convex optimization assumes that the gradient of the objective function is uniformly smooth with some Lipschitz constant L. However, in many settings the differentiable convex function f(·) is not uniformly smooth, for example in D-optimal design where f(x) := -ln det(HXH^T), or even the univariate …

Manifold Identification in Dual Averaging for Regularized Stochastic Online Learning

Iterative methods that calculate their steps from approximate subgradient directions have proved to be useful for stochastic learning problems over large and streaming data sets. When the objective consists of a loss function plus a nonsmooth regularization term, the solution often lies on a low-dimensional manifold of parameter space along which the regularizer is smooth. …