stochastic gradient – Optimization Online

Alternate Training of Shared and Task-Specific Parameters for Multi-Task Neural Networks

Published: 2024/01/08

Nonlinear Optimization, Stochastic Programming multi-task learning, neural networks, stochastic gradient

This paper introduces novel alternate training procedures for hard-parameter sharing Multi-Task Neural Networks (MTNNs). Traditional MTNN training faces challenges in managing conflicting loss gradients, often yielding sub-optimal performance. The proposed alternate training method updates shared and task-specific weights alternately, exploiting the multi-head architecture of the model. This approach reduces computational costs, enhances training regularization, and …
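The alternating idea in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: a scalar shared weight `w`, two scalar task heads, squared-error losses, and alternation at the epoch level (even epochs update the heads with `w` frozen, odd epochs update `w` with the heads frozen).

```python
import random

def make_data(n=200, seed=0):
    """Toy data: two regression tasks that share one input feature (hypothetical)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    y1 = [2.0 * x for x in xs]
    y2 = [-3.0 * x for x in xs]
    return xs, y1, y2

def total_loss(w, heads, xs, targets):
    """Sum over tasks of the mean squared error of heads[t] * (w * x)."""
    n = len(xs)
    return sum(
        (heads[t] * w * x - targets[t][i]) ** 2
        for t in range(len(heads))
        for i, x in enumerate(xs)
    ) / n

def alternate_train(xs, targets, epochs=40, lr=0.05, seed=1):
    """Stochastic gradient training that alternates which parameter group moves."""
    rng = random.Random(seed)
    w = 1.0                # shared parameter (the "trunk")
    heads = [0.5, 0.5]     # task-specific parameters (one per head)
    order = list(range(len(xs)))
    for epoch in range(epochs):
        rng.shuffle(order)
        update_shared = (epoch % 2 == 1)  # assumption: alternate per epoch
        for i in order:
            x = xs[i]
            h = w * x  # shared representation
            residuals = [heads[t] * h - targets[t][i] for t in range(len(heads))]
            if update_shared:
                # Heads are frozen; only the shared weight receives a gradient step.
                grad_w = sum(2.0 * residuals[t] * heads[t] * x for t in range(len(heads)))
                w -= lr * grad_w
            else:
                # Shared weight is frozen; each head is updated from its own loss only.
                for t in range(len(heads)):
                    heads[t] -= lr * 2.0 * residuals[t] * h
    return w, heads

xs, y1, y2 = make_data()
w, heads = alternate_train(xs, [y1, y2])
print(total_loss(w, heads, xs, [y1, y2]))
```

In a phase where only one parameter group moves, gradients for the frozen group need not be computed, which is one plausible source of the cost reduction the abstract mentions.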

A Stochastic-Gradient-based Interior-Point Algorithm for Solving Smooth Bound-Constrained Optimization Problems

Published: 2023/04/28

Qi Wang

Frank E. Curtis

Daniel P. Robinson

Vyacheslav Kungurtsev

Constrained Nonlinear Optimization, Nonlinear Optimization, Optimization in Data Science interior point method, stochastic gradient, stochastic optimization

A stochastic-gradient-based interior-point algorithm for minimizing a continuously differentiable objective function (that may be nonconvex) subject to bound constraints is presented, analyzed, and demonstrated through experimental results. The algorithm is unique from other interior-point methods for solving smooth (nonconvex) optimization problems since the search directions are computed using stochastic gradient estimates. It is also unique …

A momentum-based linearized augmented Lagrangian method for nonconvex constrained stochastic optimization

Published: 2022/08/11, Updated: 2024/07/17

Qiankun Shi

Xiao Wang

Hao Wang

Constrained Nonlinear Optimization, Nonlinear Optimization, Stochastic Programming augmented Lagrangian function, functional constraint, momentum, nonconvex optimization, oracle complexity, stochastic gradient

Nonconvex constrained stochastic optimization has emerged in many important application areas. Subject to general functional constraints it minimizes the sum of an expectation function and a nonsmooth regularizer. Main challenges arise due to the stochasticity in the random integrand and the possibly nonconvex functional constraints. To address these issues we propose a momentum-based linearized augmented …

New Penalized Stochastic Gradient Methods for Linearly Constrained Strongly Convex Optimization

Published: 2022/02/14

Meng Li

Paul Grigas

Alper Atamturk

Constrained Nonlinear Optimization, Convex Optimization convex optimization, duality gap, linear constraints, penalty method, stochastic gradient

For minimizing a strongly convex objective function subject to linear inequality constraints, we consider a penalty approach that allows one to utilize stochastic methods for problems with a large number of constraints and/or objective function terms. We provide upper bounds on the distance between the solutions to the original constrained problem and the penalty reformulations, …

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Published: 2020/08/13

Filip Hanzely

Convex Optimization, Stochastic Programming coordinate descent, cubic newton, machine learning, optimization, stochastic gradient, variance reduction

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used to formulate these often ill-conditioned optimization tasks, there is a need for new efficient algorithms able to …

Inexact proximal stochastic second-order methods for nonconvex composite optimization

Published: 2019/10/10, Updated: 2019/10/14

Xiao Wang

Hongchao Zhang

Nonlinear Optimization (weakly) smooth function, complexity, inexact subproblem solution, nonconvex, second-order approximation, stochastic gradient, variance reduction

In this paper, we propose a framework of Inexact Proximal Stochastic Second-order (IPSS) methods for solving nonconvex optimization problems, whose objective function consists of an average of finitely many, possibly weakly, smooth functions and a convex but possibly nonsmooth function. At each iteration, IPSS inexactly solves a proximal subproblem constructed by using some positive …

A linearly convergent stochastic recursive gradient method for convex optimization

Published: 2019/04/24

Tiande Guo

Yan Liu

Xiao Wang

Convex and Nonsmooth Optimization, Convex Optimization BB method, complexity, linear convergence rate, stochastic gradient, stochastic optimization

The stochastic recursive gradient algorithm (SARAH) [8] has attracted much interest recently. It admits a simple recursive framework for updating stochastic gradient estimates. Motivated by this, in this paper, we propose a SARAH-I method incorporating importance sampling, whose linear convergence rate of the sequence of distances between iterates and the optimal set is proven under …
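The recursive gradient estimate mentioned in this abstract can be sketched as follows. This is plain SARAH with uniform sampling on a toy scalar least-squares problem, not the SARAH-I importance-sampling variant the abstract proposes. The problem data, the step size `eta`, the loop lengths, and the choice to restart each outer loop from the last inner iterate are all assumptions made for illustration.

```python
import random

def make_problem(n=100, seed=0):
    """Toy finite sum: f_i(w) = 0.5 * (a_i * w - b_i)**2 (hypothetical data)."""
    rng = random.Random(seed)
    a = [rng.uniform(0.5, 1.5) for _ in range(n)]
    b = [3.0 * ai + rng.gauss(0.0, 0.1) for ai in a]
    return a, b

def sarah(a, b, eta=0.1, outer=20, inner=50, seed=0):
    """Minimize the average of the f_i with recursive stochastic gradient estimates."""
    rng = random.Random(seed)
    n = len(a)

    def grad_i(w, i):
        # Gradient of the i-th component function at w.
        return a[i] * (a[i] * w - b[i])

    w = 0.0
    for _ in range(outer):
        # Each outer loop starts from a full gradient.
        v = sum(grad_i(w, i) for i in range(n)) / n
        w_prev, w = w, w - eta * v
        for _ in range(inner - 1):
            i = rng.randrange(n)
            # Recursive update: v_t = grad_i(w_t) - grad_i(w_{t-1}) + v_{t-1}.
            v = grad_i(w, i) - grad_i(w_prev, i) + v
            w_prev, w = w, w - eta * v
        # Assumption: the next outer loop restarts from the last inner iterate.
    return w

a, b = make_problem()
w_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
print(sarah(a, b), w_star)
```

Unlike SVRG-style estimators, the correction term here uses the previous iterate rather than a fixed snapshot point, which is what makes the estimate recursive.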

Generalized Stochastic Frank-Wolfe Algorithm with Stochastic “Substitute” Gradient for Structured Convex Optimization

Published: 2018/07/29, Updated: 2018/09/04

Robert M. Freund

Haihao Lu

Convex Optimization, Nonlinear Optimization, Stochastic Programming complexity, conditional gradient, frank-wolfe, stochastic gradient

The stochastic Frank-Wolfe method has recently attracted much general interest in the context of optimization for statistical and machine learning due to its ability to work with a more general feasible region. However, there has been a complexity gap in the guaranteed convergence rate for stochastic Frank-Wolfe compared to its deterministic counterpart. In this work, …

The Adaptive Sampling Gradient Method: Optimizing Smooth Functions with an Inexact Oracle

Published: 2017/05/29

Fatemeh Hashemi

Raghu Pasupathy

Michael Taaffe

Nonlinear Optimization, Optimization of Simulated Systems, Stochastic Programming adaptive sampling, stochastic gradient, stochastic optimization

Consider settings such as stochastic optimization where a smooth objective function $f$ is unknown but can be estimated with an inexact oracle such as quasi-Monte Carlo (QMC) or numerical quadrature. The inexact oracle is assumed to yield function estimates having error that decays with increasing oracle effort. For solving such problems, we present the Adaptive …

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Lojasiewicz Condition

Published: 2016/08/16, Updated: 2020/09/12

Hamed Karimi

Mark Schmidt

Julie Nutini

Convex and Nonsmooth Optimization boosting, coordinate descent, gradient descent, l1-regularization, least squares, logistic regression, stochastic gradient, support vector machines, variance reduction

In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Lojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Lojasiewicz (PL) …
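For reference, the PL condition and the resulting rate are commonly stated as below. The notation is introduced here and does not appear in the excerpt: $\mu > 0$ is the PL constant, $f^*$ is the optimal value, $L$ is the Lipschitz constant of $\nabla f$, and $x_k$ are gradient descent iterates with step size $1/L$.

```latex
\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr) \quad \text{for all } x,
\qquad
f(x_k) - f^* \;\le\; \Bigl(1 - \frac{\mu}{L}\Bigr)^{k}\,\bigl(f(x_0) - f^*\bigr).
```

The first inequality is the condition itself. The second is the global linear rate it implies for gradient descent, with no convexity assumption on $f$.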