## A momentum-based linearized augmented Lagrangian method for nonconvex constrained stochastic optimization

Nonconvex constrained stochastic optimization  has emerged in many important application areas. With general functional constraints it minimizes the sum of an expectation function and a convex  nonsmooth  regularizer. Main challenges  arise due to the stochasticity in the random integrand and the possibly nonconvex functional constraints. To cope with these issues we propose a … Read more

## New Penalized Stochastic Gradient Methods for Linearly Constrained Strongly Convex Optimization

For minimizing a strongly convex objective function subject to linear inequality constraints, we consider a penalty approach that allows one to utilize stochastic methods for problems with a large number of constraints and/or objective function terms. We provide upper bounds on the distance between the solutions to the original constrained problem and the penalty reformulations, … Read more

## Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used to formulate these often ill-conditioned optimization tasks, there is a need for new efficient algorithms able to … Read more

## Inexact proximal stochastic second-order methods for nonconvex composite optimization

In this paper, we propose a framework of Inexact Proximal Stochastic Second-order (IPSS) methods for solving nonconvex optimization problems, whose objective function consists of an average of finitely many, possibly weakly, smooth functions and a convex but possibly nons- mooth function. At each iteration, IPSS inexactly solves a proximal subproblem constructed by using some positive … Read more

## A linearly convergent stochastic recursive gradient method for convex optimization

The stochastic recursive gradient algorithm (SARAH) [8] attracts much interest recently. It admits a simple recursive framework for updating stochastic gradient estimates. Motivated by this, in this paper, we propose a SARAH-I method incorporating importance sampling, whose linear conver- gence rate of the sequence of distances between iterates and the optima set is proven under … Read more

## Generalized Stochastic Frank-Wolfe Algorithm with Stochastic “Substitute” Gradient for Structured Convex Optimization

The stochastic Frank-Wolfe method has recently attracted much general interest in the context of optimization for statistical and machine learning due to its ability to work with a more general feasible region. However, there has been a complexity gap in the guaranteed convergence rate for stochastic Frank-Wolfe compared to its deterministic counterpart. In this work, … Read more

## The Adaptive Sampling Gradient Method: Optimizing Smooth Functions with an Inexact Oracle

Consider settings such as stochastic optimization where a smooth objective function $f$ is unknown but can be estimated with an \emph{inexact oracle} such as quasi-Monte Carlo (QMC) or numerical quadrature. The inexact oracle is assumed to yield function estimates having error that decays with increasing oracle effort. For solving such problems, we present the Adaptive … Read more

## Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Lojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Lojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Lojasiewicz (PL) … Read more

## On Sampling Rates in Simulation-Based Recursions

We consider the context of “simulation-based recursions,” that is, recursions that involve quantities needing to be estimated using a stochastic simulation. Examples include stochastic adaptations of fixed-point and gradient descent recursions obtained by replacing function and derivative values appearing within the recursion by their Monte Carlo counterparts. The primary motivating settings are Simulation Optimization and … Read more

## On the convergence of stochastic bi-level gradient methods

We analyze the convergence of stochastic gradient methods for bi-level optimization problems. We address two specific cases: first when the outer objective function can be expressed as a finite sum of independent terms, and next when both the outer and inner objective functions can be expressed as finite sums of independent terms. We assume Lipschitz … Read more