New Penalized Stochastic Gradient Methods for Linearly Constrained Strongly Convex Optimization

For minimizing a strongly convex objective function subject to linear inequality constraints, we consider a penalty approach that allows one to utilize stochastic methods for problems with a large number of constraints and/or objective function terms. We provide upper bounds on the distance between the solutions to the original constrained problem and the penalty reformulations, … Read more
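To make the idea concrete, here is a minimal sketch (not the paper's exact scheme) of running SGD on a quadratic-penalty reformulation of min f(x) s.t. Ax <= b; the quadratic penalty, the single-constraint sampling, and all names (`penalized_sgd`, `grad_f`, `rho`) are our illustrative assumptions:

```python
import numpy as np

def penalized_sgd(grad_f, A, b, x0, rho=10.0, step=1e-3, iters=10000, rng=None):
    """SGD on a quadratic-penalty reformulation of min f(x) s.t. A x <= b.
    One constraint row is sampled per step, so the per-iteration cost is
    independent of the number of constraints."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float).copy()
    m = A.shape[0]
    for _ in range(iters):
        i = rng.integers(m)                    # sample one constraint uniformly
        viol = max(A[i] @ x - b[i], 0.0)       # hinge-style constraint violation
        # unbiased gradient of f(x) + rho * sum_i max(0, a_i^T x - b_i)^2
        g = grad_f(x) + 2.0 * rho * m * viol * A[i]
        x = x - step * g
    return x
```

Sampling one constraint per step is what makes a penalty reformulation of this kind amenable to stochastic methods when the number of constraints is large.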

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. As the volume of data and the size and complexity of the statistical models used to formulate these often ill-conditioned optimization tasks continue to grow, there is a need for new efficient algorithms able to … Read more

Inexact proximal stochastic second-order methods for nonconvex composite optimization

In this paper, we propose a framework of Inexact Proximal Stochastic Second-order (IPSS) methods for solving nonconvex optimization problems, whose objective function consists of an average of finitely many, possibly weakly, smooth functions and a convex but possibly nonsmooth function. At each iteration, IPSS inexactly solves a proximal subproblem constructed by using some positive … Read more
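A plausible form of such a subproblem, with h the convex (possibly nonsmooth) term, g_k a stochastic gradient estimate of the smooth average, and B_k a positive definite curvature matrix (the symbols are ours, inferred from the abstract; the "\approx" marks the inexact solve):

\[
x_{k+1} \;\approx\; \arg\min_{x}\; \langle g_k,\, x - x_k\rangle \;+\; \tfrac{1}{2}\,(x - x_k)^\top B_k\,(x - x_k) \;+\; h(x).
\]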

A linearly convergent stochastic recursive gradient method for convex optimization

The stochastic recursive gradient algorithm (SARAH) [8] has attracted much interest recently. It admits a simple recursive framework for updating stochastic gradient estimates. Motivated by this, in this paper we propose SARAH-I, a variant incorporating importance sampling, for which linear convergence of the sequence of distances between the iterates and the optimal set is proven under … Read more
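For reference, the SARAH recursion itself is simple enough to sketch; this is the plain uniform-sampling version, not the importance-sampling SARAH-I variant the paper proposes (names like `sarah` and `grads` are ours):

```python
import numpy as np

def sarah(grads, x0, step=0.05, outer=20, inner=None, rng=None):
    """Minimal SARAH sketch for min (1/n) * sum_i f_i(x).
    `grads` is a list of per-component gradient functions, one per f_i.
    The inner loop maintains the recursive estimate
        v_t = grad f_i(x_t) - grad f_i(x_{t-1}) + v_{t-1}."""
    rng = np.random.default_rng(rng)
    n = len(grads)
    inner = inner or n
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(outer):
        v = sum(g(x) for g in grads) / n       # full gradient at the snapshot
        x_prev, x = x, x - step * v
        for _ in range(inner):
            i = rng.integers(n)                # SARAH-I would bias this sampling
            v = grads[i](x) - grads[i](x_prev) + v
            x_prev, x = x, x - step * v
    return x
```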

Generalized Stochastic Frank-Wolfe Algorithm with Stochastic “Substitute” Gradient for Structured Convex Optimization

The stochastic Frank-Wolfe method has recently attracted considerable interest in the context of optimization for statistical and machine learning due to its ability to work with general structured feasible regions. However, there has been a complexity gap in the guaranteed convergence rate for stochastic Frank-Wolfe compared to its deterministic counterpart. In this work, … Read more
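For orientation, here is a generic stochastic Frank-Wolfe sketch over the probability simplex; the paper's "substitute" gradient is a specific variance-controlled estimator that this sketch does not reproduce (all names are ours):

```python
import numpy as np

def stochastic_frank_wolfe(stoch_grad, x0, iters=1000, rng=None):
    """Generic stochastic Frank-Wolfe over the probability simplex.
    `stoch_grad(x, rng)` returns an unbiased estimate of the gradient."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float).copy()
    for k in range(iters):
        g = stoch_grad(x, rng)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0                  # linear minimization oracle on the simplex
        gamma = 2.0 / (k + 2)                  # classic Frank-Wolfe step-size schedule
        x = (1 - gamma) * x + gamma * s        # convex combination stays feasible
    return x
```

The appeal is that each step requires only a linear minimization over the feasible region, never a projection.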

The Adaptive Sampling Gradient Method: Optimizing Smooth Functions with an Inexact Oracle

Consider settings such as stochastic optimization where a smooth objective function $f$ is unknown but can be estimated with an \emph{inexact oracle} such as quasi-Monte Carlo (QMC) or numerical quadrature. The inexact oracle is assumed to yield function estimates having error that decays with increasing oracle effort. For solving such problems, we present the Adaptive … Read more
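A minimal sketch of the idea, under our assumption of a geometric effort schedule (the paper's method chooses the effort adaptively; `oracle_grad`, `m0`, and `growth` are illustrative names):

```python
import numpy as np

def adaptive_sampling_gd(oracle_grad, x0, step=0.1, iters=50, m0=4, growth=1.5):
    """Gradient method with an inexact oracle: `oracle_grad(x, m)` returns a
    gradient estimate computed with effort m (e.g., m QMC or quadrature
    points), with error that decays as m grows.  Effort increases so the
    estimation error shrinks in step with the optimization error."""
    x = np.asarray(x0, dtype=float).copy()
    m = m0
    for _ in range(iters):
        g = oracle_grad(x, int(m))
        x = x - step * g
        m *= growth                            # spend more oracle effort as we converge
    return x
```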

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Lojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Lojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Lojasiewicz (PL) … Read more
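The core argument is short enough to state (a standard derivation, not specific to this paper): if f is L-smooth and satisfies the PL inequality with constant \( \mu > 0 \),

\[
\tfrac{1}{2}\,\|\nabla f(x_k)\|^2 \;\ge\; \mu\,\bigl(f(x_k) - f^*\bigr),
\]

then gradient descent with step size 1/L gives

\[
f(x_{k+1}) \;\le\; f(x_k) - \tfrac{1}{2L}\,\|\nabla f(x_k)\|^2 \;\le\; f(x_k) - \tfrac{\mu}{L}\,\bigl(f(x_k) - f^*\bigr),
\]

hence \( f(x_{k+1}) - f^* \le (1 - \mu/L)\,(f(x_k) - f^*) \): a global linear rate with no convexity assumption.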

On Sampling Rates in Simulation-Based Recursions

We consider the context of “simulation-based recursions,” that is, recursions that involve quantities needing to be estimated using a stochastic simulation. Examples include stochastic adaptations of fixed-point and gradient descent recursions obtained by replacing function and derivative values appearing within the recursion by their Monte Carlo counterparts. The primary motivating settings are Simulation Optimization and … Read more
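As a toy instance of such a recursion, here is a fixed-point iteration x <- h(x) where h is only accessible through simulation; how fast the sample size must grow is exactly the question the paper studies, so the geometric schedule below is purely illustrative (all names are ours):

```python
import numpy as np

def mc_fixed_point(simulate_h, x0, iters=30, m0=8, growth=2.0, rng=None):
    """Simulation-based fixed-point recursion: `simulate_h(x, rng)` returns
    one noisy draw whose mean is h(x); each iterate is the Monte Carlo
    average of the current number of draws."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float)
    m = m0
    for _ in range(iters):
        draws = [simulate_h(x, rng) for _ in range(int(m))]
        x = np.mean(draws, axis=0)             # Monte Carlo estimate of h(x)
        m *= growth                            # sampling-rate schedule (illustrative)
    return x
```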

On the convergence of stochastic bi-level gradient methods

We analyze the convergence of stochastic gradient methods for bi-level optimization problems. We address two specific cases: first, when the outer objective function can be expressed as a finite sum of independent terms, and second, when both the outer and inner objective functions can be expressed as finite sums of independent terms. We assume Lipschitz … Read more
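A plausible reading of the second, fully finite-sum case (the symbols f_i, g_j, and y^*(x) are ours, inferred from the abstract):

\[
\min_{x}\; F(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} f_i\bigl(x,\, y^*(x)\bigr)
\qquad\text{where}\qquad
y^*(x) \;\in\; \arg\min_{y}\; \frac{1}{m}\sum_{j=1}^{m} g_j(x, y),
\]

so that sampling outer terms i (and inner terms j) yields the stochastic gradient methods being analyzed.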

Random Multi-Constraint Projection: Stochastic Gradient Methods for Convex Optimization with Many Constraints

Consider convex optimization problems subject to a large number of constraints. We focus on stochastic problems in which the objective takes the form of expected values and the feasible set is the intersection of a large number of convex sets. We propose a class of algorithms that perform both stochastic gradient descent and random feasibility … Read more
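A minimal sketch of the alternation, for the special case where each convex set is a halfspace with a closed-form projection (all names are ours, not the paper's notation):

```python
import numpy as np

def random_projection_sgd(stoch_grad, A, b, x0, step=1e-2, iters=10000, rng=None):
    """Alternate a stochastic gradient step with a projection onto ONE
    randomly sampled constraint set {x : a_i^T x <= b_i}.  Each row of A
    is assumed nonzero.  The per-iteration cost is independent of the
    total number of constraints."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float).copy()
    m = A.shape[0]
    for _ in range(iters):
        x = x - step * stoch_grad(x, rng)      # stochastic gradient step
        i = rng.integers(m)                    # random feasibility step
        gap = A[i] @ x - b[i]
        if gap > 0:                            # project onto the violated halfspace
            x = x - (gap / (A[i] @ A[i])) * A[i]
    return x
```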