Limited-Memory BFGS with Displacement Aggregation

A displacement aggregation strategy is proposed for the curvature pairs stored in a limited-memory BFGS (a.k.a. L-BFGS) method such that the resulting (inverse) Hessian approximations are equal to those that would be derived from a full-memory BFGS method. This means that, if a sufficiently large number of pairs are stored, then an optimization algorithm employing … Read more

Gradient Sampling Methods for Nonsmooth Optimization

This paper reviews the gradient sampling methodology for solving nonsmooth, nonconvex optimization problems. An intuitively straightforward gradient sampling algorithm is stated and its convergence properties are summarized. Throughout this discussion, we emphasize the simplicity of gradient sampling as an extension of the steepest descent method for minimizing smooth objectives. We then provide overviews of various … Read more

A Dynamic Penalty Parameter Updating Strategy for Matrix-Free Sequential Quadratic Optimization

This paper focuses on the design of sequential quadratic optimization (commonly known as SQP) methods for solving large-scale nonlinear optimization problems. The most computationally demanding aspect of such an approach is the computation of the search direction during each iteration, for which we consider the use of matrix-free methods. In particular, we develop a method … Read more

ADMM for Multiaffine Constrained Optimization

We propose an expansion of the scope of the alternating direction method of multipliers (ADMM). Specifically, we show that ADMM, when employed to solve problems with multiaffine constraints that satisfy certain easily verifiable assumptions, converges to the set of constrained stationary points if the penalty parameter in the augmented Lagrangian is sufficiently large. When the … Read more

Concise Complexity Analyses for Trust-Region Methods

Concise complexity analyses are presented for simple trust region algorithms for solving unconstrained optimization problems. In contrast to a traditional trust region algorithm, the algorithms considered in this paper require certain control over the choice of trust region radius after any successful iteration. The analyses highlight the essential algorithm components required to obtain certain complexity … Read more

Regional Complexity Analysis of Algorithms for Nonconvex Smooth Optimization

A strategy is proposed for characterizing the worst-case performance of algorithms for solving nonconvex smooth optimization problems. Contemporary analyses characterize worst-case performance by providing, under certain assumptions on an objective function, an upper bound on the number of iterations (or function or derivative evaluations) required until a pth-order stationarity condition is approximately satisfied. This arguably … Read more

A Stochastic Trust Region Algorithm Based on Careful Step Normalization

An algorithm is proposed for solving stochastic and finite sum minimization problems. Based on a trust region methodology, the algorithm employs normalized steps, at least as long as the norms of the stochastic gradient estimates are within a specified interval. The complete algorithm—which dynamically chooses whether or not to employ normalized steps—is proved to have … Read more

An Accelerated Communication-Efficient Primal-Dual Optimization Framework for Structured Machine Learning

Distributed optimization algorithms are essential for training machine learning models on very large-scale datasets. However, they often suffer from communication bottlenecks. Confronting this issue, a communication-efficient primal-dual coordinate ascent framework (CoCoA) and its improved variant CoCoA+ have been proposed, achieving a convergence rate of $\mathcal{O}(1/t)$ for solving empirical risk minimization problems with Lipschitz continuous losses. … Read more

A Self-Correcting Variable-Metric Algorithm Framework for Nonsmooth Optimization

An algorithm framework is proposed for minimizing nonsmooth functions. The framework is variable-metric in that, in each iteration, a step is computed using a symmetric positive definite matrix whose value is updated as in a quasi-Newton scheme. However, unlike previously proposed variable-metric algorithms for minimizing nonsmooth functions, the framework exploits self-correcting properties made possible through … Read more

An Inexact Regularized Newton Framework with a Worst-Case Iteration Complexity of $\mathcal{O}(\epsilon^{-3/2})$ for Nonconvex Optimization

An algorithm for solving smooth nonconvex optimization problems is proposed that, in the worst-case, takes $\mathcal{O}(\epsilon^{-3/2})$ iterations to drive the norm of the gradient of the objective function below a prescribed positive real number $\epsilon$ and can take $\mathcal{O}(\epsilon^{-3})$ iterations to drive the leftmost eigenvalue of the Hessian of the objective above $-\epsilon$. The proposed … Read more