An Accelerated Communication-Efficient Primal-Dual Optimization Framework for Structured Machine Learning

Distributed optimization algorithms are essential for training machine learning models on very large-scale datasets. However, they often suffer from communication bottlenecks. Confronting this issue, a communication-efficient primal-dual coordinate ascent framework (CoCoA) and its improved variant CoCoA+ have been proposed, achieving a convergence rate of $\mathcal{O}(1/t)$ for solving empirical risk minimization problems with Lipschitz continuous losses. … Read more

A Self-Correcting Variable-Metric Algorithm Framework for Nonsmooth Optimization

An algorithm framework is proposed for minimizing nonsmooth functions. The framework is variable-metric in that, in each iteration, a step is computed using a symmetric positive definite matrix whose value is updated as in a quasi-Newton scheme. However, unlike previously proposed variable-metric algorithms for minimizing nonsmooth functions, the framework exploits self-correcting properties made possible through … Read more

An Inexact Regularized Newton Framework with a Worst-Case Iteration Complexity of $\mathcal{O}(\epsilon^{-3/2})$ for Nonconvex Optimization

An algorithm for solving smooth nonconvex optimization problems is proposed that, in the worst-case, takes $\mathcal{O}(\epsilon^{-3/2})$ iterations to drive the norm of the gradient of the objective function below a prescribed positive real number $\epsilon$ and can take $\mathcal{O}(\epsilon^{-3})$ iterations to drive the leftmost eigenvalue of the Hessian of the objective above $-\epsilon$. The proposed … Read more

Exploiting Negative Curvature in Deterministic and Stochastic Optimization

This paper addresses the question of whether it can be beneficial for an optimization algorithm to follow directions of negative curvature. Although some prior work has established convergence results for algorithms that integrate both descent and negative curvature directions, there has not yet been numerical evidence showing that such methods offer significant performance improvements. In … Read more

Complexity Analysis of a Trust Funnel Algorithm for Equality Constrained Optimization

A method is proposed for solving equality constrained nonlinear optimization problems involving twice continuously differentiable functions. The method employs a trust funnel approach consisting of two phases: a first phase to locate an $\epsilon$-feasible point and a second phase to seek optimality while maintaining at least $\epsilon$-feasibility. A two-phase approach of this kind based on … Read more

R-Linear Convergence of Limited Memory Steepest Descent

The limited memory steepest descent method (LMSD) proposed by Fletcher is an extension of the Barzilai-Borwein “two-point step size” strategy for steepest descent methods for solving unconstrained optimization problems. It is known that the Barzilai-Borwein strategy yields a method with an R-linear rate of convergence when it is employed to minimize a strongly convex quadratic. … Read more

A Sequential Algorithm for Solving Nonlinear Optimization Problems with Chance Constraints

An algorithm is presented for solving nonlinear optimization problems with chance constraints, i.e., those in which a constraint involving an uncertain parameter must be satisfied with at least a minimum probability. In particular, the algorithm is designed to solve cardinality-constrained nonlinear optimization problems that arise in sample average approximations of chance-constrained problems, as well as … Read more

Optimization Methods for Large-Scale Machine Learning

This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications. Through case studies on text classification and the training of deep neural networks, we discuss how optimization problems arise in machine learning and what makes them challenging. A major theme of … Read more

A Reduced-Space Algorithm for Minimizing $\ell_1hBcRegularized Convex Functions

We present a new method for minimizing the sum of a differentiable convex function and an $\ell_1$-norm regularizer. The main features of the new method include: $(i)$ an evolving set of indices corresponding to variables that are predicted to be nonzero at a solution (i.e., the support); $(ii)$ a reduced-space subproblem defined in terms of … Read more

A BFGS-SQP Method for Nonsmooth, Nonconvex, Constrained Optimization and its Evaluation using Relative Minimization Profiles

We propose an algorithm for solving nonsmooth, nonconvex, constrained optimization problems as well as a new set of visualization tools for comparing the performance of optimization algorithms. Our algorithm is a sequential quadratic optimization method that employs Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton Hessian approximations and an exact penalty function whose parameter is controlled using a steering strategy. … Read more