A Log-Barrier Newton-CG Method for Bound Constrained Optimization with Complexity Guarantees

We describe an algorithm, based on a logarithmic barrier function, Newton’s method, and linear conjugate gradients, that obtains an approximate minimizer of a smooth function over the nonnegative orthant. We develop a bound on the complexity of the approach, stated in terms of the required accuracy and the cost of a single gradient evaluation of … Read more
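For readers unfamiliar with the ingredients, the following is a minimal sketch of the generic log-barrier plus Newton-CG idea for minimizing f(x) subject to x ≥ 0, assuming f is convex and twice differentiable so that plain conjugate gradients applies. It illustrates the components named above, not the paper's algorithm or the safeguards behind its complexity bound; the function and parameter names are ours.

import numpy as np
from scipy.sparse.linalg import cg

def barrier_newton_cg(f_grad, f_hess, x0, mu0=1.0, shrink=0.2, tol=1e-6, max_outer=20):
    # Minimize f(x) over x >= 0 by approximately minimizing the barrier
    # subproblems f(x) - mu * sum(log x) for a decreasing sequence of mu.
    # Sketch only: no line search on f, no nonconvexity safeguards.
    x = np.asarray(x0, dtype=float)
    assert np.all(x > 0), "start strictly inside the nonnegative orthant"
    mu = mu0
    for _ in range(max_outer):
        for _ in range(50):                       # Newton iterations on the subproblem
            g = f_grad(x) - mu / x                # gradient of the barrier function
            if np.linalg.norm(g) <= tol:
                break
            H = f_hess(x) + np.diag(mu / x**2)    # Hessian of the barrier function
            d, _ = cg(H, -g)                      # inexact Newton step via linear CG
            t = 1.0                               # fraction-to-boundary: keep x + t*d > 0
            neg = d < 0
            if np.any(neg):
                t = min(1.0, 0.99 * np.min(x[neg] / (-d[neg])))
            x = x + t * d
        mu *= shrink                              # tighten the barrier
    return x

# Example: min ||x - c||^2 over x >= 0, whose solution is max(c, 0) componentwise.
c = np.array([1.0, -2.0, 0.5])
x_star = barrier_newton_cg(lambda x: 2 * (x - c), lambda x: 2 * np.eye(3), np.ones(3))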

First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems

It is well known that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value, when applied to a scheme for minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate … Read more
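For context, the classical guarantee referred to above is, for gradient descent with step length $1/L$ applied to a convex function $f$ with $L$-Lipschitz gradient and minimizer $x^*$, the standard bound $f(x_k) - f(x^*) \le \frac{L \, \|x_0 - x^*\|^2}{2k}$, which is $O(1/k)$. The improvement to $o(1/k)$ stated in the excerpt means that the left-hand side in fact decays faster than any fixed multiple of $1/k$, that is, $k \, \bigl(f(x_k) - f(x^*)\bigr) \to 0$.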

Parallelizing Subgradient Methods for the Lagrangian Dual in Stochastic Mixed-Integer Programming

The dual decomposition of stochastic mixed-integer programs can be solved by the projected subgradient algorithm. We show how to make this algorithm more amenable to parallelization in a master-worker model, describing two approaches that can be combined in a natural way. The first approach partitions the scenarios into batches, and makes separate use of … Read more
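For reference, the serial method being parallelized here is the textbook projected subgradient iteration for maximizing a concave, nonsmooth dual function $q(\lambda)$ over a set $\Lambda$. Below is a minimal sketch under standard assumptions (diminishing steps), with the scenario batching and master-worker organization described above omitted; the callables value_and_subgrad and project are hypothetical placeholders supplied by the user.

import numpy as np

def projected_subgradient(value_and_subgrad, project, lam0, steps=200, a0=1.0):
    # Maximize a concave dual function q(lam) over a set Lambda.
    # value_and_subgrad(lam) returns (q(lam), g) with g a subgradient of q at lam,
    # typically obtained by solving the scenario subproblems; project maps onto Lambda.
    lam = project(np.asarray(lam0, dtype=float))
    best_lam, best_val = lam, -np.inf
    for k in range(1, steps + 1):
        val, g = value_and_subgrad(lam)
        if val > best_val:
            best_lam, best_val = lam, val
        lam = project(lam + (a0 / np.sqrt(k)) * g)   # diminishing step, then project
    return best_lam, best_val

# Toy usage: maximize q(lam) = -|lam - 3| over lam >= 0 (maximizer lam = 3).
q = lambda l: (-abs(l[0] - 3.0), np.array([-np.sign(l[0] - 3.0)]))
lam_best, _ = projected_subgradient(q, lambda l: np.maximum(l, 0.0), np.array([0.0]))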

Inexact Variable Metric Stochastic Block-Coordinate Descent for Regularized Optimization

Block-coordinate descent (BCD) is a popular framework for large-scale regularized optimization problems with block-separable structure. Existing methods have several limitations. They often assume that subproblems can be solved exactly at each iteration, which in practical terms usually restricts the quadratic term in the subproblem to be diagonal, thus losing most of the benefits of higher-order … Read more
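As a baseline, the "diagonal quadratic term" special case mentioned above reduces, for an $\ell_1$ regularizer, to randomized proximal BCD with a scalar step per block. The sketch below shows that baseline under those assumptions, not the inexact variable-metric method developed here; the names and the toy example are ours.

import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_bcd(grad_f, x0, blocks, step, lam, iters=500, seed=0):
    # Randomized proximal block-coordinate descent for min_x f(x) + lam * ||x||_1,
    # using the scalar quadratic term (1/(2*step)) * ||d_b||^2 in every block subproblem.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        b = blocks[rng.integers(len(blocks))]                 # pick a block at random
        g_b = grad_f(x)[b]                                    # block of the gradient
        x[b] = soft_threshold(x[b] - step * g_b, step * lam)  # block proximal step
    return x

# Example: l1-regularized least squares, f(x) = 0.5 * ||A x - y||^2.
A = np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]])
y = np.array([1.0, 2.0])
x_hat = prox_bcd(lambda x: A.T @ (A @ x - y), np.zeros(3),
                 blocks=[np.array([0, 1]), np.array([2])], step=0.05, lam=0.1)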

A Newton-CG Algorithm with Complexity Guarantees for Smooth Unconstrained Optimization

We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton’s method and linear conjugate gradients, with explicit detection and use of negative curvature directions for the Hessian of the objective function. The algorithm closely tracks the Newton-conjugate gradient procedures developed in the 1980s, but includes enhancements that allow worst-case complexity … Read more
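The basic mechanism can be illustrated in a few lines: run linear CG on the Newton system $H d = -g$ and, if a direction of nonpositive curvature is encountered, return it (suitably signed) as a descent direction instead of the Newton step. The sketch below shows only that stripped-down mechanism; the capping, scaling, and accuracy conditions that yield the worst-case complexity guarantees are not shown.

import numpy as np

def newton_cg_step(H, g, tol=1e-8, maxiter=None):
    # Run linear CG on H d = -g, monitoring curvature p^T H p along the way.
    # Returns either an (inexact) Newton direction or a negative-curvature direction.
    n = g.size
    maxiter = maxiter or 2 * n
    d = np.zeros(n)
    r = g.copy()                          # residual of H d + g = 0 at d = 0
    p = -r
    for _ in range(maxiter):
        Hp = H @ p
        curv = p @ Hp
        if curv <= 0:                     # nonpositive curvature detected
            direction = p if p @ g <= 0 else -p   # sign it as a descent direction
            return direction, "negative_curvature"
        alpha = (r @ r) / curv
        d += alpha * p
        r_new = r + alpha * Hp
        if np.linalg.norm(r_new) <= tol:
            break
        p = -r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d, "newton"

# Example: an indefinite Hessian triggers the negative-curvature branch.
d, kind = newton_cg_step(np.diag([1.0, -2.0]), np.array([1.0, 1.0]))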

A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving ERM problems with a nonsmooth regularization term. Current second-order and quasi-Newton methods for this problem either do not work well in the distributed setting or work only for specific regularizers. Our algorithm uses successive quadratic approximations, and we describe how to … Read more

Inexact Successive Quadratic Approximation for Regularized Optimization

Successive quadratic approximations, or second-order proximal methods, are useful for minimizing functions that are a sum of a smooth part and a convex, possibly nonsmooth part that promotes regularization. Most analyses of iteration complexity focus on the special case of the proximal gradient method, or accelerated variants thereof. There have been only a few studies of … Read more
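In generic form, each iteration of such a method builds the model $m_k(d) = \nabla f(x_k)^T d + \tfrac12 d^T H_k d + \psi(x_k + d)$ around the current iterate and minimizes it only approximately. Below is a minimal sketch of one such step, taking $\psi = \lambda\|\cdot\|_1$ as a concrete regularizer and a few proximal-gradient iterations as the inexact subproblem solver; it assumes $H$ is positive semidefinite and is not tied to the specific inexactness conditions analyzed in the paper.

import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def inexact_sqa_step(x, grad, H, lam, inner_iters=10):
    # One successive-quadratic-approximation (proximal Newton) step for
    #   min_x f(x) + lam * ||x||_1,
    # where grad is the gradient of f at x and H is a positive semidefinite
    # Hessian (or quasi-Newton) approximation.  The model
    #   grad^T d + 0.5 * d^T H d + lam * ||x + d||_1
    # is minimized only approximately, by a few proximal-gradient iterations.
    L = np.linalg.norm(H, 2)              # Lipschitz constant of the model gradient
    d = np.zeros_like(x)
    for _ in range(inner_iters):
        model_grad = grad + H @ d
        d = soft_threshold(x + d - model_grad / L, lam / L) - x
    return x + d

# Example step on f(x) = 0.5 * x^T H x - b^T x with its exact Hessian.
H = np.array([[2.0, 0.5], [0.5, 1.0]]); b = np.array([1.0, -1.0]); x0 = np.zeros(2)
x1 = inexact_sqa_step(x0, H @ x0 - b, H, lam=0.1)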

Efficient Convex Optimization for Linear MPC

Model predictive control (MPC) formulations with linear dynamics and quadratic objectives can be solved efficiently by using a primal-dual interior-point framework, with complexity proportional to the length of the horizon. An alternative, which is better able to exploit the similarity of the problems that are solved at each decision point of linear MPC, is to … Read more
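Concretely, the problem solved at each decision point is a convex quadratic program of the form (stated here generically; the notation is ours, not necessarily the paper's)

$$\min_{x,u}\ \sum_{k=0}^{N-1}\Bigl(\tfrac12 x_k^T Q x_k + \tfrac12 u_k^T R u_k\Bigr) + \tfrac12 x_N^T Q_N x_N \quad\text{subject to}\quad x_{k+1} = A x_k + B u_k,\ \ k=0,\dots,N-1,$$

with $x_0$ given and, typically, bounds or polyhedral constraints on the states $x_k$ and inputs $u_k$. The stagewise (banded) structure of the resulting KKT systems is what allows each interior-point iteration to be carried out at a cost that grows only linearly with the horizon length $N$.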

Using Neural Networks to Detect Line Outages from PMU Data

We propose an approach based on neural networks and the AC power flow equations to identify single- and double-line outages in a power grid using the information from phasor measurement unit (PMU) sensors. Rather than inferring the outage from the sensor data by inverting the physical model, our approach uses the AC model to … Read more
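To make the classification viewpoint concrete, the sketch below treats outage identification as generic multiclass classification over PMU-style features, with synthetic data standing in for simulated AC power-flow solutions. The feature construction, network architecture, and data here are placeholders of ours and do not reproduce the paper's setup.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_buses, n_outage_classes, n_samples = 30, 10, 2000
# Stand-in features: e.g. voltage magnitudes and phase angles at monitored buses.
X = rng.normal(size=(n_samples, 2 * n_buses))
# Stand-in labels: each class corresponds to one single- or double-line outage scenario.
y = rng.integers(n_outage_classes, size=n_samples)

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=200)
clf.fit(X, y)
predicted_outages = clf.predict(X[:5])    # predicted outage class for new PMU snapshots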

Behavior of accelerated gradient methods near critical points of nonconvex functions

We examine the behavior of accelerated gradient methods in smooth nonconvex unconstrained optimization, focusing in particular on their behavior near strict saddle points. Accelerated methods are iterative methods that typically step along a direction that is a linear combination of the previous step and the gradient of the function evaluated at a point at or … Read more
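The step described above can be written explicitly. For the heavy-ball method,

$$x_{k+1} = x_k - \alpha \nabla f(x_k) + \beta\,(x_k - x_{k-1}),$$

and for Nesterov's accelerated gradient method,

$$x_{k+1} = x_k + \beta\,(x_k - x_{k-1}) - \alpha \nabla f\bigl(x_k + \beta\,(x_k - x_{k-1})\bigr),$$

so in both cases the new step combines the previous step $x_k - x_{k-1}$ with a gradient evaluated either at the current iterate (heavy ball) or at an extrapolated point near it (Nesterov). These are the standard formulas, quoted here only for context.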