First-Order Algorithms Converge Faster than (1/k)$ on Convex Problems

It is well known that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value, when applied to a scheme for minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate … Read more

Parallelizing Subgradient Methods for the Lagrangian Dual in Stochastic Mixed-Integer Programming

The dual decomposition of stochastic mixed-integer programs can be solved by the projected subgradient algorithm. We show how to make this algorithm more amenable to parallelization in a master-worker model by describing two approaches, which can be combined in a natural way. The first approach partitions the scenarios into batches, and makes separate use of … Read more

Inexact Variable Metric Stochastic Block-Coordinate Descent for Regularized Optimization

Block-coordinate descent (BCD) is a popular framework for large-scale regularized optimization problems with block-separable structure. Existing methods have several limitations. They often assume that subproblems can be solved exactly at each iteration, which in practical terms usually restricts the quadratic term in the subproblem to be diagonal, thus losing most of the benefits of higher-order … Read more

A Newton-CG Algorithm with Complexity Guarantees for Smooth Unconstrained Optimization

We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton’s method and linear conjugate gradient, with explicit detection and use of negative curvature directions for the Hessian of the objective function. The algorithm tracks Newton-conjugate gradient procedures developed in the 1980s closely, but includes enhancements that allow worst-case complexity … Read more

A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving ERM problems with a nonsmooth regularization term. Current second-order and quasi-Newton methods for this problem either do not work well in the distributed setting or work only for specific regularizers. Our algorithm uses successive quadratic approximations, and we describe how to … Read more

Inexact Successive Quadratic Approximation for Regularized Optimization

Successive quadratic approximations, or second-order proximal methods, are useful for minimizing functions that are a sum of a smooth part and a convex, possibly nonsmooth part that promotes regularization. Most analyses of iteration complexity focus on the special case of proximal gradient method, or accelerated variants thereof. There have been only a few studies of … Read more

Efficient Convex Optimization for Linear MPC

Model predictive control (MPC) formulations with linear dynamics and quadratic objectives can be solved efficiently by using a primal-dual interior-point framework, with complexity proportional to the length of the horizon. An alternative, which is more able to exploit the similarity of the problems that are solved at each decision point of linear MPC, is to … Read more

Using Neural Networks to Detect Line Outages from PMU Data

We propose an approach based on neural networks and the AC power flow equations to identify single- and double- line outages in a power grid using the information from phasor measurement unit sensors (PMUs). Rather than inferring the outage from the sensor data by inverting the physical model, our approach uses the AC model to … Read more

Behavior of accelerated gradient methods near critical points of nonconvex functions

We examine the behavior of accelerated gradient methods in smooth nonconvex unconstrained optimization, focusing in particular on their behavior near strict saddle points. Accelerated methods are iterative methods that typically step along a direction that is a linear combination of the previous step and the gradient of the function evaluated at a point at or … Read more

Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization

There has been much recent interest in finding unconstrained local minima of smooth functions, due in part of the prevalence of such problems in machine learning and robust statistics. A particular focus is algorithms with good complexity guarantees. Second-order Newton-type methods that make use of regularization and trust regions have been analyzed from such a … Read more