A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving ERM problems with a nonsmooth regularization term. Current second-order and quasi-Newton methods for this problem either do not work well in the distributed setting or work only for specific regularizers. Our algorithm uses successive quadratic approximations, and we describe how to … Read more

Inexact Successive Quadratic Approximation for Regularized Optimization

Successive quadratic approximations, or second-order proximal methods, are useful for minimizing functions that are a sum of a smooth part and a convex, possibly nonsmooth part that promotes regularization. Most analyses of iteration complexity focus on the special case of proximal gradient method, or accelerated variants thereof. There have been only a few studies of … Read more

Using Neural Networks to Detect Line Outages from PMU Data

We propose an approach based on neural networks and the AC power flow equations to identify single- and double- line outages in a power grid using the information from phasor measurement unit sensors (PMUs). Rather than inferring the outage from the sensor data by inverting the physical model, our approach uses the AC model to … Read more

Distributed Block-diagonal Approximation Methods for Regularized Empirical Risk Minimization

Designing distributed algorithms for empirical risk minimization (ERM) has become an active research topic in recent years because of the practical need to deal with the huge volume of data. In this paper, we propose a general framework for training an ERM model via solving its dual problem in parallel over multiple machines. Our method … Read more

Analyzing Random Permutations for Cyclic Coordinate Descent

We consider coordinate descent methods on convex quadratic problems, in which exact line searches are performed at each iteration. (This algorithm is identical to Gauss-Seidel on the equivalent symmetric positive definite linear system.) We describe a class of convex quadratic problems for which the random-permutations version of cyclic coordinate descent (RPCD) outperforms the standard cyclic … Read more

Random permutations fix a worst case for cyclic coordinate descent

Variants of the coordinate descent approach for minimizing a nonlinear function are distinguished in part by the order in which coordinates are considered for relaxation. Three common orderings are cyclic (CCD), in which we cycle through the components of $x$ in order; randomized (RCD), in which the component to update is selected randomly and independently … Read more