A Progressive Batching L-BFGS Method for Machine Learning

The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise … Read more
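The abstract is truncated above, but the core idea (growing the sample size so that quasi-Newton curvature information is not drowned in noise) can be sketched. The following is a minimal illustration, not the paper's algorithm: it runs the standard L-BFGS two-loop recursion on batch gradients of a least-squares problem, computes each curvature pair on one fixed sample for consistency, uses a fixed steplength in place of the line search, and grows the batch geometrically. All problem data and parameters are invented for the demo.

```python
import numpy as np

def two_loop(g, s_list, y_list):
    """L-BFGS two-loop recursion: apply the inverse-Hessian approximation to g."""
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):   # newest to oldest
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if s_list:                                             # gamma scaling from the newest pair
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for s, y, a in zip(s_list, y_list, reversed(alphas)):  # oldest to newest
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q

rng = np.random.default_rng(0)
n, d = 4096, 10
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=n)

def batch_grad(x, idx):                        # gradient of 0.5*mean((A x - b)^2) on a sample
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

x = np.zeros(d)
s_list, y_list, mem = [], [], 5
batch = 64
for k in range(60):
    idx = rng.choice(n, batch, replace=False)
    g = batch_grad(x, idx)
    p = -two_loop(g, s_list, y_list)
    x_new = x + 0.5 * p                        # fixed step in place of a line search
    g_new = batch_grad(x_new, idx)             # same sample: consistent curvature pair
    s, y = x_new - x, g_new - g
    if s @ y > 1e-10:                          # keep the quasi-Newton update positive definite
        s_list.append(s); y_list.append(y)
        if len(s_list) > mem:
            s_list.pop(0); y_list.pop(0)
    x = x_new
    batch = min(n, int(1.1 * batch) + 1)       # progressively grow the batch
```

Once the batch reaches the full sample, the iteration reduces to deterministic L-BFGS, which is the regime the opening sentence describes.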

The condition of a function relative to a polytope

The condition number of a smooth convex function, namely the ratio of its smoothness to strong convexity constants, is closely tied to fundamental properties of the function. In particular, the condition number of a quadratic convex function is precisely the square of the diameter-to-width ratio of a canonical ellipsoid associated to the function. Furthermore, the … Read more
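The quadratic case mentioned above is easy to verify numerically: for f(x) = 0.5 x'Qx the smoothness and strong-convexity constants are the extreme eigenvalues of Q, and the sublevel ellipsoid {x : x'Qx <= 1} has semi-axes 1/sqrt(lambda_i). A small check (the random Q below is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.normal(size=(4, 4))
Q = B.T @ B + 0.1 * np.eye(4)            # SPD Hessian of f(x) = 0.5 x'Qx
lams = np.linalg.eigvalsh(Q)             # eigenvalues in ascending order
kappa = lams[-1] / lams[0]               # condition number: smoothness / strong convexity

# the ellipsoid {x : x'Qx <= 1} has semi-axes 1/sqrt(lam_i)
diameter = 2.0 / np.sqrt(lams[0])        # twice the longest semi-axis
width = 2.0 / np.sqrt(lams[-1])          # twice the shortest semi-axis
ratio_sq = (diameter / width) ** 2       # equals kappa, as the abstract states
```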

On self-concordant barriers for generalized power cones

In the study of interior-point methods for nonsymmetric conic optimization and their applications, Nesterov introduced the power cone, together with a 4-self-concordant barrier for it. In his PhD thesis, Chares found an improved 3-self-concordant barrier for the power cone. In addition, he introduced the generalized power cone, and conjectured a nearly optimal self-concordant barrier for … Read more

An Alternating Minimization Method for Matrix Completion Problem

In this paper, we focus on the matrix completion problem, which arises in applications in information theory, statistics, engineering, and other fields. The problem involves nonconvex rank constraints, which make it difficult to handle. Traditional approaches replace the rank constraints with a nuclear norm surrogate. The relaxed model … Read more
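The paper's own method is only summarized above; a standard alternating-minimization baseline for matrix completion is alternating least squares on a low-rank factorization M ≈ U V' over the observed entries. A hedged sketch (the data, the rank, and the small ridge term are illustrative choices, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 30, 30, 3
M = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))   # ground-truth rank-r matrix
mask = rng.random((m, n)) < 0.5                          # observed entries

U = rng.normal(size=(m, r))
V = rng.normal(size=(n, r))
lam = 1e-3                                               # small ridge term for stability
for _ in range(50):
    # fix V, solve a ridge least-squares problem for each row of U
    for i in range(m):
        cols = mask[i]
        Vi = V[cols]
        U[i] = np.linalg.solve(Vi.T @ Vi + lam * np.eye(r), Vi.T @ M[i, cols])
    # fix U, solve for each row of V
    for j in range(n):
        rows = mask[:, j]
        Uj = U[rows]
        V[j] = np.linalg.solve(Uj.T @ Uj + lam * np.eye(r), Uj.T @ M[rows, j])

err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)    # relative recovery error
```

The factorization sidesteps both the rank constraint and the nuclear norm by fixing the rank explicitly; each alternating step is a small convex least-squares problem.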

On Quasi-Newton Forward–Backward Splitting: Proximal Calculus and Convergence

We introduce a framework for quasi-Newton forward–backward splitting algorithms (proximal quasi-Newton methods) with a metric induced by diagonal +/- rank-r symmetric positive definite matrices. This special type of metric allows for a highly efficient evaluation of the proximal mapping. The key to this efficiency is a general proximal calculus in the new metric. By using … Read more
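For the purely diagonal part of such a metric, the proximal mapping of the l1 norm reduces to coordinate-wise soft-thresholding with per-coordinate thresholds, which is one instance of the efficient proximal calculus the abstract refers to. A small sketch (the diagonal D, lambda, and test vector are illustrative; the rank-r correction treated in the paper is omitted):

```python
import numpy as np

def prox_l1_diag(y, D, lam):
    """Prox of lam*||x||_1 in the metric of the diagonal matrix D:
    argmin_x lam*||x||_1 + 0.5*(x-y)' D (x-y), solved coordinate-wise."""
    t = lam / D                                   # per-coordinate threshold
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

y = np.array([3.0, -0.5, 0.2, -2.0])
D = np.array([1.0, 2.0, 4.0, 0.5])
x = prox_l1_diag(y, D, lam=1.0)                   # -> [2.0, 0.0, 0.0, 0.0]
```

Large diagonal entries shrink the threshold (the metric trusts those coordinates more), small entries enlarge it; this is what makes the evaluation O(n) despite the variable metric.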

Subsampled Inexact Newton methods for minimizing large sums of convex functions

This paper deals with the minimization of a large sum of convex functions by Inexact Newton (IN) methods employing subsampled Hessian approximations. The Conjugate Gradient method is used to compute the inexact Newton step, and global convergence is enforced by a nonmonotone line-search procedure. The aim is to obtain methods with affordable costs and fast … Read more
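The ingredients named above — a subsampled Hessian, CG on the Newton system, early CG termination — can be sketched on a ridge-regularized logistic regression. This is an illustrative skeleton, not the paper's method: the unit step below stands in for the nonmonotone line search, and all sizes, tolerances, and the regularization weight are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 2000, 5
A = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
ylab = np.sign(A @ w_star + 0.1 * rng.normal(size=n))
lam = 0.1                                         # ridge term keeps the problem strongly convex

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def full_grad(w):                                 # gradient of the full-sample logistic loss
    z = ylab * (A @ w)
    return -(A * (ylab * sigmoid(-z))[:, None]).mean(axis=0) + lam * w

def sub_hess_vec(w, v, idx):
    """Hessian-vector product evaluated on the subsample idx only."""
    Ai = A[idx]
    s = sigmoid(Ai @ w) * sigmoid(-(Ai @ w))      # sigma'(z); label-independent
    return Ai.T @ (s * (Ai @ v)) / len(idx) + lam * v

def cg(hv, b, tol=1e-2, maxit=50):
    """Plain conjugate gradient for hv(x) = b, stopped early (the 'inexact' step)."""
    x = np.zeros_like(b)
    r = b.copy(); p = r.copy(); rr = r @ r
    for _ in range(maxit):
        Ap = hv(p)
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        if np.sqrt(rr_new) <= tol * np.linalg.norm(b):
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

w = np.zeros(d)
for k in range(15):
    g = full_grad(w)
    idx = rng.choice(n, 200, replace=False)       # Hessian subsample
    w += cg(lambda v: sub_hess_vec(w, v, idx), -g)
```

The cost saving is visible in `sub_hess_vec`: each CG iteration touches 200 rows rather than all 2000, while the gradient stays exact.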

Proximal Alternating Penalty Algorithms for Nonsmooth Constrained Convex Optimization

We develop two new proximal alternating penalty algorithms to solve a wide class of constrained convex optimization problems. Our approach mainly relies on a novel combination of the classical quadratic penalty, alternating, Nesterov’s acceleration, and homotopy techniques. The first algorithm is designed to solve generic and possibly nonsmooth constrained convex problems without requiring any … Read more
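Of the ingredients listed, only the classical quadratic penalty combined with a homotopy on the penalty parameter is easy to sketch compactly; the alternating, proximal, and accelerated components of the paper's algorithms are not reproduced here. For a quadratic f the penalized subproblem can even be solved exactly (all data below is synthetic):

```python
import numpy as np

rng = np.random.default_rng(6)
d, p = 10, 3
A = rng.normal(size=(p, d))
b = rng.normal(size=p)
c = rng.normal(size=d)

# minimize f(x) = 0.5||x||^2 + c'x subject to Ax = b, via quadratic penalty
x = np.zeros(d)
for beta in [1e0, 1e2, 1e4, 1e6]:              # homotopy: tighten the penalty
    # exact minimizer of f(x) + (beta/2)||Ax - b||^2 (possible since f is quadratic)
    x = np.linalg.solve(np.eye(d) + beta * A.T @ A, beta * A.T @ b - c)
```

As beta grows, the constraint residual ||Ax - b|| shrinks like O(1/beta); the homotopy avoids solving an ill-conditioned system from a cold start.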

Douglas-Rachford Splitting for Pathological Convex Optimization

Despite the vast literature on Douglas-Rachford splitting (DRS), there has been very little work analyzing its behavior under pathologies. Most analyses assume a primal solution exists, a dual solution exists, and strong duality holds. When these assumptions are not met, i.e., under pathologies, the theory often breaks down and the empirical performance may degrade significantly. In this … Read more

Non-stationary Douglas-Rachford and alternating direction method of multipliers: adaptive stepsizes and convergence

We revisit the classical Douglas-Rachford (DR) method for finding a zero of the sum of two maximal monotone operators. Since the practical performance of the DR method crucially depends on the stepsizes, we aim to develop an adaptive stepsize rule. To that end, we take a closer look at a linear case of the problem … Read more
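The basic DR iteration that the abstract builds on can be sketched with a fixed stepsize gamma; the paper's adaptive rule is not reproduced. Below, DR solves a small lasso-type problem min 0.5||Ax - b||^2 + lam||x||_1, where both proximal operators are available in closed form (the data, lam, and gamma are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
m, d = 40, 20
A = rng.normal(size=(m, d))
b = rng.normal(size=m)
lam, gamma = 0.5, 0.05               # fixed DR stepsize (illustrative choice)

# prox of f(x) = 0.5||Ax - b||^2: solve (I + gamma A'A) x = z + gamma A'b
H = np.eye(d) + gamma * A.T @ A
def prox_f(z):
    return np.linalg.solve(H, z + gamma * A.T @ b)

def prox_g(z):                       # prox of lam*||x||_1: soft-thresholding
    return np.sign(z) * np.maximum(np.abs(z) - gamma * lam, 0.0)

z = np.zeros(d)
for _ in range(500):
    x = prox_f(z)
    y = prox_g(2 * x - z)            # reflected step
    z = z + y - x                    # DR update of the governing sequence
x = prox_f(z)                        # x approaches a solution of the lasso problem
```

The sensitivity the abstract mentions is easy to observe here: rescaling gamma by a couple of orders of magnitude slows the fixed-point convergence dramatically, which motivates an adaptive rule.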

Simplified Versions of the Conditional Gradient Method

We suggest simple modifications of the conditional gradient method for smooth optimization problems that maintain its basic convergence properties but substantially reduce the implementation cost of each iteration. Namely, we propose a step-size procedure that requires no line search, together with inexact solution of the direction-finding subproblem. Preliminary results of computational tests confirm the efficiency of the proposed … Read more
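A conditional gradient (Frank-Wolfe) iteration with a predefined step size — one way to avoid line search, in the spirit of the modification described — looks as follows on a quadratic over the unit simplex, where the direction-finding subproblem is solved exactly by picking a vertex. The problem data and the classical 2/(k+2) rule are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 50
R = rng.normal(size=(d, d))
Q = R.T @ R / d + np.eye(d)              # SPD Hessian of the quadratic objective
c = rng.normal(size=d)

# minimize f(x) = 0.5 x'Qx - c'x over the unit simplex
x = np.zeros(d)
x[0] = 1.0                               # start at a vertex
for k in range(500):
    g = Q @ x - c
    s = np.zeros(d)
    s[np.argmin(g)] = 1.0                # direction-finding subproblem: best simplex vertex
    step = 2.0 / (k + 2.0)               # predefined step size, no line search
    x += step * (s - x)

g = Q @ x - c
gap = g @ x - g.min()                    # Frank-Wolfe duality gap, certifies near-optimality
```

Each iteration costs one gradient and one argmin over coordinates; removing the line search eliminates the extra function evaluations that usually dominate per-iteration cost.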