Complexity of a Projected Newton-CG Method for Optimization with Bounds

This paper describes a method for solving smooth nonconvex minimization problems subject to bound constraints with good worst-case complexity and practical performance. The method contains elements of two existing methods: the classical gradient projection approach for bound-constrained optimization and a recently proposed Newton-conjugate gradient algorithm for unconstrained nonconvex optimization. Using a new definition of approximate … Read more

A Line-Search Descent Algorithm for Strict Saddle Functions with Complexity Guarantees

We describe a line-search algorithm which achieves the best-known worst-case complexity results for problems with a certain “strict saddle” property that has been observed to hold in low-rank matrix optimization problems. Our algorithm is adaptive, in the sense that it makes use of backtracking line searches and does not require prior knowledge of the parameters … Read more

Adversarial Classification via Distributional Robustness with Wasserstein Ambiguity

We study a model for adversarial classification based on distributionally robust chance constraints. We show that under Wasserstein ambiguity, the model aims to minimize the conditional value-at-risk of the distance to misclassification, and we explore links to adversarial classification models proposed earlier and to maximum-margin classifiers. We also provide a reformulation of the distributionally robust … Read more

Variable Smoothing for Weakly Convex Composite Functions

We study minimization of a structured objective function, being the sum of a smooth function and a composition of a weakly convex function with a linear operator. Applications include image reconstruction problems with regularizers that introduce less bias than the standard convex regularizers. We develop a variable smoothing algorithm, based on the Moreau envelope with … Read more

A Distributed Quasi-Newton Algorithm for Primal and Dual Regularized Empirical Risk Minimization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving empirical risk minimization (ERM) problems with a nonsmooth regularization term. Our algorithm is applicable to both the primal and the dual ERM problem. Current second-order and quasi-Newton methods for this problem either do not work well in the distributed setting or … Read more

Trust-Region Newton-CG with Strong Second-Order Complexity Guarantees for Nonconvex Optimization

Worst-case complexity guarantees for nonconvex optimization algorithms have been a topic of growing interest. Multiple frameworks that achieve the best known complexity bounds among a broad class of first- and second-order strategies have been proposed. These methods have often been designed primarily with complexity guarantees in mind and, as a result, represent a departure from … Read more

Computing Estimators of Dantzig Selector type via Column and Constraint Generation

We consider a class of linear-programming based estimators in reconstructing a sparse signal from linear measurements. Specific formulations of the reconstruction problem considered here include Dantzig selector, basis pursuit (for the case in which the measurements contain no errors), and the fused Dantzig selector (for the case in which the underlying signal is piecewise constant). … Read more

Complexity of Proximal augmented Lagrangian for nonconvex optimization with nonlinear equality constraints

We analyze worst-case complexity of a Proximal augmented Lagrangian (Proximal AL) framework for nonconvex optimization with nonlinear equality constraints. When an approximate first-order (second-order) optimal point is obtained in the subproblem, an $\epsilon$ first-order (second-order) optimal point for the original problem can be guaranteed within $\mathcal{O}(1/ \epsilon^{2 – \eta})$ outer iterations (where $\eta$ is a … Read more

A Log-Barrier Newton-CG Method for Bound Constrained Optimization with Complexity Guarantees

We describe an algorithm based on a logarithmic barrier function, Newton’s method, and linear conjugate gradients, that obtains an approximate minimizer of a smooth function over the nonnegative orthant. We develop a bound on the complexity of the approach, stated in terms of the required accuracy and the cost of a single gradient evaluation of … Read more

First-Order Algorithms Converge Faster than (1/k)$ on Convex Problems

It is well known that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value, when applied to a scheme for minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate … Read more