Identifiability, the KL property in metric spaces, and subgradient curves

Identifiability, and the closely related idea of partial smoothness, unify classical active set methods and more general notions of solution structure. Diverse optimization algorithms generate iterates in discrete time that are eventually confined to identifiable sets. We present two fresh perspectives on identifiability. The first distills the notion to a simple metric property, applicable not …

Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization

For strongly convex objectives that are smooth, the classical theory of gradient descent ensures linear convergence relative to the number of gradient evaluations. An analogous nonsmooth theory is challenging. Even when the objective is smooth at every iterate, the corresponding local models are unstable and the number of cutting planes invoked by traditional remedies is …
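As a small illustration of the classical smooth theory mentioned in this abstract, the sketch below runs fixed-step gradient descent on a hypothetical quadratic f(x) = 0.5*(MU*x1^2 + L*x2^2). The constants MU = 1 and L = 10, the starting point, and all function names are assumptions chosen for illustration; they do not come from the paper. For this objective, with step 1/L, the distance to the minimizer shrinks by a factor of at most 1 - MU/L per gradient evaluation.

```python
import math

MU, L = 1.0, 10.0  # assumed strong-convexity and smoothness constants

def grad(x):
    """Gradient of the illustrative quadratic f(x) = 0.5*(MU*x1^2 + L*x2^2)."""
    return (MU * x[0], L * x[1])

def gradient_descent(x0, steps):
    """Fixed-step gradient descent (step 1/L); returns distances to the minimizer 0."""
    x = tuple(x0)
    dists = [math.hypot(x[0], x[1])]
    for _ in range(steps):
        g = grad(x)
        x = (x[0] - g[0] / L, x[1] - g[1] / L)
        dists.append(math.hypot(x[0], x[1]))
    return dists

dists = gradient_descent((1.0, 1.0), 50)
rate = 1.0 - MU / L  # contraction factor predicted for this quadratic
```

Each entry of `dists` is at most `rate` times the previous one, which is the linear convergence referred to above. The sketch says nothing about the nonsmooth setting that the paper addresses.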

The structure of conservative gradient fields

The classical Clarke subdifferential alone is inadequate for understanding automatic differentiation in nonsmooth contexts. Instead, we can sometimes rely on enlarged generalized gradients called “conservative fields”, defined through the natural path-wise chain rule: one application is the convergence analysis of gradient-based deep learning algorithms. In the semi-algebraic case, we show that all conservative fields are …

Disk matrices and the proximal mapping for the numerical radius

Optimal matrices for problems involving the matrix numerical radius often have fields of values that are disks, a phenomenon associated with partial smoothness. Such matrices are highly structured: we experiment in particular with the proximal mapping for the radius, which often maps n-by-n random matrix inputs into a particular manifold of disk matrices that has …

A simple Newton method for local nonsmooth optimization

Superlinear convergence has been an elusive goal for black-box nonsmooth optimization. Even in the convex case, the subgradient method is very slow, and while some cutting plane algorithms, including traditional bundle methods, are popular in practice, local convergence is still sluggish. Faster variants depend either on problem structure or on analyses that elide sequences of …

Active-set Newton methods and partial smoothness

Diverse optimization algorithms correctly identify, in finite time, intrinsic constraints that must be active at optimality. Analogous behavior extends beyond optimization to systems involving partly smooth operators, and in particular to variational inequalities over partly smooth sets. As in classical nonlinear programming, such active-set structure underlies the design of accelerated local algorithms of Newton type. …

Partial smoothness of the numerical radius at matrices whose fields of values are disks

Solutions to optimization problems involving the numerical radius often belong to a special class: the set of matrices having field of values a disk centered at the origin. After illustrating this phenomenon with some examples, we illuminate it by studying matrices around which this set of “disk matrices” is a manifold with respect to which …
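To make the notion of a disk matrix concrete, here is a minimal sketch that evaluates the support function of the field of values of a 2-by-2 matrix: the largest eigenvalue of the Hermitian part of exp(-i*theta)*A, computed in closed form. The field of values is a disk centered at the origin exactly when this function is constant in theta, and the numerical radius is its maximum over theta. The function names, the grid of 360 angles, and the two test matrices are assumptions for illustration, not taken from the paper.

```python
import cmath
import math

def support(A, theta):
    """Support function of the field of values of a 2x2 matrix A in direction theta.

    Equals the largest eigenvalue of the Hermitian part of exp(-i*theta) * A,
    computed here in closed form for the 2x2 case.
    """
    (a11, a12), (a21, a22) = A
    w = cmath.exp(-1j * theta)
    h11 = (w * a11).real
    h22 = (w * a22).real
    h12 = (w * a12 + (w * a21).conjugate()) / 2
    return (h11 + h22) / 2 + math.sqrt(((h11 - h22) / 2) ** 2 + abs(h12) ** 2)

def support_values(A, n=360):
    """Sample the support function on a uniform grid of n angles."""
    return [support(A, 2 * math.pi * k / n) for k in range(n)]

jordan = ((0, 1), (0, 0))    # nilpotent Jordan block
diagonal = ((1, 0), (0, 0))  # field of values is the segment [0, 1]

jordan_vals = support_values(jordan)
diagonal_vals = support_values(diagonal)
```

For the Jordan block the sampled values are constant at 1/2, so its field of values is the disk of radius 1/2 centered at the origin. For the diagonal matrix the values vary between 0 and 1, so it is not a disk matrix, and its numerical radius is `max(diagonal_vals)`.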

Inexact alternating projections on nonconvex sets

Given two arbitrary closed sets in Euclidean space, a simple transversality condition guarantees that the method of alternating projections converges locally, at linear rate, to a point in the intersection. Exact projection onto nonconvex sets is typically intractable, but we show that computationally cheap inexact projections may suffice instead. In particular, if one set is defined …
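The basic method of alternating projections can be sketched on a toy example where both projections happen to be exact and cheap: the unit circle (a nonconvex set) and the horizontal line y = 0.5, which intersect transversally at (sqrt(0.75), 0.5). The sets, the starting point, and the function names are assumptions chosen for illustration; the inexact projections studied in the paper are not modeled here.

```python
import math

def project_circle(p):
    """Nearest point on the unit circle (a nonconvex set); assumes p != (0, 0)."""
    n = math.hypot(p[0], p[1])
    return (p[0] / n, p[1] / n)

def project_line(p, height=0.5):
    """Nearest point on the horizontal line y = height."""
    return (p[0], height)

def alternating_projections(p, iters=40):
    """Alternate the two projections; record the distance to the known intersection point."""
    target = (math.sqrt(0.75), 0.5)
    errors = []
    for _ in range(iters):
        p = project_line(project_circle(p))
        errors.append(math.hypot(p[0] - target[0], p[1] - target[1]))
    return p, errors

point, errors = alternating_projections((2.0, 1.0))
```

In this example the error shrinks by a roughly constant factor at each sweep, which is the local linear convergence described above.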

Gradient Sampling Methods for Nonsmooth Optimization

This paper reviews the gradient sampling methodology for solving nonsmooth, nonconvex optimization problems. An intuitively straightforward gradient sampling algorithm is stated and its convergence properties are summarized. Throughout this discussion, we emphasize the simplicity of gradient sampling as an extension of the steepest descent method for minimizing smooth objectives. We then provide overviews of various …
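The flavor of gradient sampling as an extension of steepest descent can be conveyed by a deliberately simplified one-dimensional sketch. It is not the algorithm as stated in the paper: in one dimension the convex hull of the sampled derivatives is an interval, so its minimum-norm element is found directly rather than by solving a quadratic program. The objective f(x) = |x| + 0.5*x^2, the parameter values, and all names are assumptions for illustration.

```python
import random

def f(x):
    """Illustrative nonsmooth objective with minimizer 0."""
    return abs(x) + 0.5 * x * x

def df(x):
    """Derivative of f, valid wherever x != 0."""
    return (1.0 if x > 0 else -1.0) + x

def min_norm_1d(grads):
    """Minimum-norm element of the convex hull (an interval) of sampled derivatives."""
    lo, hi = min(grads), max(grads)
    if lo <= 0.0 <= hi:
        return 0.0
    return lo if lo > 0.0 else hi

def gradient_sampling(x, eps=0.5, samples=5, iters=200, seed=0):
    rng = random.Random(seed)
    for _ in range(iters):
        # Sample derivatives at x and at random points within eps of x.
        points = [x] + [x + rng.uniform(-eps, eps) for _ in range(samples)]
        g = min_norm_1d([df(p) for p in points if p != 0.0])
        if g == 0.0:
            eps *= 0.5  # approximately stationary at this radius: shrink it
            continue
        # Backtracking line search along -g with a sufficient-decrease test.
        t = 1.0
        while t > 1e-12 and f(x - t * g) > f(x) - 1e-4 * t * g * g:
            t *= 0.5
        if t > 1e-12:
            x -= t * g
        else:
            eps *= 0.5  # line search failed: shrink the sampling radius
    return x, eps

x_final, eps_final = gradient_sampling(3.0)
```

When every sampled derivative has the same sign, the step reduces to an ordinary steepest descent step. Sampling matters only near the kink at 0, where derivatives of both signs appear and the sampling radius is reduced.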

BFGS convergence to nonsmooth minimizers of convex functions

The popular BFGS quasi-Newton minimization algorithm under reasonable conditions converges globally on smooth convex functions. This result was proved by Powell in 1976: we consider its implications for functions that are not smooth. In particular, an analogous convergence result holds for functions, like the Euclidean norm, that are nonsmooth at the minimizer. Citation: Manuscript, School …