Recognizing weighted means in geodesic spaces

Geodesic metric spaces support a variety of averaging constructions for given finite sets. Computing such averages has generated extensive interest in diverse disciplines. Here we consider the inverse problem of recognizing computationally whether or not a given point is such an average, exactly or approximately. In nonpositively curved spaces, several averaging notions, including the usual …
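
As a point of orientation, in Euclidean space the weighted mean of points $a_1, \dots, a_m$ with weights $w_i$ minimizes $\sum_i w_i \|x - a_i\|^2$, so exact recognition reduces to checking a first-order condition. The sketch below (plain Python with NumPy; the function name and tolerance are illustrative, not from the paper) tests that condition:

```python
import numpy as np

def is_weighted_mean(x, points, weights, tol=1e-9):
    """Check whether x is the weighted mean of `points` in Euclidean space.

    In R^n the weighted mean minimizes sum_i w_i * ||x - a_i||^2, so it is
    characterized exactly by the first-order condition sum_i w_i (x - a_i) = 0.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    residual = weights @ (x - points)   # sum_i w_i (x - a_i)
    return np.linalg.norm(residual) <= tol

# Example: the midpoint of two points is their equal-weight mean.
a = [[0.0, 0.0], [2.0, 2.0]]
print(is_weighted_mean(np.array([1.0, 1.0]), a, [0.5, 0.5]))   # True
```

In a general nonpositively curved space no such linear condition is available, which is what makes the recognition problem interesting.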

Lipschitz minimization and the Goldstein modulus

Goldstein’s 1977 idealized iteration for minimizing a Lipschitz objective fixes a distance (the step size) and relies on a certain approximate subgradient. That “Goldstein subgradient” is the shortest convex combination of objective gradients at points within that distance of the current iterate. A recent implementable Goldstein-style algorithm allows a remarkable complexity analysis (Zhang …
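
The construction is concrete enough to sketch. Below is one idealized step, with the ball approximated by random sampling; the sampling scheme, the SLSQP solver for the small simplex QP, and all names are illustrative assumptions, not the Zhang et al. method:

```python
import numpy as np
from scipy.optimize import minimize

def goldstein_step(f_grad, x, delta, n_samples=20, tol=1e-3, rng=None):
    """One Goldstein-style step, the idealized ball approximated by sampling.

    Gathers gradients at points within distance delta of x, finds the
    minimal-norm convex combination (a small QP over the simplex), and steps
    distance delta against it, unless that norm certifies near-stationarity.
    """
    rng = np.random.default_rng(rng)
    dirs = rng.normal(size=(n_samples, x.size))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    radii = delta * rng.uniform(size=(n_samples, 1)) ** (1.0 / x.size)
    pts = np.vstack([x, x + radii * dirs])        # samples in the delta-ball
    G = np.array([f_grad(p) for p in pts])

    m = len(G)                                    # min ||G^T lam|| over simplex
    res = minimize(
        lambda lam: np.sum((lam @ G) ** 2),
        np.full(m, 1.0 / m),
        jac=lambda lam: 2.0 * G @ (lam @ G),
        bounds=[(0.0, 1.0)] * m,
        constraints=[{"type": "eq", "fun": lambda lam: lam.sum() - 1.0}],
        method="SLSQP",
    )
    g = res.x @ G                                 # approximate Goldstein subgradient
    norm = np.linalg.norm(g)
    return x if norm <= tol else x - delta * g / norm

# Example: f(x) = ||x||_1, whose gradient off the axes is sign(x).
x = np.array([1.0, -0.5])
for _ in range(10):
    x = goldstein_step(np.sign, x, delta=0.4, rng=0)
print(x)   # ends within roughly delta of the minimizer at the origin
```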

Convex optimization on CAT(0) cubical complexes

We consider geodesically convex optimization problems involving distances to a finite set of points A in a CAT(0) cubical complex. Examples include the minimum enclosing ball problem, the weighted mean and median problems, and the feasibility and projection problems for intersecting balls with centers in A. We propose a decomposition approach relying on standard Euclidean …
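
For a feel of the Euclidean building blocks such a decomposition relies on, here is a crude subgradient-method sketch (illustrative only, not the paper's decomposition) for the first example, the minimum enclosing ball of a finite set $A$ in $\mathbb{R}^n$:

```python
import numpy as np

def min_enclosing_ball(A, iters=2000):
    """Minimum enclosing ball in Euclidean space by the subgradient method.

    Minimizes f(x) = max_i ||x - a_i||; a subgradient at x is the unit
    vector pointing from the farthest point of A toward x.
    """
    A = np.asarray(A, dtype=float)
    x = A.mean(axis=0)
    for k in range(1, iters + 1):
        d = np.linalg.norm(A - x, axis=1)
        i = d.argmax()
        x = x - (x - A[i]) / (d[i] * k)   # step size 1/k along the subgradient
    return x, np.linalg.norm(A - x, axis=1).max()

center, radius = min_enclosing_ball([[0, 0], [2, 0], [1, 2]])
print(center, radius)   # circumcenter (1, 0.75) and radius 1.25, approximately
```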

Horoballs and the subgradient method

To explore convex optimization on Hadamard spaces, we consider an iteration in the style of a subgradient algorithm. Traditionally, such methods assume that the underlying spaces are manifolds and that the objectives are geodesically convex: the methods are described using tangent spaces and exponential maps. By contrast, our iteration applies in a general Hadamard space, …
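
For contrast, the traditional manifold formulation the abstract alludes to writes each step through the exponential map at the current iterate; a standard statement of that iteration (not the horoball construction of this paper) is

\[ x_{k+1} \;=\; \exp_{x_k}\!\big(-t_k\, g_k\big), \qquad g_k \in \partial f(x_k) \subset T_{x_k} M, \]

where the $t_k > 0$ are step sizes, $\partial f$ is the Riemannian subdifferential, and $T_{x_k} M$ is the tangent space. A general Hadamard space offers none of these ingredients, which is exactly what the paper's iteration avoids.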

The complexity of first-order optimization methods from a metric perspective

A central tool for understanding first-order optimization algorithms is the Kurdyka-Łojasiewicz inequality. Standard approaches to such methods rely crucially on this inequality to leverage sufficient decrease conditions involving gradients or subgradients. However, the KL property fundamentally concerns not subgradients but rather “slope”, a purely metric notion. By highlighting this view, and avoiding any use of …
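
The slope in question is the metric descent rate of De Giorgi. A common formulation, stated here for orientation rather than quoted from the paper, is

\[ |\nabla f|(x) \;=\; \limsup_{y \to x,\; y \neq x} \frac{\big(f(x) - f(y)\big)^{+}}{d(x,y)}, \]

and the KL property at a point $\bar{x}$ can then be phrased as the existence of a desingularizing function $\varphi$ with $|\nabla(\varphi \circ (f - f(\bar{x})))|(x) \geq 1$ for all nearby $x$ with values slightly above $f(\bar{x})$. Nothing in either formula refers to subgradients, which is the point.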

The cost of nonconvexity in deterministic nonsmooth optimization

We study the impact of nonconvexity on the complexity of nonsmooth optimization, emphasizing objectives such as piecewise linear functions, which may not be weakly convex. We focus on a dimension-independent analysis, slightly modifying a black-box algorithm of Zhang et al. that approximates an $\epsilon$-stationary point of any directionally differentiable Lipschitz objective using $O(\epsilon^{-4})$ calls …
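
The stationarity target in this line of work is usually Goldstein's: a point $x$ is called $(\delta,\epsilon)$-stationary when some convex combination of Clarke subgradients taken within distance $\delta$ of $x$ has norm at most $\epsilon$. In symbols (a standard definition in this literature, not quoted from the paper),

\[ \mathrm{dist}\Big(0,\; \mathrm{conv}\big\{\partial f(y) : \|y - x\| \leq \delta\big\}\Big) \;\leq\; \epsilon. \]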

Identifiability, the KL property in metric spaces, and subgradient curves

Identifiability, and the closely related idea of partial smoothness, unify classical active set methods and more general notions of solution structure. Diverse optimization algorithms generate iterates in discrete time that are eventually confined to identifiable sets. We present two fresh perspectives on identifiability. The first distills the notion to a simple metric property, applicable not …
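
A textbook finite-dimensional instance: proximal gradient (ISTA) on the lasso identifies the solution's support, an identifiable set, after finitely many iterations. A minimal sketch (the random data and all parameter choices are illustrative):

```python
import numpy as np

# Identifiability in action: ISTA on the lasso. After finitely many
# iterations the support of x_k (the sparsity pattern, an identifiable
# set) stops changing, even though the iterates themselves keep moving.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 8))
b = rng.normal(size=20)
lam, t = 2.0, 1.0 / np.linalg.norm(A, 2) ** 2   # step 1/L with L = ||A||^2

x = np.zeros(8)
for k in range(200):
    z = x - t * A.T @ (A @ x - b)                          # gradient step
    x = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)  # soft-threshold
    if k % 50 == 0:
        print(k, np.flatnonzero(np.abs(x) > 0))   # support soon stabilizes
```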

Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization

For strongly convex objectives that are smooth, the classical theory of gradient descent ensures linear convergence relative to the number of gradient evaluations. An analogous nonsmooth theory is challenging. Even when the objective is smooth at every iterate, the corresponding local models are unstable and the number of cutting planes invoked by traditional remedies is …
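
The smooth baseline in the first sentence is easy to see numerically: on an $L$-smooth, strongly convex quadratic, gradient descent with step $1/L$ shrinks the objective gap by a fixed factor per gradient evaluation (the quadratic and constants below are illustrative):

```python
import numpy as np

# Classical baseline: on a smooth, strongly convex quadratic, gradient
# descent with step 1/L converges linearly in gradient evaluations.
H = np.diag([1.0, 10.0])          # mu = 1, L = 10
f = lambda x: 0.5 * x @ H @ x
x = np.array([1.0, 1.0])
for k in range(50):
    x = x - (1.0 / 10.0) * (H @ x)
    if k % 10 == 0:
        print(k, f(x))   # gap shrinks geometrically
```

It is exactly this geometric decay that breaks down, absent remedies like cutting planes, once the objective is nonsmooth.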

The structure of conservative gradient fields

The classical Clarke subdifferential alone is inadequate for understanding automatic differentiation in nonsmooth contexts. Instead, we can sometimes rely on enlarged generalized gradients called “conservative fields”, defined through the natural path-wise chain rule: one application is the convergence analysis of gradient-based deep learning algorithms. In the semi-algebraic case, we show that all conservative fields are …
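
The opening claim can be seen in a few lines of autodiff-style arithmetic. The function $f(x) = \mathrm{relu}(x) - \mathrm{relu}(-x)$ is just the identity, so its Clarke subdifferential at $0$ is $\{1\}$; yet the usual convention $\mathrm{relu}'(0) = 0$, pushed through the chain rule as automatic differentiation would do, outputs $0$ there, a value explained by a conservative field rather than the Clarke construction. A hand-rolled sketch (the convention and names are illustrative):

```python
# Conservative fields vs. the Clarke subdifferential, in miniature.
relu = lambda x: max(x, 0.0)
relu_prime = lambda x: 1.0 if x > 0 else 0.0   # common autodiff convention

def f(x):
    return relu(x) - relu(-x)          # equals x everywhere

def f_prime_autodiff(x):
    # The chain rule applied blindly, as reverse-mode autodiff would.
    return relu_prime(x) * 1.0 - relu_prime(-x) * (-1.0)

print(f(0.0), f_prime_autodiff(0.0))   # 0.0 0.0  (true derivative is 1)
print(f(1.0), f_prime_autodiff(1.0))   # 1.0 1.0  (agree away from the kink)
```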

Disk matrices and the proximal mapping for the numerical radius

Optimal matrices for problems involving the matrix numerical radius often have fields of values that are disks, a phenomenon associated with partial smoothness. Such matrices are highly structured: we experiment in particular with the proximal mapping for the radius, which often maps n-by-n random matrix inputs into a particular manifold of disk matrices that has …
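
For reference, the numerical radius itself can be computed from the classical characterization $r(X) = \max_{\theta} \lambda_{\max}\big(\tfrac{1}{2}(e^{i\theta} X + e^{-i\theta} X^*)\big)$. The grid-based sketch below (grid resolution and names are illustrative) checks it on the canonical disk-matrix example, a nilpotent Jordan block, whose field of values is a disk of radius $\cos(\pi/(n+1))$:

```python
import numpy as np

def numerical_radius(X, n_grid=720):
    """Numerical radius r(X): the largest modulus over the field of values.

    Uses the classical characterization as the maximum over theta of the
    top eigenvalue of the Hermitian part of exp(i*theta) * X, computed
    here over a crude grid of angles.
    """
    r = 0.0
    for theta in np.linspace(0.0, 2 * np.pi, n_grid, endpoint=False):
        H = 0.5 * (np.exp(1j * theta) * X + np.exp(-1j * theta) * X.conj().T)
        r = max(r, np.linalg.eigvalsh(H)[-1])
    return r

J = np.diag(np.ones(2), k=1)   # 3x3 nilpotent Jordan block
print(numerical_radius(J), np.cos(np.pi / 4))   # both about 0.7071
```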