Riemannian trust-region methods for strict saddle functions with complexity guarantees

The difficulty of minimizing a nonconvex function is in part explained by the presence of saddle points. This slows down optimization algorithms and impacts worst-case complexity guarantees. However, many nonconvex problems of interest possess a favorable structure for optimization, in the sense that saddle points can be escaped efficiently by appropriate algorithms. This strict saddle … Read more

Riemannian Bilevel Optimization

In this work, we consider the bilevel optimization problem on Riemannian manifolds. We inspect the calculation of the hypergradient of such problems on general manifolds and thus enable the utilization of gradient-based algorithms to solve such problems. The calculation of the hypergradient requires utilizing the notion of Riemannian cross-derivative and we inspect the properties and … Read more

A worst-case complexity analysis for Riemannian non-monotone line-search methods

In this paper we deal with non-monotone line-search methods to minimize a smooth cost function on a Riemannian manifold. In particular, we study the number of iterations necessary for this class of algorithms to obtain e-approximated stationary points. Specifically, we prove that under a regularity Lipschitz-type condition on the pullbacks of the cost function to … Read more

Accelerated gradient methods on the Grassmann and Stiefel manifolds

In this paper we extend a nonconvex version of Nesterov’s accelerated gradient (AG) method to optimization over the Grassmann and Stiefel manifolds. We propose an exponential-based AG algorithm for the Grassmann manifold and a retraction-based AG algorithm that exploits the Cayley transform for both of the Grassmann and Stiefel manifolds. Under some mild assumptions, we … Read more

Federated Learning on Riemannian Manifolds

Federated learning (FL) has found many important applications in smart-phone-APP based machine learning applications. Although many algorithms have been studied for FL, to the best of our knowledge, algorithms for FL with nonconvex constraints have not been studied. This paper studies FL over Riemannian manifolds, which finds important applications such as federated PCA and federated … Read more

An Adaptive Riemannian Gradient Method Without Function Evaluations

In this paper we propose an adaptive gradient method for optimization on Riemannian manifolds. The update rule for the stepsizes relies only on gradient evaluations. Assuming that the objective function is bounded from below and that its gradient field is Lipschitz continuous, we establish worst-case complexity bounds for the number of gradient evaluations that the … Read more

Dissolving Constraints for Riemannian Optimization

In this paper, we consider optimization problems over closed embedded submanifolds of $\mathbb{R}^n$, which are defined by the constraints $c(x) = 0$. We propose a class of constraint dissolving approaches for these Riemannian optimization problems. In these proposed approaches, solving a Riemannian optimization problem is transferred into the unconstrained minimization of a constraint dissolving function … Read more

Solving Optimization Problems over the Stiefel Manifold by Smooth Exact Penalty Function

In this paper, we present a novel penalty model called ExPen for optimization over the Stiefel manifold. Different from existing penalty functions for orthogonality constraints, ExPen adopts a smooth penalty function without using any first-order derivative of the objective function. We show that all the first-order stationary points of ExPen with a sufficiently large penalty … Read more

A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance

The Wasserstein distance has become increasingly important in machine learning and deep learning. Despite its popularity, the Wasserstein distance is hard to approximate because of the curse of dimensionality. A recently proposed approach to alleviate the curse of dimensionality is to project the sampled data from the high dimensional probability distribution onto a lower-dimensional subspace, … Read more

Projection Robust Wasserstein Barycenters

Collecting and aggregating information from several probability measures or histograms is a fundamental task in machine learning. One of the popular solution methods for this task is to compute the barycenter of the probability measures under the Wasserstein metric. However, approximating the Wasserstein barycenter is numerically challenging because of the curse of dimensionality. This paper … Read more