Stochastic trust-region and direct-search methods: A weak tail bound condition and reduced sample sizing

Using tail bounds, we introduce a new probabilistic condition for function estimation in stochastic derivative-free optimization which leads to a reduction in the number of samples and eases algorithmic analyses. Moreover, we develop simple stochastic direct-search and trust-region methods for the optimization of a potentially non-smooth function whose values can only be estimated via stochastic … Read more

Retraction based Direct Search Methods for Derivative Free Riemannian Optimization

Direct search methods represent a robust and reliable class of algorithms for solving black-box optimization problems. In this paper, we explore the application of those strategies to Riemannian optimization, wherein minimization is to be performed with respect to variables restricted to lie on a manifold. More specifically, we consider classic and line search extrapolated variants … Read more

A harmonic framework for stepsize selection in gradient methods

We study the use of inverse harmonic Rayleigh quotients with target for the stepsize selection in gradient methods for nonlinear unconstrained optimization problems. This provides not only an elegant and flexible framework to parametrize and reinterpret existing stepsize schemes, but also gives inspiration for new flexible and tunable families of steplengths. In particular, we analyze … Read more

Global convergence and acceleration of projection methods for feasibility problems involving union convex sets

We prove global convergence of classical projection algorithms for feasibility problems involving union convex sets, which refer to sets expressible as the union of a finite number of closed convex sets. We present a unified strategy for analyzing global convergence by means of studying fixed-point iterations of a set-valued operator that is the union of … Read more

Global Convergence of Sub-gradient Method for Robust Matrix Recovery: Small Initialization, Noisy Measurements, and Over-parameterization

In this work, we study the performance of sub-gradient method (SubGM) on a natural nonconvex and nonsmooth formulation of low-rank matrix recovery with $\ell_1$-loss, where the goal is to recover a low-rank matrix from a limited number of measurements, a subset of which may be grossly corrupted with noise. We study a scenario where the … Read more

New Penalized Stochastic Gradient Methods for Linearly Constrained Strongly Convex Optimization

For minimizing a strongly convex objective function subject to linear inequality constraints, we consider a penalty approach that allows one to utilize stochastic methods for problems with a large number of constraints and/or objective function terms. We provide upper bounds on the distance between the solutions to the original constrained problem and the penalty reformulations, … Read more

Efficiently Escaping Saddle Points in Bilevel Optimization

Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization focus on finding the first-order stationary points for nonconvex-strongly-convex cases. In this paper, we analyze algorithms that can escape saddle points in nonconvex-strongly-convex bilevel optimization. Specifically, we show that the perturbed approximate implicit differentiation (AID) with … Read more

An Adaptive Trust-Region Method Without Function Evaluations

In this paper we propose an adaptive trust-region method for smooth unconstrained optimization. The update rule for the trust-region radius relies only on gradient evaluations. Assuming that the gradient of the objective function is Lipschitz continuous, we establish worst-case complexity bounds for the number of gradient evaluations required by the proposed method to generate approximate … Read more

On a Computationally Ill-Behaved Bilevel Problem with a Continuous and Nonconvex Lower Level

It is well known that bilevel optimization problems are hard to solve both in theory and practice. In this paper, we highlight a further computational difficulty when it comes to solving bilevel problems with continuous but nonconvex lower levels. Even if the lower-level problem is solved to ɛ-feasibility regarding its nonlinear constraints for an arbitrarily … Read more