Global non-asymptotic super-linear convergence rates of regularized proximal quasi-Newton methods on non-smooth composite problems

In this paper, we propose two regularized proximal quasi-Newton methods with symmetric rank-1 update of the metric (SR1 quasi-Newton) to solve non-smooth convex additive composite problems. Both algorithms avoid using line-search or trust-region strategies. For each of them, we prove a super-linear convergence rate that is independent of the initialization of …
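
As a rough illustration of the kind of scheme described above (in our own generic notation; the regularization and update details in the paper may differ), a regularized proximal SR1 step for \(\min_x f(x)+g(x)\) with smooth \(f\) reads
\[
x_{k+1} = \operatorname{prox}^{M_k}_{g}\bigl(x_k - M_k^{-1}\nabla f(x_k)\bigr), \qquad M_k = B_k + \mu_k \mathrm{Id},
\]
where \(\operatorname{prox}^{M}_{g}(z) = \operatorname{arg\,min}_{x}\, g(x) + \tfrac12\|x-z\|_M^2\), \(\mu_k > 0\) is the regularization parameter that replaces line search or a trust region, and \(B_k\) is maintained by the symmetric rank-1 update
\[
B_{k+1} = B_k + \frac{(y_k - B_k s_k)(y_k - B_k s_k)^{\top}}{\langle y_k - B_k s_k,\, s_k\rangle}, \qquad s_k = x_{k+1} - x_k,\quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k).
\]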

Learning-to-Optimize with PAC-Bayesian Guarantees: Theoretical Considerations and Practical Implementation

We apply PAC-Bayesian theory to the setting of learning-to-optimize. To the best of our knowledge, we present the first framework to learn optimization algorithms with provable generalization guarantees (PAC-Bayesian bounds) and an explicit trade-off between convergence guarantees and convergence speed, which contrasts with the typical worst-case analysis. Our learned optimization algorithms provably outperform related ones …
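
For orientation, a classical McAllester-type PAC-Bayesian bound (stated here for a loss in \([0,1]\); the bounds derived in the paper are tailored to learning-to-optimize and may take a different form) guarantees that, with probability at least \(1-\varepsilon\) over a sample of size \(n\), simultaneously for all posteriors \(Q\) over algorithm parameters \(\alpha\),
\[
\mathbb{E}_{\alpha\sim Q}\bigl[\mathcal{R}(\alpha)\bigr] \;\le\; \mathbb{E}_{\alpha\sim Q}\bigl[\widehat{\mathcal{R}}_n(\alpha)\bigr] + \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{n}}{\varepsilon}}{2n}},
\]
where \(P\) is a data-independent prior and \(\mathcal{R}\), \(\widehat{\mathcal{R}}_n\) denote the population and empirical risks (here, e.g., a measure of convergence speed of the learned optimizer).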

The stochastic Ravine accelerated gradient method with general extrapolation coefficients

In a real Hilbert space setting, we study the convergence properties of the stochastic Ravine accelerated gradient method for convex differentiable optimization. We consider the general form of this algorithm, where the extrapolation coefficients can vary with each iteration and the evaluation of the gradient is subject to random errors. This general …
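
Schematically (up to indexing conventions, and with symbols of our choosing), a Ravine-type iteration with general extrapolation coefficients \(\alpha_k\), step size \(s>0\) and noisy gradients can be written as
\[
z_k = y_k - s\bigl(\nabla f(y_k) + e_k\bigr), \qquad y_{k+1} = z_k + \alpha_k\,(z_k - z_{k-1}),
\]
where \(e_k\) models the random error in the gradient evaluation; the classical Nesterov-type regime corresponds to choices such as \(\alpha_k = \frac{k-1}{k+\alpha-1}\) with \(\alpha \ge 3\).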

Accelerated Gradient Dynamics on Riemannian Manifolds: Faster Rate and Trajectory Convergence

In order to minimize a differentiable geodesically convex function, we study a second-order dynamical system on Riemannian manifolds with an asymptotically vanishing damping term of the form \(\alpha/t\). For positive values of \(\alpha\), convergence rates for the objective values and convergence of the trajectory are derived. We emphasize the crucial role of the curvature of the …
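
In our notation, a natural way to write such a system (consistent with the description above, though the precise formulation in the paper may differ) is
\[
\frac{D}{\mathrm{d}t}\,\dot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + \operatorname{grad} f\bigl(x(t)\bigr) = 0,
\]
where \(\frac{D}{\mathrm{d}t}\) denotes the covariant derivative along the curve \(x\) and \(\operatorname{grad} f\) the Riemannian gradient; in the Euclidean case this reduces to the ODE \(\ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + \nabla f(x(t)) = 0\) underlying Nesterov-type acceleration.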

A Stochastic Bregman Primal-Dual Splitting Algorithm for Composite Optimization

We study a stochastic first-order primal-dual method for solving convex-concave saddle point problems over real reflexive Banach spaces, using Bregman divergences and relative smoothness assumptions and allowing for stochastic errors in the computation of the gradient terms within the algorithm. We show ergodic convergence in expectation of the Lagrangian optimality gap with a …
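
As a hedged sketch in our own notation (not necessarily the exact update analyzed in the paper), a Bregman primal-dual iteration of Chambolle–Pock type for the saddle point problem \(\min_x \max_y \; g(x) + \langle Kx, y\rangle - f^{*}(y)\) reads
\[
x_{k+1} = \operatorname{arg\,min}_{x}\; g(x) + \langle K x, y_k\rangle + \tfrac{1}{\tau}\, D_{\varphi}(x, x_k),
\]
\[
y_{k+1} = \operatorname{arg\,min}_{y}\; f^{*}(y) - \langle K(2x_{k+1} - x_k), y\rangle + \tfrac{1}{\sigma}\, D_{\psi}(y, y_k),
\]
where \(D_{\varphi}\), \(D_{\psi}\) are Bregman divergences adapted to the relative smoothness assumptions, and in the stochastic setting the terms involving \(K\) are replaced by noisy estimates.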

Inexact and Stochastic Generalized Conditional Gradient with Augmented Lagrangian and Proximal Step

In this paper we propose and analyze inexact and stochastic versions of the CGALP algorithm developed in the authors’ previous paper, which we denote ICGALP, allowing for errors in the computation of several important quantities. In particular, this allows one to compute some gradients, proximal terms, and/or linear minimization oracles in an inexact fashion …
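
To fix ideas (an illustration in our notation; the precise error conditions in the paper may be stated differently), "inexact" here means, for instance, that the gradient oracle returns \(\tilde{\nabla}_k = \nabla f(x_k) + \varepsilon_k\), with the errors required to vanish fast enough, e.g. through summability-type conditions such as
\[
\sum_{k\in\mathbb{N}} \gamma_k\, \mathbb{E}\bigl[\|\varepsilon_k\|\bigr] < +\infty,
\]
where \(\gamma_k\) are the step sizes; analogous conditions can be imposed on the errors in the proximal and linear minimization oracles.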

Generalized Conditional Gradient with Augmented Lagrangian for Composite Minimization

In this paper we propose a splitting scheme, which we call the CGALP algorithm, that hybridizes the generalized conditional gradient with a proximal step for minimizing the sum of three proper, convex, and lower semicontinuous functions in real Hilbert spaces. The minimization is subject to an affine constraint, which in particular allows one to deal with composite problems (sum …
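
In our notation, the problem class described above can be written as
\[
\min_{x\in\mathcal{H}}\; f(x) + g(x) + h(x) \qquad \text{subject to} \qquad Ax = b,
\]
with \(f, g, h\) proper, convex and lower semicontinuous on a real Hilbert space \(\mathcal{H}\) and \(A\) a bounded linear operator; schematically, the affine constraint is handled through an augmented Lagrangian, while the non-smooth terms are treated by proximal and linear minimization (conditional gradient) steps, as the name of the algorithm suggests.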

On Quasi-Newton Forward–Backward Splitting: Proximal Calculus and Convergence

We introduce a framework for quasi-Newton forward–backward splitting algorithms (proximal quasi-Newton methods) with a metric induced by diagonal ± rank-\(r\) symmetric positive definite matrices. This special type of metric allows for a highly efficient evaluation of the proximal mapping. The key to this efficiency is a general proximal calculus in the new metric. By using …
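
Concretely, the proximal mapping in question is taken in the metric \(V\), i.e.
\[
\operatorname{prox}^{V}_{g}(z) = \operatorname{arg\,min}_{x}\; g(x) + \tfrac12\,\|x - z\|_{V}^{2}, \qquad V = D \pm Q Q^{\top},
\]
with \(D\) diagonal positive definite and \(Q\) of rank \(r\); roughly speaking, the proximal calculus alluded to reduces this computation to proximal mappings in the diagonal metric \(D\) combined with a low-dimensional (\(r\)-variable) root-finding problem (notation ours).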

Convergence rates of Forward-Douglas-Rachford splitting method

Over the past few years, operator splitting methods have become ubiquitous in non-smooth optimization owing to their simplicity and efficiency. In this paper, we consider the Forward–Douglas–Rachford splitting method (FDR) [10, 40] and study both the global and local convergence rates of this method. For the global rate, we establish an \(o(1/k)\) convergence rate in terms of …
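
As a representative member of this family of schemes (a Davis–Yin-type three-operator iteration, written in our notation and not necessarily the exact FDR variant of [10, 40]), for \(\min_x f(x) + g(x) + h(x)\) with \(h\) smooth one iterates
\[
x_k = \operatorname{prox}_{\gamma f}(z_k), \qquad
y_k = \operatorname{prox}_{\gamma g}\bigl(2x_k - z_k - \gamma\,\nabla h(x_k)\bigr), \qquad
z_{k+1} = z_k + \lambda_k\,(y_k - x_k),
\]
with step size \(\gamma > 0\) and relaxation parameters \(\lambda_k\).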

Non-smooth Non-convex Bregman Minimization: Unification and new Algorithms

We propose a unifying algorithm for non-smooth non-convex optimization. The algorithm approximates the objective function by a convex model function and finds an approximate (Bregman) proximal point of the convex model. This approximate minimizer of the model function yields a descent direction, along which the next iterate is found. Complemented with an Armijo-like line search …
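
Schematically, and in our own notation, each step of such a scheme solves a convex model subproblem of the form
\[
x_{k+1} \in \operatorname{arg\,min}_{x}\; f_{x_k}(x) + \tfrac{1}{\tau_k}\, D_{\varphi}(x, x_k),
\]
where \(f_{x_k}\) is a convex model of the objective around the current iterate \(x_k\), \(D_{\varphi}\) is the Bregman distance generated by a Legendre function \(\varphi\), and \(\tau_k > 0\); the direction \(x_{k+1} - x_k\) is then combined with the Armijo-like line search mentioned above.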