Fixed-Point Automatic Differentiation of Forward–Backward Splitting Algorithms for Partly Smooth Functions

A large class of non-smooth practical optimization problems can be written as minimization of a sum of smooth and partly smooth functions. We consider such structured problems, which also depend on a parameter vector, and study the problem of differentiating their solution mapping with respect to the parameter, which has far-reaching applications in sensitivity … Read more
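For context, here is a minimal sketch of the forward–backward (proximal gradient) splitting iteration the abstract refers to, applied to a smooth least-squares term plus an l1 term as a representative partly smooth function. The problem data, step size, and function names are illustrative assumptions; the paper's fixed-point differentiation of this iteration map is not shown.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1, a standard partly smooth regularizer.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def forward_backward(grad_f, x0, step, lam, n_iter=500):
    """Forward-backward splitting for min_x f(x) + lam*||x||_1.

    grad_f : callable returning the gradient of the smooth part f
    step   : step size, e.g. 1/L with L a Lipschitz constant of grad_f
    """
    x = x0.copy()
    for _ in range(n_iter):
        # Forward (gradient) step on f, then backward (proximal) step on the l1 term.
        x = soft_threshold(x - step * grad_f(x), step * lam)
    return x

# Illustrative instance: f(x) = 0.5*||A x - b||^2 with synthetic data.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((40, 100)), rng.standard_normal(40)
L = np.linalg.norm(A, 2) ** 2                                  # Lipschitz constant of grad f
x_hat = forward_backward(lambda x: A.T @ (A @ x - b), np.zeros(100), 1.0 / L, lam=0.1)
```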

Optimized convergence of stochastic gradient descent by weighted averaging

Under mild assumptions, stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as an approximate optimal solution. However, in the absence of stochastic noise, the arithmetic mean of all iterates converges to the optimal solution considerably more slowly than the iterates themselves. And also in the … Read more
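A minimal sketch of the idea, assuming a generic stochastic gradient oracle: run SGD and keep running averages of the iterates, comparing the last iterate, the arithmetic mean, and a (linearly) weighted average. The linear weighting, step-size schedule, and toy objective are assumptions for illustration, not the weighting derived in the paper.

```python
import numpy as np

def sgd_with_averaging(stoch_grad, x0, steps, n_iter=1000):
    """Run SGD and return the last iterate, the arithmetic mean of all
    iterates, and a linearly weighted average (weight k for iterate k)."""
    x = x0.copy()
    mean_avg = np.zeros_like(x0)
    weighted_avg = np.zeros_like(x0)
    weight_sum = 0.0
    for k in range(1, n_iter + 1):
        x = x - steps(k) * stoch_grad(x)
        mean_avg += (x - mean_avg) / k                          # running arithmetic mean
        weight_sum += k
        weighted_avg += (k / weight_sum) * (x - weighted_avg)   # running weighted mean
    return x, mean_avg, weighted_avg

# Toy strongly convex objective 0.5*||x||^2 with additive gradient noise.
rng = np.random.default_rng(1)
noisy_grad = lambda x: x + 0.01 * rng.standard_normal(x.shape)
last, mean, weighted = sgd_with_averaging(noisy_grad, np.ones(5), steps=lambda k: 1.0 / (k + 10))
```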

Convergence rate analysis of the gradient descent-ascent method for convex-concave saddle-point problems

In this paper, we study the gradient descent-ascent method for convex-concave saddle-point problems. We derive a new non-asymptotic global convergence rate in terms of distance to the solution set by using the semidefinite programming performance estimation method. The given convergence rate incorporates most parameters of the problem and it is exact for a large class … Read more
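For reference, a minimal sketch of the (simultaneous) gradient descent-ascent iteration on a convex-concave saddle function L(x, y): descend in x, ascend in y. The step size and the strongly convex-concave test problem are illustrative assumptions; the paper's performance-estimation analysis is not reproduced here.

```python
import numpy as np

def gradient_descent_ascent(grad_x, grad_y, x0, y0, step, n_iter=2000):
    """Simultaneous gradient descent-ascent on a convex-concave L(x, y)."""
    x, y = x0.copy(), y0.copy()
    for _ in range(n_iter):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - step * gx, y + step * gy    # descent step in x, ascent step in y
    return x, y

# Toy saddle-point problem L(x, y) = 0.5*||x||^2 + x.T @ y - 0.5*||y||^2,
# whose unique saddle point is (0, 0).
grad_x = lambda x, y: x + y
grad_y = lambda x, y: x - y
x_star, y_star = gradient_descent_ascent(grad_x, grad_y, np.ones(3), -np.ones(3), step=0.1)
```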

Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method for Empirical Risk Minimization

The Frank-Wolfe method has become increasingly useful in statistical and machine learning applications, due to the structure-inducing properties of the iterates, and especially in settings where linear minimization over the feasible set is more computationally efficient than projection. In the setting of Empirical Risk Minimization — one of the fundamental optimization problems in statistical and … Read more
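As background, a minimal sketch of the standard Frank-Wolfe iteration with an exact gradient and a linear minimization oracle over the l1 ball, which is cheaper than projection in this setting. The Taylor-approximated gradient variant proposed in the paper is not shown; the feasible set, step-size rule, and least-squares instance are illustrative assumptions.

```python
import numpy as np

def frank_wolfe_l1(grad_f, x0, radius, n_iter=200):
    """Frank-Wolfe over the l1 ball of given radius: the linear minimization
    oracle returns a signed, scaled coordinate vector (a vertex of the ball)."""
    x = x0.copy()
    for k in range(n_iter):
        g = grad_f(x)
        s = np.zeros_like(x)
        i = np.argmax(np.abs(g))
        s[i] = -radius * np.sign(g[i])         # argmin of <g, s> over the l1 ball
        gamma = 2.0 / (k + 2)                  # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x

# Least-squares objective over the l1 ball, an ERM-style instance.
rng = np.random.default_rng(2)
A, b = rng.standard_normal((30, 80)), rng.standard_normal(30)
x_fw = frank_wolfe_l1(lambda x: A.T @ (A @ x - b), np.zeros(80), radius=1.0)
```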

Finite convergence of the inexact proximal gradient method to sharp minima

Attractive properties of subgradient methods, such as robust stability and linear convergence, have been emphasized when they are used to solve nonsmooth optimization problems with sharp minima [12, 13]. In this letter we extend the robustness results to composite convex models and show that the basic proximal gradient algorithm in the presence of a … Read more
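A minimal sketch of the setting, assuming inexactness is modeled as a bounded additive gradient error: a proximal gradient loop on a composite convex model whose minimizer has exact zero coordinates, which the prox step identifies after finitely many iterations when the error is small. The noise model, problem data, and parameters are illustrative assumptions, not the analysis of the letter.

```python
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def inexact_prox_gradient(grad_f, x0, step, lam, noise, n_iter=300, seed=3):
    """Proximal gradient with a bounded gradient error (a simple model of inexactness)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(n_iter):
        g = grad_f(x) + noise * rng.uniform(-1.0, 1.0, size=x.shape)  # inexact gradient
        x = soft_threshold(x - step * g, step * lam)
    return x

# Composite model 0.5*||x - b||^2 + lam*||x||_1: the minimizer has exact zeros,
# and with small enough errors those coordinates are set exactly to zero
# after finitely many steps.
b = np.array([2.0, -1.5, 0.05, 0.0, -0.02])
x_end = inexact_prox_gradient(lambda x: x - b, np.zeros(5), step=1.0, lam=0.1, noise=0.01)
```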

On the first order optimization methods in Deep Image Prior

Deep learning methods achieve state-of-the-art performance in many image restoration tasks. Their effectiveness is mostly related to the size of the dataset used for training. Deep Image Prior (DIP) is an energy-function framework that eliminates the dependency on the training set by considering the structure of a neural network as a handcrafted prior … Read more
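A minimal sketch of the DIP loop the abstract describes: an untrained network is fitted to a single corrupted image with a first-order optimizer, and early stopping plays the role of the regularizer. The tiny convolutional network, image size, optimizer settings, and iteration budget are illustrative assumptions, not the architectures or methods studied in the paper.

```python
import torch
import torch.nn as nn

# Untrained network whose structure acts as the handcrafted prior.
net = nn.Sequential(
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 1, 3, padding=1),
)
z = torch.randn(1, 32, 64, 64)                 # fixed random input code
x_noisy = torch.rand(1, 1, 64, 64)             # corrupted image to restore
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(500):                        # stop early, before the noise is fitted
    opt.zero_grad()
    loss = ((net(z) - x_noisy) ** 2).mean()    # data-fidelity energy of DIP
    loss.backward()
    opt.step()

x_restored = net(z).detach()
```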

On Optimal Universal First-Order Methods for Minimizing Heterogeneous Sums

This work considers minimizing a sum of convex functions, each with potentially different structure ranging from nonsmooth to smooth, Lipschitz to non-Lipschitz. Nesterov’s universal fast gradient method provides an optimal black-box first-order method for minimizing a single function that takes advantage of any continuity structure present without requiring prior knowledge. In this paper, we show … Read more
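A minimal sketch of the universal idea, assuming the basic (non-accelerated) universal gradient step rather than the fast variant named in the abstract: a local smoothness estimate is doubled until an eps-relaxed descent condition holds, so no Lipschitz or Hölder constant needs to be known. The backtracking rule, target accuracy, and nonsmooth example are illustrative assumptions.

```python
import numpy as np

def universal_gradient_method(f, grad_f, x0, eps, L0=1.0, n_iter=200):
    """Universal (non-accelerated) gradient method with backtracking on the
    inexact descent condition; adapts to the continuity structure of f."""
    x, L = x0.copy(), L0
    for _ in range(n_iter):
        g = grad_f(x)
        while True:
            x_new = x - g / L
            # Descent condition relaxed by eps/2, valid for Holder-continuous (sub)gradients.
            if f(x_new) <= f(x) + g @ (x_new - x) + 0.5 * L * np.sum((x_new - x) ** 2) + 0.5 * eps:
                break
            L *= 2.0
        x, L = x_new, L / 2.0                  # be optimistic again at the next iteration
    return x

# Nonsmooth example: f(x) = ||x||_1 accessed through a subgradient, target accuracy eps.
f = lambda x: np.sum(np.abs(x))
subgrad = lambda x: np.sign(x)
x_ug = universal_gradient_method(f, subgrad, np.ones(4), eps=1e-2)
```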

Weighted Geometric Mean, Minimum Mediated Set, and Optimal Second-Order Cone Representation

We study optimal second-order cone representations for weighted geometric means, which turn out to be closely related to minimum mediated sets. Several lower bounds and upper bounds on the size of optimal second-order cone representations are proved. In the case of bivariate weighted geometric means (equivalently, one-dimensional mediated sets), we are able to prove … Read more
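As a worked example of the basic building block, the hypograph of the unweighted bivariate geometric mean admits a single rotated second-order cone representation; general weights are handled by composing such cones, and the number of cones needed is what ties the representation size to mediated sets. This is standard background, not the paper's bounds.

\[
\bigl\{(x_1,x_2,t)\in\mathbb{R}_{\ge 0}^2\times\mathbb{R}_{\ge 0} \;:\; t \le \sqrt{x_1 x_2}\bigr\}
\;=\;
\bigl\{(x_1,x_2,t) \;:\; \left\| (2t,\; x_1 - x_2) \right\|_2 \le x_1 + x_2,\ t \ge 0\bigr\},
\]

since squaring the cone constraint gives \(4t^2 + (x_1 - x_2)^2 \le (x_1 + x_2)^2\), i.e. \(t^2 \le x_1 x_2\).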

The Hyperbolic Augmented Lagrangian Algorithm

The hyperbolic augmented Lagrangian algorithm (HALA) is introduced in the area of continuous optimization for solving nonlinear programming problems. Under mild assumptions, such as convexity, Slater's constraint qualification, and differentiability, the convergence of the proposed algorithm is proved. We also study the duality theory for the case of the hyperbolic augmented Lagrangian function. Finally, in order … Read more
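For orientation only, a minimal sketch of the generic augmented Lagrangian template that HALA specializes, written here with the classical quadratic (PHR) penalty because the hyperbolic Lagrangian function itself is not reproduced from the paper. The inner solver, parameters, and toy problem are illustrative assumptions.

```python
import numpy as np

def augmented_lagrangian(f, grad_f, g, jac_g, x0, mu0, rho=1.0, outer=30, inner=200, lr=0.05):
    """Classical (quadratic) augmented Lagrangian loop for min f(x) s.t. g(x) <= 0.
    Shown only as the generic template; this is not the hyperbolic variant."""
    x, mu = x0.copy(), mu0.copy()
    for _ in range(outer):
        for _ in range(inner):                                # inner gradient steps on the AL subproblem
            slack = np.maximum(0.0, mu + rho * g(x))
            grad = grad_f(x) + jac_g(x).T @ slack
            x = x - lr * grad
        mu = np.maximum(0.0, mu + rho * g(x))                 # multiplier update
    return x, mu

# Toy problem: minimize (x1 - 2)^2 + (x2 - 2)^2 subject to x1 + x2 <= 2.
f = lambda x: np.sum((x - 2.0) ** 2)
grad_f = lambda x: 2.0 * (x - 2.0)
g = lambda x: np.array([x[0] + x[1] - 2.0])
jac_g = lambda x: np.array([[1.0, 1.0]])
x_opt, mu_opt = augmented_lagrangian(f, grad_f, g, jac_g, np.zeros(2), np.zeros(1))
```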