Scaled Proximal Gradient Methods for Multiobjective Optimization: Improved Linear Convergence and Nesterov’s Acceleration

Over the past two decades, descent methods have received substantial attention within the multiobjective optimization field. Nonetheless, both theoretical analyses and empirical evidence reveal that existing first-order methods for multiobjective optimization converge slowly, even for well-conditioned problems, due to the objective imbalances. To address this limitation, we incorporate curvature information to scale each objective within … Read more

Lipschitz minimization and the Goldstein modulus

Goldstein’s 1977 idealized iteration for minimizing a Lipschitz objective fixes a distance – the step size – and relies on a certain approximate subgradient. That “Goldstein subgradient” is the shortest convex combination of objective gradients at points within that distance of the current iterate. A recent implementable Goldstein-style algorithm allows a remarkable complexity analysis (Zhang … Read more

Doubly stochastic primal dual splitting algorithm with variance reduction for saddle point problems

The structured saddle-point problem involving the infimal convolution in real Hilbert spaces finds applicability in many applied mathematics disciplines. For this purpose, we develop a stochastic primal-dual splitting algorithm with loopless variance-reduction for solving this generic problem. We first prove the weak almost sure convergence of the iterates. We then demonstrate that our algorithm achieves … Read more

A primal-dual majorization-minimization method for large-scale linear programs

We present a primal-dual majorization-minimization method for solving large-scale linear programs. A smooth barrier augmented Lagrangian (SBAL) function with strict convexity for the dual linear program is derived. The majorization-minimization approach is naturally introduced to develop the smoothness and convexity of the SBAL function. Our method only depends on a factorization of the constant matrix … Read more

Error Bound and Isocost Imply Linear Convergence of DCA-based Algorithms to D-stationarity

We consider a class of structured nonsmooth difference-of-convex minimization, which can be written as the difference of two convex functions possibly nonsmooth with the second one in the format of the maximum of a finite convex smooth functions. We propose two extrapolation proximal difference-of-convex based algorithms for potential acceleration to converge to a weak/standard d-stationary … Read more

A unified analysis of descent sequences in weakly convex optimization, including convergence rates for bundle methods

We present a framework for analyzing convergence and local rates of convergence of a class of descent algorithms, assuming the objective function is weakly convex. The framework is general, in the sense that it combines the possibility of explicit iterations (based on the gradient or a subgradient at the current iterate), implicit iterations (using a … Read more

Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization

For strongly convex objectives that are smooth, the classical theory of gradient descent ensures linear convergence relative to the number of gradient evaluations. An analogous nonsmooth theory is challenging. Even when the objective is smooth at every iterate, the corresponding local models are unstable and the number of cutting planes invoked by traditional remedies is … Read more

Minimization of L1 over L2 for sparse signal recovery with convergence guarantee

The ratio of the $L_1$ and $L_2$ norms, denoted by $L_1/L_2$, becomes attractive due to its scale-invariant property when approximating the $L_0$ norm to promote sparsity. In this paper, we incorporate the $L_1/L_2$ formalism into an unconstrained model in order to deal with both noiseless and noisy observations. To design an efficient algorithm, we derive … Read more

On the linear convergence of the forward-backward splitting algorithm

In this paper, we establish a linear convergence result for the forward-backward splitting algorithm in the finding a zero of the sum of two maximal monotone operators, where one of them is set-valued strongly monotone and the other is Lipschitz continuous. We show that our convergence rate is better than Douglas–Rachford splitting algorithm’s rate used … Read more

Stochastic algorithms with geometric step decay converge linearly on sharp functions

Stochastic (sub)gradient methods require step size schedule tuning to perform well in practice. Classical tuning strategies decay the step size polynomially and lead to optimal sublinear rates on (strongly) convex problems. An alternative schedule, popular in nonconvex optimization, is called geometric step decay and proceeds by halving the step size after every few epochs. In … Read more