Discerning the linear convergence of ADMM for structured convex optimization through the lens of variational analysis

Despite the rich literature, the linear convergence of the alternating direction method of multipliers (ADMM) has not been fully understood, even for the convex case. For example, the linear convergence of ADMM can be empirically observed in a wide range of applications, while existing theoretical results seem to be too stringent to be satisfied or too …

Global Convergence in Deep Learning with Variable Splitting via the Kurdyka-Łojasiewicz Property

Deep learning has recently attracted a significant amount of attention due to its great empirical success. However, the effectiveness of training deep neural networks (DNNs) remains a mystery, given the associated nonconvex optimization problems. In this paper, we aim to provide some theoretical understanding of such optimization problems. In particular, the Kurdyka-Łojasiewicz (KL) property is established …

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations

Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms, Nesterov’s accelerated gradient method for strongly convex functions (NAG-SC) and Polyak’s heavy-ball method, we study an alternative limiting process that yields high-resolution ODEs. We …

Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods

Logistic regression is one of the most popular methods in binary classification, wherein estimation of model parameters is carried out by solving the maximum likelihood (ML) optimization problem, and the ML estimator is defined to be the optimal solution of this problem. It is well known that the ML estimator exists when the data is …

Exploiting Low-Rank Structure in Semidefinite Programming by Approximate Operator Splitting

In contrast with many other convex optimization classes, state-of-the-art semidefinite programming solvers are not yet able to efficiently solve large-scale instances. This work aims to reduce this scalability gap by proposing a novel proximal algorithm for solving general semidefinite programming problems. The proposed methodology, based on the primal-dual hybrid gradient method, allows the presence of …

POLO: a POLicy-based Optimization library

We present POLO, a C++ library for large-scale parallel optimization research that emphasizes ease of use, flexibility and efficiency in algorithm design. It uses multiple inheritance and template programming to decompose algorithms into essential policies and facilitate code reuse. With its clear separation between algorithm and execution policies, it provides researchers with a simple and powerful …

Low-M-Rank Tensor Completion and Robust Tensor PCA

In this paper, we propose a new approach to solving low-rank tensor completion and robust tensor PCA. Our approach is based on novel notions of (even-order) tensor rank, called the M-rank, the symmetric M-rank, and the strongly symmetric M-rank. We discuss the connections between these new tensor ranks and the CP-rank and …

Analysis of Limited-Memory BFGS on a Class of Nonsmooth Convex Functions

The limited-memory BFGS (L-BFGS) method is widely used for large-scale unconstrained optimization, but its behavior on nonsmooth problems has received little attention. L-BFGS can be used with or without “scaling”; the use of scaling is normally recommended. A simple special case, when just one BFGS update is stored and used at every iteration, is …

Hamiltonian Descent Methods

We propose a family of optimization methods that achieve linear convergence using first-order gradient information and constant step sizes on a class of convex functions much larger than the smooth and strongly convex ones. This larger class includes functions whose second derivatives may be singular or unbounded at their minima. Our methods are discretizations of …

An inertial extrapolation method for convex simple bilevel optimization

We consider a scalar objective minimization problem over the solution set of another optimization problem. This problem is known as the simple bilevel optimization problem and has drawn significant attention in the last few years. Our inner problem consists of minimizing the sum of smooth and nonsmooth functions while the outer one is the minimization …