Optimal Extragradient-Based Stochastic Bilinearly-Coupled Saddle-Point Optimization

We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}}~F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) – G(\mathbf{y})$, where one has access to stochastic first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$. Building upon standard stochastic extragradient analysis for variational inequalities, we present a stochastic \emph{accelerated gradient-extragradient (AG-EG)} descent-ascent algorithm that combines extragradient and Nesterov’s … Read more

ROOT-SGD: Sharp Nonasymptotics and Asymptotic Efficiency in a Single Algorithm

We study the problem of solving strongly convex and smooth unconstrained optimization problems using stochastic first-order algorithms. We devise a novel algorithm, referred to as \emph{Recursive One-Over-T SGD} (ROOTSGD), based on an easily implementable, recursive averaging of past stochastic gradients. We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and … Read more

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations

Gradient-based optimization algorithms can be studied from the perspective of limiting or- dinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distin- guish between two fundamentally different algorithms—Nesterov’s accelerated gradient method for strongly convex functions (NAG-SC) and Polyak’s heavy-ball method—we study an alter- native limiting process that yields high-resolution ODEs. We … Read more

Gradient Descent only Converges to Minimizers

We show that gradient descent converges to a local minimizer, almost surely with random initialization. This is proved by applying the Stable Manifold Theorem from dynamical systems theory. Article Download View Gradient Descent only Converges to Minimizers

A direct formulation for sparse PCA using semidefinite programming

We examine the problem of approximating, in the Frobenius-norm sense, a positive, semidefinite symmetric matrix by a rank-one matrix, with an upper bound on the cardinality of its eigenvector. The problem arises in the decomposition of a covariance matrix into sparse factors, and has wide applications ranging from biology to finance. We use a modification … Read more