A bound on the Carathéodory number

The Carathéodory number $k(K)$ of a pointed closed convex cone $K$ is the smallest $k$ such that every element of $K$ can be written as a nonnegative linear combination of at most $k$ elements belonging to extreme rays. Carathéodory’s Theorem gives the bound $k(K) \le \dim(K)$. In this work we observe … Read more
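
As a concrete illustration (ours, not taken from the abstract): the nonnegative orthant attains the bound, $k(\mathbb{R}^n_+) = n = \dim(\mathbb{R}^n_+)$, whereas the positive semidefinite cone stays far below it, $k(\mathcal{S}^n_+) = n \ll n(n+1)/2 = \dim(\mathcal{S}^n_+)$, since every positive semidefinite matrix is a sum of at most $n$ rank-one matrices, which generate its extreme rays.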

On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions

We consider the gradient (or steepest) descent method with exact line search applied to a strongly convex function with Lipschitz continuous gradient. We establish the exact worst-case rate of convergence of this scheme, and show that this worst-case behavior is exhibited by a certain convex quadratic function. We also extend the result to a noisy … Read more
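
A minimal sketch of the setting (the quadratic, starting point, and iteration count below are assumptions for illustration, not the paper's worst-case construction): on a convex quadratic, the exact line-search step has a closed form.

import numpy as np

def gd_exact_line_search(A, b, x0, iters=50):
    # minimize f(x) = 0.5 * x^T A x - b^T x by steepest descent
    x = x0.astype(float)
    for _ in range(iters):
        g = A @ x - b                      # gradient of the quadratic
        if np.linalg.norm(g) < 1e-12:
            break
        t = (g @ g) / (g @ (A @ g))        # exact minimizer along -g
        x = x - t * g
    return x

# Ill-conditioned example with L/ell = 100, where steepest descent zig-zags.
A = np.diag([1.0, 100.0])
b = np.zeros(2)
x_hat = gd_exact_line_search(A, b, x0=np.array([100.0, 1.0]))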

Exact Worst-case Performance of First-order Methods for Composite Convex Optimization

We provide a framework for computing the exact worst-case performance of any algorithm belonging to a broad class of oracle-based first-order methods for composite convex optimization, including those performing explicit, projected, proximal, conditional and inexact (sub)gradient steps. We simultaneously obtain tight worst-case guarantees and explicit instances of optimization problems on which the algorithm reaches this … Read more

Convergence Analysis of ISTA and FISTA for “Strongly + Semi” Convex Programming

The iterative shrinkage/thresholding algorithm (ISTA) and its faster version FISTA have been widely used in the literature. In this paper, we consider general versions of ISTA and FISTA in the broader “strongly + semi” convex setting, i.e., minimizing the sum of a strongly convex function and a semiconvex function, and conduct convergence analysis … Read more
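
A minimal ISTA sketch for the classical $\ell_1$-regularized least-squares problem (an assumed example; the paper's “strongly + semi” convex setting is more general than this):

import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, iters=200):
    # minimize 0.5*||Ax - b||^2 + lam*||x||_1
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)               # gradient of the smooth part
        x = soft_threshold(x - grad / L, lam / L)   # shrinkage/thresholding step
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
x_hat = ista(A, b, lam=0.5)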

Efficient Subgradient Methods for General Convex Optimization

A subgradient method is presented for solving general convex optimization problems, the main requirement being that a strictly feasible point is known. A feasible sequence of iterates is generated, which converges to within a user-specified error of optimality. Feasibility is maintained with a line search at each iteration, avoiding the need for orthogonal projections onto the feasible region … Read more
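
One generic way to realize this idea (an assumed sketch, not necessarily the paper's algorithm): after each subgradient step, pull the iterate back toward a known strictly feasible point e until the constraint g(x) <= 0 holds again, instead of projecting.

import numpy as np

def feasible_subgradient_step(x, e, f_subgrad, g, step):
    y = x - step * f_subgrad(x)            # plain subgradient step
    t = 1.0
    while g((1 - t) * e + t * y) > 0:      # halve until feasibility is restored
        t *= 0.5
    return (1 - t) * e + t * y

# Toy problem: minimize ||x - c||_1 over the unit ball g(x) = ||x||^2 - 1 <= 0.
c = np.array([2.0, 0.0])
f_subgrad = lambda x: np.sign(x - c)
g = lambda x: x @ x - 1.0
x = e = np.zeros(2)                        # e = 0 is strictly feasible
for k in range(1, 101):
    x = feasible_subgradient_step(x, e, f_subgrad, g, step=1.0 / k)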

Application of Facial Reduction to $H_\infty$ State Feedback Control Problem

One often encounters numerical difficulties in solving linear matrix inequality (LMI) problems obtained from $H_\infty$ control problems. We discuss the reason from the viewpoint of optimization, and provide necessary and sufficient conditions for the LMI problem and its dual not to be strongly feasible. Moreover, we interpret these conditions in terms of the control system. In this analysis, … Read more

Local Convergence Properties of Douglas–Rachford and ADMM

The Douglas–Rachford (DR) and alternating direction method of multipliers (ADMM) are two proximal splitting algorithms designed to minimize the sum of two proper lower semi-continuous convex functions whose proximity operators are easy to compute. The goal of this work is to understand the local linear convergence behaviour of DR/ADMM when the involved functions are moreover … Read more
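
A minimal Douglas–Rachford sketch (an assumed example with two simple proximity operators, not one of the paper's case studies): minimize f(x) + g(x) with f(x) = 0.5*||x - a||^2 and g(x) = lam*||x||_1.

import numpy as np

def prox_f(z, gamma, a):
    return (z + gamma * a) / (1.0 + gamma)          # prox of 0.5*||. - a||^2

def prox_g(z, gamma, lam):
    return np.sign(z) * np.maximum(np.abs(z) - gamma * lam, 0.0)  # soft threshold

def douglas_rachford(a, lam, gamma=1.0, iters=100):
    z = np.zeros_like(a)
    for _ in range(iters):
        x = prox_f(z, gamma, a)
        y = prox_g(2 * x - z, gamma, lam)
        z = z + y - x                               # DR fixed-point update
    return prox_f(z, gamma, a)

x_hat = douglas_rachford(a=np.array([3.0, -0.2, 0.0]), lam=1.0)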

Barzilai-Borwein Step Size for Stochastic Gradient Descent

One of the major issues in stochastic gradient descent (SGD) methods is how to choose an appropriate step size while running the algorithm. Since the traditional line search technique does not apply to stochastic optimization algorithms, the common practice in SGD is either to use a diminishing step size, or to tune a fixed step … Read more
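
A simplified sketch of the Barzilai-Borwein step size (an assumption; the paper's SGD-BB scheme additionally averages over epochs and smooths the step, which is omitted here): eta = ||s||^2 / |s^T y| with s = x_k - x_{k-1} and y = g_k - g_{k-1}.

import numpy as np

def bb_step(x_prev, x_curr, g_prev, g_curr, eps=1e-10):
    s = x_curr - x_prev
    y = g_curr - g_prev
    return (s @ s) / max(abs(s @ y), eps)           # BB1 formula

# Toy usage on least squares, with full gradients standing in for the
# stochastic (mini-batch) gradients used by SGD.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 5)), rng.standard_normal(50)
grad = lambda x: A.T @ (A @ x - b) / len(b)
x_prev = np.zeros(5)
g_prev = grad(x_prev)
x = x_prev - 0.01 * g_prev
for _ in range(50):
    g = grad(x)
    eta = bb_step(x_prev, x, g_prev, g)
    x_prev, g_prev = x, g
    x = x - eta * g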

Chebyshev Inequalities for Products of Random Variables

We derive sharp probability bounds on the tails of a product of symmetric non-negative random variables using only information about their first two moments. If the covariance matrix of the random variables is known exactly, these bounds can be computed numerically using semidefinite programming. If only an upper bound on the covariance matrix is available, … Read more

A unified convergence bound for conjugate gradient and accelerated gradient

Nesterov’s accelerated gradient method for minimizing a smooth strongly convex function $f$ is known to reduce $f(x_k)-f(x^*)$ by a factor of $\epsilon\in(0,1)$ after $k\ge O(\sqrt{L/\ell}\log(1/\epsilon))$ iterations, where $\ell,L$ are the two parameters of smooth strong convexity. Furthermore, it is known that this is the best possible complexity in the function-gradient oracle model of computation. The … Read more
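
A minimal sketch of Nesterov's accelerated gradient method in the constant-momentum form valid for an $\ell$-strongly convex, $L$-smooth function (the quadratic test problem below is an assumption, not from the paper):

import numpy as np

def nesterov_agd(grad, x0, ell, L, iters=200):
    kappa = L / ell
    beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)   # momentum coefficient
    x = y = x0.copy()
    for _ in range(iters):
        x_next = y - grad(y) / L           # gradient step at the extrapolated point
        y = x_next + beta * (x_next - x)   # momentum extrapolation
        x = x_next
    return x

A = np.diag([1.0, 100.0])                  # ell = 1, L = 100
x_hat = nesterov_agd(lambda z: A @ z, x0=np.array([1.0, 1.0]), ell=1.0, L=100.0)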