Benjamin Recht – Optimization Online

Gradient Descent only Converges to Minimizers

Published: 2016/02/17

Unconstrained Optimization gradient descent, local minimizer, nonconvex optimization, saddle point problem

We show that gradient descent converges to a local minimizer, almost surely with random initialization. This is proved by applying the Stable Manifold Theorem from dynamical systems theory.
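As a rough illustration of the claim in this abstract, the sketch below runs plain gradient descent from random starting points on a toy function with one saddle point and two minimizers. The function f(x, y) = x^2/2 + y^4/4 - y^2/2, the step size 0.05, the iteration count, and the sampling box [-1, 1]^2 are all assumptions chosen for this example and do not come from the paper.

```python
import random


def grad(x, y):
    # Gradient of the toy function f(x, y) = x**2 / 2 + y**4 / 4 - y**2 / 2.
    # Critical points: a saddle at (0, 0) and minimizers at (0, 1) and (0, -1).
    return x, y**3 - y


def gradient_descent(x, y, step=0.05, iters=2000):
    # Fixed-step gradient descent; step and iters are illustrative choices.
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


def run_trials(n_trials=100, seed=0):
    # Draw starting points uniformly from [-1, 1]^2 and record the final iterates.
    rng = random.Random(seed)
    return [
        gradient_descent(rng.uniform(-1, 1), rng.uniform(-1, 1))
        for _ in range(n_trials)
    ]


if __name__ == "__main__":
    finals = run_trials()
    at_minimizer = sum(
        1 for x, y in finals if abs(x) < 1e-6 and abs(abs(y) - 1) < 1e-6
    )
    print(f"{at_minimizer} of {len(finals)} random starts ended at a minimizer")
    # A start exactly on the line y = 0 stays on it and approaches the saddle.
    print("start on y = 0 ends at:", gradient_descent(0.5, 0.0))
```

In this toy problem the starting points that lead to the saddle form the line y = 0, which a uniform random draw hits with probability zero. That is the kind of measure-zero set the abstract's "almost surely" refers to.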

Compressed Sensing Off the Grid

Published: 2012/09/13

Badri Narayan Bhaskar

Benjamin Recht

Parikshit Shah

Gongguo Tang

Convex Optimization, Data-Mining, Semi-definite Programming atomic norm, basis mismatch, compressed sensing, continuous dictionary, line spectral estimation, nuclear norm relaxation, prony’s method, sparsity

We consider the problem of estimating the frequency components of a mixture of s complex sinusoids from a random subset of n regularly spaced samples. Unlike previous work in compressed sensing, the frequencies are not assumed to lie on a grid, but can assume any values in the normalized frequency domain [0, 1]. We propose …
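To give a feel for why the grid assumption matters (the "basis mismatch" keyword above), the sketch below compares the discrete Fourier transform of a sinusoid whose frequency sits on the DFT grid with one that sits halfway between two grid points. This is only a motivating illustration and not the method proposed in the paper. The sample count n = 64, the two frequencies, and the 1% threshold are assumptions chosen for the example.

```python
import cmath
import math


def sinusoid(freq, n):
    # n regularly spaced samples of a unit-amplitude complex sinusoid,
    # with freq given in the normalized frequency domain [0, 1].
    return [cmath.exp(2j * math.pi * freq * k) for k in range(n)]


def dft(signal):
    # Naive O(n^2) discrete Fourier transform, adequate for a small example.
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


def significant_bins(freq, n=64, tol=0.01):
    # Count DFT coefficients whose magnitude exceeds tol * n.
    return sum(1 for c in dft(sinusoid(freq, n)) if abs(c) > tol * n)


if __name__ == "__main__":
    print(significant_bins(8 / 64))    # on the grid: prints 1
    print(significant_bins(8.5 / 64))  # halfway between grid points: prints 64
```

The on-grid sinusoid is exactly one DFT basis vector, so a single coefficient carries the whole signal. The off-grid sinusoid leaks into every bin, so it is no longer sparse in the gridded dictionary.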

Factoring nonnegative matrices with linear programs

Published: 2012/06/06

Data-Mining, Linear Programming linear programming, machine learning, multicore, nonnegative matrix factorization, parallel computing, stochastic gradient descent

This paper describes a new approach for computing nonnegative matrix factorizations (NMFs) with linear programming. The key idea is a data-driven model for the factorization, in which the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C that …

Linear System Identification via Atomic Norm Regularization

Published: 2012/04/03

Badri Narayan Bhaskar

Benjamin Recht

Parikshit Shah

Gongguo Tang

Control Applications, Convex Optimization atomic norms, basis pursuit, hankel operators, system identification

This paper proposes a new algorithm for linear system identification from noisy measurements. The proposed algorithm balances a data fidelity term with a norm induced by the set of single pole filters. We pose a convex optimization problem that approximately solves the atomic norm minimization problem and identifies the unknown system from noisy linear measurements. …

Decomposition Methods for Large Scale LP Decoding

Published: 2012/04/02

Applications - Science and Engineering, Linear Programming

When binary linear error-correcting codes are used over symmetric channels, a relaxed version of the maximum likelihood decoding problem can be stated as a linear program (LP). This LP decoder can be used to decode at bit-error-rates comparable to state-of-the-art belief propagation (BP) decoders, but with significantly stronger theoretical guarantees. However, LP decoding when implemented …

Atomic norm denoising with applications to line spectral estimation

Published: 2012/04/02

Badri Narayan Bhaskar

Benjamin Recht

Gongguo Tang

Convex Optimization, Data-Mining

The sub-Nyquist estimation of line spectra is a classical problem in signal processing, but currently popular subspace-based techniques have few guarantees in the presence of noise and rely on a priori knowledge about system model order. Motivated by recent work on atomic norms in inverse problems, we propose a new approach to line spectral estimation …

Beneath the valley of the noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences

Published: 2012/02/19

Christopher Re

Benjamin Recht

Nonlinear Optimization

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling with- and without-replacement in such algorithms. Focusing on least mean squares optimization, we formulate a …
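The two sampling schemes compared in this abstract can be sketched on a small least squares problem. The code below is a hypothetical toy setup, not the paper's experiments: the problem sizes, the seeds, the consistent (noise-free) linear system, and the Kaczmarz-type step size 1 / ||a_i||^2 are all assumptions made for this example. It runs the same incremental gradient update with indices drawn either with replacement or as a fresh random permutation each epoch, and prints the final distance to the true solution for each ordering.

```python
import random


def make_problem(n_rows=20, dim=5, seed=0):
    # Random consistent system: targets[i] = rows[i] . x_true (no noise).
    rng = random.Random(seed)
    x_true = [rng.gauss(0, 1) for _ in range(dim)]
    rows = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_rows)]
    targets = [sum(a * x for a, x in zip(row, x_true)) for row in rows]
    return rows, targets, x_true


def lms_step(x, row, target):
    # One incremental gradient step on (row . x - target)**2 / 2
    # with step size 1 / ||row||**2 (an assumed, Kaczmarz-type choice).
    residual = sum(a * xi for a, xi in zip(row, x)) - target
    scale = residual / sum(a * a for a in row)
    return [xi - scale * a for xi, a in zip(x, row)]


def distance(x, x_true):
    return sum((a - b) ** 2 for a, b in zip(x, x_true)) ** 0.5


def run(rows, targets, x_true, epochs=100, with_replacement=True, seed=1):
    rng = random.Random(seed)
    n = len(rows)
    x = [0.0] * len(x_true)
    for _ in range(epochs):
        if with_replacement:
            # n independent draws: an index may repeat or be skipped.
            order = [rng.randrange(n) for _ in range(n)]
        else:
            # A fresh random permutation: every index appears exactly once.
            order = rng.sample(range(n), n)
        for i in order:
            x = lms_step(x, rows[i], targets[i])
    return distance(x, x_true)


if __name__ == "__main__":
    rows, targets, x_true = make_problem()
    print("with replacement:   ", run(rows, targets, x_true, with_replacement=True))
    print("without replacement:", run(rows, targets, x_true, with_replacement=False))
```

Both variants perform the same number of updates per epoch, so any difference in the printed errors comes only from the order in which the rows are visited.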

HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

Published: 2011/06/28, Updated: 2011/11/11

Data-Mining, Nonlinear Optimization, Stochastic Programming incremental gradient methods, machine learning, multicore, parallel computing, stochastic gradient descent

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work aims to show, using novel theoretical analysis, algorithms, and implementation, that SGD can be implemented without …

Parallel Stochastic Gradient Algorithms for Large-Scale Matrix Completion

Published: 2011/04/26, Updated: 2013/03/22

Christopher Re

Benjamin Recht

Data-Mining, Nonlinear Optimization, Optimization Software and Modeling Systems incremental gradient methods, matrix completion, multicore, parallel computing

This paper develops Jellyfish, an algorithm for solving data-processing problems with matrix-valued decision variables regularized to have low rank. Particular examples of problems solvable by Jellyfish include matrix completion problems and least-squares problems regularized by the nuclear norm or the max-norm. Jellyfish implements a projected incremental gradient method with a biased, random ordering of the …

The Convex Geometry of Linear Inverse Problems

Published: 2010/12/12

Venkat Chandrasekaran

Benjamin Recht

Pablo A. Parrilo

Alan S. Willsky

Applications - Science and Engineering, Convex Optimization, Semi-definite Programming

In applications throughout science and engineering one is often faced with the challenge of solving an ill-posed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However, in many practical situations of interest, models are constrained structurally so that they only have a few degrees …