A First-Order Smoothing Technique for a Class of Large-Scale Linear Programs

We study a class of linear programming (LP) problems motivated by large-scale machine learning applications. After reformulating the LP as a convex nonsmooth problem, we apply Nesterov’s primal-dual smoothing technique. It turns out that the iteration complexity of the smoothing technique depends on a parameter $\theta$ that arises because we need to bound the originally … Read more
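
Nesterov's smoothing technique replaces a nonsmooth max-type objective with a smooth surrogate whose gradient is Lipschitz with constant on the order of $1/\mu$, then minimizes the surrogate with an accelerated gradient method. Below is a minimal sketch on the generic piecewise-linear objective $f(x) = \max_i (a_i^T x + b_i)$, smoothed via log-sum-exp; the toy data, the smoothing parameter $\mu$, and the iteration count are illustrative assumptions, not taken from the paper.

```python
# Minimal Nesterov-smoothing sketch: smooth f(x) = max_i (a_i^T x + b_i)
# by f_mu(x) = mu * log(sum_i exp((a_i^T x + b_i)/mu)) and minimize f_mu
# with Nesterov's accelerated gradient method.
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))   # rows a_i (toy data)
b = rng.standard_normal(50)
mu = 1e-2                           # smoothing parameter

def grad_f_mu(x):
    z = (A @ x + b) / mu
    w = np.exp(z - logsumexp(z))    # softmax weights, numerically stable
    return A.T @ w

L = np.linalg.norm(A, 2) ** 2 / mu  # Lipschitz constant of grad f_mu
x = y = np.zeros(10)
t = 1.0
for _ in range(500):                # accelerated gradient iterations
    x_new = y - grad_f_mu(y) / L
    t_new = 0.5 * (1 + np.sqrt(1 + 4 * t * t))
    y = x_new + (t - 1) / t_new * (x_new - x)
    x, t = x_new, t_new

print("smoothed objective:", mu * logsumexp((A @ x + b) / mu))
print("true max value:    ", np.max(A @ x + b))
```

The gap between the two printed values shrinks as $\mu$ decreases, at the price of a larger Lipschitz constant and hence more iterations, which is the trade-off that drives the iteration-complexity analysis.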

A proximal point algorithm for sequential feature extraction applications

We propose a proximal point algorithm to solve the LAROS problem, that is, the problem of finding a “large approximately rank-one submatrix”. The LAROS problem can be used to sequentially extract features from data. We also develop a new stopping criterion for the proximal point algorithm, based on the duality conditions of $\epsilon$-optimal solutions of … Read more
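
For readers unfamiliar with the method, a proximal point iteration solves a sequence of regularized subproblems $x_{k+1} = \arg\min_x f(x) + \frac{1}{2\lambda}\|x - x_k\|^2$. The sketch below shows the recursion, with a standard residual-based stopping rule, on a least-squares objective whose proximal step is a linear solve; it is not the paper's LAROS-specific subproblem or its duality-based criterion.

```python
# Minimal proximal point sketch on f(x) = 0.5*||A x - b||^2, whose
# proximal step reduces to a linear solve.  The stopping rule uses the
# subgradient residual (x_k - x_{k+1})/lam, a standard epsilon-optimality
# surrogate.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
lam, eps = 1.0, 1e-8

x = np.zeros(10)
H = A.T @ A + np.eye(10) / lam      # Hessian of the proximal subproblem
for k in range(1000):
    x_new = np.linalg.solve(H, A.T @ b + x / lam)
    if np.linalg.norm(x - x_new) / lam <= eps:  # (x - x_new)/lam is grad f(x_new)
        break
    x = x_new

print("iterations:", k, " residual:", np.linalg.norm(A @ x - b))
```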

Manifold Identification in Dual Averaging for Regularized Stochastic Online Learning

Iterative methods that calculate their steps from approximate subgradient directions have proved to be useful for stochastic learning problems over large and streaming data sets. When the objective consists of a loss function plus a nonsmooth regularization term, the solution often lies on a low-dimensional manifold of parameter space along which the regularizer is smooth. … Read more
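
A concrete instance of such a method is $\ell_1$-regularized dual averaging, where each iterate has a closed form obtained by soft-thresholding the running average of the stochastic gradients; coordinates whose averaged gradient stays below the regularization weight are set exactly to zero, which is the manifold-identification behavior at issue. The toy streaming problem and the parameter choices below are assumptions for illustration.

```python
# Minimal l1-regularized dual averaging sketch (in the spirit of RDA):
# average all past stochastic gradients and take a closed-form
# soft-thresholded step.  Zeroed coordinates sit on the sparse manifold.
import numpy as np

rng = np.random.default_rng(2)
d, lam, gamma = 20, 0.1, 5.0
w_true = np.zeros(d); w_true[:3] = [1.0, -2.0, 1.5]   # sparse ground truth

x = np.zeros(d)
gbar = np.zeros(d)
for t in range(1, 2001):
    a = rng.standard_normal(d)                    # one streaming example
    y = a @ w_true + 0.1 * rng.standard_normal()
    g = (x @ a - y) * a                           # squared-loss gradient
    gbar += (g - gbar) / t                        # running gradient average
    shrunk = np.sign(gbar) * np.maximum(np.abs(gbar) - lam, 0.0)
    x = -(np.sqrt(t) / gamma) * shrunk            # closed-form RDA step

print("nonzero coordinates:", np.flatnonzero(x))
```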

HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work aims to show, using novel theoretical analysis, algorithms, and implementation, that SGD can be implemented without … Read more
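
The lock-free idea is easy to sketch: multiple workers read possibly stale coordinates of a shared weight vector and write sparse updates back without any synchronization. The Python toy below (my own construction, not the paper's code) conveys the structure; note that CPython's GIL makes this concurrent rather than truly parallel, so the real speedups require C/C++ with atomic memory operations on sparse problems.

```python
# Lock-free SGD sketch in the spirit of HOGWILD!: several threads update
# one shared weight vector with sparse gradients and no locks.
import threading
import numpy as np

rng = np.random.default_rng(3)
d, n, eta = 100, 5000, 0.05
w_true = rng.standard_normal(d)
# sparse examples: each touches only a few coordinates
idx = [rng.choice(d, size=5, replace=False) for _ in range(n)]
X = [rng.standard_normal(5) for _ in range(n)]
y = [X[i] @ w_true[idx[i]] for i in range(n)]

w = np.zeros(d)                       # shared, updated without locking

def worker(rows):
    for _ in range(5):                # a few unsynchronized passes
        for i in rows:
            j, xi = idx[i], X[i]
            err = w[j] @ xi - y[i]    # read possibly-stale coordinates
            w[j] -= eta * err * xi    # unsynchronized sparse write

threads = [threading.Thread(target=worker, args=(range(k, n, 4),))
           for k in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print("error vs truth:", np.linalg.norm(w - w_true))
```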

Parallel Stochastic Gradient Algorithms for Large-Scale Matrix Completion

This paper develops Jellyfish, an algorithm for solving data-processing problems with matrix-valued decision variables regularized to have low rank. Particular examples of problems solvable by Jellyfish include matrix completion problems and least-squares problems regularized by the nuclear norm or the max-norm. Jellyfish implements a projected incremental gradient method with a biased, random ordering of the … Read more
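
The core computational kernel is a stochastic gradient pass over the observed entries of a factored model $M \approx LR^T$. The sketch below shows that kernel with a fresh random ordering each pass; Jellyfish's distinguishing feature, partitioning the updates so that parallel workers never touch the same rows, is deliberately omitted here, and the data and stepsizes are illustrative assumptions.

```python
# Incremental (stochastic) gradient sketch for low-rank matrix completion
# with factorization M ~ L @ R.T, shuffling the observed entries each pass.
import random
import numpy as np

rng = np.random.default_rng(4)
m, n, r, eta, mu = 60, 40, 3, 0.05, 1e-3
L_true = rng.standard_normal((m, r)); R_true = rng.standard_normal((n, r))
M = L_true @ R_true.T
obs = [(i, j) for i in range(m) for j in range(n) if rng.random() < 0.4]

L = 0.1 * rng.standard_normal((m, r))
R = 0.1 * rng.standard_normal((n, r))
for epoch in range(50):
    random.shuffle(obs)                       # random ordering each pass
    for i, j in obs:
        e = L[i] @ R[j] - M[i, j]
        Li = L[i].copy()
        L[i] -= eta * (e * R[j] + mu * L[i])  # regularized SGD step
        R[j] -= eta * (e * Li + mu * R[j])

err = np.mean([(L[i] @ R[j] - M[i, j]) ** 2 for i, j in obs])
print("mean squared error on observed entries:", err)
```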

Optimal Distributed Online Prediction using Mini-Batches

Online prediction methods are typically presented as serial algorithms running on a single processor. However, in the age of web-scale prediction problems, it is increasingly common to encounter situations where a single processor cannot keep up with the high rate at which inputs arrive. In this work we present the distributed mini-batch algorithm, a method … Read more
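
The mechanism is simple to sketch serially: each of $p$ workers computes a gradient on its shard of the current batch, the averaged gradient drives a single update, and the variance of each step drops in proportion to the total batch size. The toy below is my own serial simulation under assumed problem data; real deployments also pipeline communication with computation, which is not modeled here.

```python
# Serial simulation of the distributed mini-batch idea: p workers each
# process a shard, and one averaged gradient step is taken per batch.
import numpy as np

rng = np.random.default_rng(5)
d, p, shard = 10, 4, 8                # 4 workers, 8 examples each
w_true = rng.standard_normal(d)
w, eta = np.zeros(d), 0.1

for t in range(500):
    grads = []
    for _ in range(p):                # one shard per worker
        Xs = rng.standard_normal((shard, d))
        ys = Xs @ w_true + 0.1 * rng.standard_normal(shard)
        grads.append(Xs.T @ (Xs @ w - ys) / shard)
    w -= eta * np.mean(grads, axis=0) # single averaged update per batch

print("distance to w_true:", np.linalg.norm(w - w_true))
```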

An Alternating Direction Algorithm for Matrix Completion with Nonnegative Factors

This paper introduces a novel algorithm for the nonnegative matrix factorization and completion problem, which aims to find nonnegative matrices X and Y from a subset of entries of a nonnegative matrix M so that XY approximates M. This problem is closely related to two existing problems: nonnegative matrix factorization and low-rank matrix completion, … Read more
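
To make the problem concrete, the sketch below alternates least-squares updates for X and Y, projected onto the nonnegative orthant, while imputing the unobserved entries with the current model. It mirrors the alternating structure of the approach but is not the paper's exact alternating direction splitting; the data, rank, and regularization are assumptions.

```python
# Alternating nonnegative least-squares sketch for factorization with
# missing entries: impute unobserved entries, then update X and Y in turn.
import numpy as np

rng = np.random.default_rng(6)
m, n, r = 40, 30, 4
M = np.abs(rng.standard_normal((m, r))) @ np.abs(rng.standard_normal((r, n)))
mask = rng.random((m, n)) < 0.5       # observed entries

X = np.abs(rng.standard_normal((m, r)))
Y = np.abs(rng.standard_normal((r, n)))
for it in range(200):
    Z = np.where(mask, M, X @ Y)      # impute missing entries
    X = np.maximum(Z @ Y.T @ np.linalg.inv(Y @ Y.T + 1e-8 * np.eye(r)), 0)
    Z = np.where(mask, M, X @ Y)
    Y = np.maximum(np.linalg.inv(X.T @ X + 1e-8 * np.eye(r)) @ X.T @ Z, 0)

print("observed-entry error:", np.linalg.norm(mask * (X @ Y - M)))
```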

Finding approximately rank-one submatrices with the nuclear norm and l1 norm

We propose a convex optimization formulation with the nuclear norm and $\ell_1$-norm to find a large approximately rank-one submatrix of a given nonnegative matrix. We develop optimality conditions for the formulation and characterize the properties of the optimal solutions. We establish conditions under which the optimal solution of the convex formulation has a specific sparse … Read more
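
Any splitting method for an objective combining a nuclear-norm term and an $\ell_1$ term revolves around two proximal operators: singular value thresholding for $\|X\|_*$ and entrywise soft thresholding for $\|X\|_1$. Below is a generic ADMM sketch for the denoising model $\min_X \frac{1}{2}\|X - A\|_F^2 + \tau\|X\|_* + \lambda\|X\|_1$, an illustrative stand-in rather than the paper's exact LAROS formulation.

```python
# Generic ADMM sketch combining the nuclear-norm and l1 proximal operators.
import numpy as np

def svt(V, t):                         # prox of t*||.||_*: singular value thresholding
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt

def soft(V, t):                        # prox of t*||.||_1: soft thresholding
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

rng = np.random.default_rng(7)
A = np.abs(rng.standard_normal((30, 20)))
tau, lam, rho = 1.0, 0.05, 1.0

X = Z = np.zeros_like(A); U = np.zeros_like(A)
for _ in range(300):
    # X-step: prox of 0.5||X-A||^2 + tau||X||_* at (Z - U), in closed form
    X = svt((A + rho * (Z - U)) / (1 + rho), tau / (1 + rho))
    Z = soft(X + U, lam / rho)         # Z-step: l1 prox
    U = U + X - Z                      # dual update

print("rank:", np.linalg.matrix_rank(Z, tol=1e-6),
      " nonzeros:", int(np.count_nonzero(np.round(Z, 6))))
```

The printed rank and sparsity show how the two norms jointly push the solution toward a sparse, near-rank-one matrix, the structure the paper's optimality conditions characterize.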

Calibrating Artificial Neural Networks by Global Optimization

An artificial neural network (ANN) is a computational model, implemented as a computer program, that aims to emulate the key features and operations of biological neural networks. ANNs are extensively used to model unknown or unspecified functional relationships between the input and output of a “black box” system. In order to apply … Read more
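
Calibration here means fitting the network's weights to input-output data, and because the training loss is multimodal, a global optimizer is used in place of local gradient descent. In the sketch below, differential evolution (a stand-in global solver, not necessarily the one used in the paper) calibrates a tiny one-hidden-layer network on assumed toy data.

```python
# Calibrating a small ANN by global optimization over its weight vector,
# using scipy's differential evolution as the global solver.
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(8)
x = np.linspace(-2, 2, 50)
y = np.sin(2 * x)                     # "black box" data to emulate
h = 4                                 # hidden units; 3*h + 1 weights total

def loss(w):
    W1, b1 = w[:h], w[h:2*h]
    W2, b2 = w[2*h:3*h], w[3*h]
    pred = np.tanh(np.outer(x, W1) + b1) @ W2 + b2
    return np.mean((pred - y) ** 2)

res = differential_evolution(loss, bounds=[(-5, 5)] * (3 * h + 1), seed=0,
                             maxiter=300, tol=1e-8)
print("calibrated mean squared error:", res.fun)
```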

Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: a Generic Algorithmic Framework

In this paper we present a generic algorithmic framework, namely the accelerated stochastic approximation (AC-SA) algorithm, for solving strongly convex stochastic composite optimization (SCO) problems. While classical stochastic approximation (SA) algorithms are asymptotically optimal for solving differentiable and strongly convex problems, the AC-SA algorithm, when employed with proper stepsize policies, can achieve optimal or … Read more
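
To illustrate the flavor of the framework, the sketch below runs a three-sequence accelerated recursion (search point $x^{md}$, prox iterate $x$, aggregate $x^{ag}$) on a strongly convex quadratic with noisy gradients, using the stepsize policy $\alpha_t = 2/(t+1)$, $\gamma_t = 4L/(t(t+1))$ in the Euclidean prox case. The constants and problem data are simplified assumptions, so treat this as illustrative rather than the paper's exact scheme.

```python
# Accelerated stochastic approximation sketch (Euclidean case) on a
# strongly convex quadratic with a noisy first-order oracle.
import numpy as np

rng = np.random.default_rng(9)
d = 20
Q = np.diag(np.linspace(1.0, 10.0, d))   # Hessian: mu = 1, L = 10
mu, Lc, sigma = 1.0, 10.0, 0.5
x_star = rng.standard_normal(d)

def noisy_grad(x):                        # stochastic first-order oracle
    return Q @ (x - x_star) + sigma * rng.standard_normal(d)

x = xag = np.zeros(d)
for t in range(1, 2001):
    alpha = 2.0 / (t + 1)
    gamma = 4.0 * Lc / (t * (t + 1))
    denom = gamma + (1 - alpha ** 2) * mu
    xmd = ((1 - alpha) * (mu + gamma) * xag
           + alpha * ((1 - alpha) * mu + gamma) * x) / denom
    G = noisy_grad(xmd)
    x = (alpha * mu * xmd + ((1 - alpha) * mu + gamma) * x
         - alpha * G) / (mu + gamma)
    xag = alpha * x + (1 - alpha) * xag

print("distance to optimum:", np.linalg.norm(xag - x_star))
```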