Asynchronous Stochastic Coordinate Descent: Parallelism and Convergence Properties

We describe an asynchronous parallel stochastic proximal coordinate descent algorithm for minimizing a composite objective function, the sum of a smooth convex function and a separable convex function. In contrast to previous analyses, our model of asynchronous computation accounts for the fact that components of the unknown vector may be written by some cores while they are being read by others. Despite the complications arising from this possibility, the method achieves a linear convergence rate on functions that satisfy an optimal strong convexity property and a sublinear rate ($1/k$) on general convex functions. Near-linear speedup on a multicore system can be expected if the number of processors is $O(n^{1/4})$, where $n$ is the dimension of the unknown vector. We describe results from an implementation on ten cores of a multicore processor.
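
To make the computational model concrete, the following is a minimal C++ sketch of the kind of lock-free update loop the abstract describes, applied to a small LASSO-type instance $\min_x \frac{1}{2}\|Ax-b\|_2^2 + \lambda\|x\|_1$. The problem data, sizes, thread count, and the $1/L_i$ step size are illustrative assumptions rather than the paper's implementation (whose analysis prescribes a damped step size to absorb the effects of asynchrony); the point illustrated is that each thread reads the shared iterate without synchronization, so it may see components being written concurrently by other threads.

```cpp
// Sketch (not the authors' code) of asynchronous stochastic proximal
// coordinate descent: several threads update a shared iterate with no
// locks, so components of x may be written by one thread while another
// is reading them ("inconsistent read").
// Objective: F(x) = 0.5*||A x - b||^2 + lambda*||x||_1 (a LASSO instance).
#include <atomic>
#include <cmath>
#include <cstdio>
#include <random>
#include <thread>
#include <vector>

int main() {
    const int n = 50, m = 100, iters_per_thread = 5000;  // illustrative sizes
    const double lambda = 0.1;
    std::mt19937 gen(0);
    std::normal_distribution<double> nd(0.0, 1.0);

    // Random least-squares data: A is m x n (row-major), b has length m.
    std::vector<double> A(m * n), b(m);
    for (auto& a : A) a = nd(gen) / std::sqrt(static_cast<double>(m));
    for (auto& v : b) v = nd(gen);

    // Shared iterate: atomics make concurrent reads/writes well-defined,
    // but there is no synchronization across components.
    std::vector<std::atomic<double>> x(n);
    for (auto& xi : x) xi.store(0.0);

    // Per-coordinate Lipschitz constants L_i = ||A_{:,i}||^2.
    std::vector<double> L(n, 0.0);
    for (int j = 0; j < m; ++j)
        for (int i = 0; i < n; ++i) L[i] += A[j * n + i] * A[j * n + i];

    auto worker = [&](unsigned seed) {
        std::mt19937 g(seed);
        std::uniform_int_distribution<int> pick(0, n - 1);
        for (int t = 0; t < iters_per_thread; ++t) {
            int i = pick(g);
            // Inconsistent read: other threads may overwrite components of
            // x while this partial gradient is being accumulated.
            double grad = 0.0;
            for (int j = 0; j < m; ++j) {
                double r = -b[j];
                for (int k = 0; k < n; ++k)
                    r += A[j * n + k] * x[k].load(std::memory_order_relaxed);
                grad += A[j * n + i] * r;
            }
            // Proximal step on coordinate i: gradient step, then
            // soft-threshold. (The 1/L_i step is a textbook choice; the
            // paper's analysis uses a damped step size instead.)
            double step = 1.0 / L[i];
            double z = x[i].load(std::memory_order_relaxed) - step * grad;
            double thr = lambda * step;
            double xnew = (z > thr) ? z - thr : (z < -thr ? z + thr : 0.0);
            x[i].store(xnew, std::memory_order_relaxed);  // unsynchronized write-back
        }
    };

    std::vector<std::thread> pool;
    for (unsigned c = 0; c < 4; ++c) pool.emplace_back(worker, c + 1);
    for (auto& th : pool) th.join();

    // Report the final composite objective value.
    double obj = 0.0;
    for (int j = 0; j < m; ++j) {
        double r = -b[j];
        for (int k = 0; k < n; ++k) r += A[j * n + k] * x[k].load();
        obj += 0.5 * r * r;
    }
    for (int k = 0; k < n; ++k) obj += lambda * std::abs(x[k].load());
    std::printf("final objective: %f\n", obj);
    return 0;
}
```

The relaxed atomic loads and stores keep the sketch free of data-race undefined behavior while still permitting the interleaved, unsynchronized reads and writes that the paper's asynchronous model analyzes.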

Citation

Department of Computer Sciences, University of Wisconsin-Madison, 1210 W. Dayton St., Madison, WI 53706-1685, USA, March 2014.
