Global Convergence in Deep Learning with Variable Splitting via the Kurdyka-{\L}ojasiewicz Property

Deep learning has recently attracted a significant amount of attention due to its great empirical success. However, the effectiveness in training deep neural networks (DNNs) remains a mystery in the associated nonconvex optimizations. In this paper, we aim to provide some theoretical understanding on such optimization problems. In particular, the Kurdyka-{\L}ojasiewicz (KL) property is established … Read more

On Nonconvex Decentralized Gradient Descent

Consensus optimization has received considerable attention in recent years. A number of decentralized algorithms have been proposed for {convex} consensus optimization. However, to the behaviors or consensus \emph{nonconvex} optimization, our understanding is more limited. When we lose convexity, we cannot hope our algorithms always return global solutions though they sometimes still do sometimes. Somewhat surprisingly, … Read more

Global Convergence of ADMM in Nonconvex Nonsmooth Optimization

In this paper, we analyze the convergence of the alternating direction method of multipliers (ADMM) for minimizing a nonconvex and possibly nonsmooth objective function, $\phi(x_0,\ldots,x_p,y)$, subject to coupled linear equality constraints. Our ADMM updates each of the primal variables $x_0,\ldots,x_p,y$, followed by updating the dual variable. We separate the variable $y$ from $x_i$’s as it … Read more