Blessing of Nonconvexity in Deep Linear Models: Depth Flattens the Optimization Landscape Around the True Solution

This work characterizes the effect of depth on the optimization landscape of linear regression, showing that, despite their nonconvexity, deeper models have a more desirable optimization landscape. We consider a robust and over-parameterized setting, where a subset of the measurements is grossly corrupted with noise and the true linear model is captured via an $N$-layer linear neural …
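To make the setting concrete, below is a minimal sketch of an over-parameterized depth-$N$ linear model trained by subgradient descent on a robust $\ell_1$ loss. The sizes, corruption model, near-identity initialization, and step size are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative instance (all parameters are assumptions for this sketch).
m, d, N = 300, 20, 3                      # m measurements, dimension d, depth N
A = rng.standard_normal((m, d))
w_true = rng.standard_normal(d)
y = A @ w_true
bad = rng.random(m) < 0.2                 # a subset of measurements, grossly corrupted
y[bad] += 20 * rng.standard_normal(bad.sum())

# Over-parameterized depth-N model: w = W_{N-1} ... W_1 v.
Ws = [np.eye(d) + 0.01 * rng.standard_normal((d, d)) for _ in range(N - 1)]
v = 0.01 * rng.standard_normal(d)

def forward(Ws, v):
    zs = [v]                              # z_0 = v, z_k = W_k z_{k-1}
    for W in Ws:
        zs.append(W @ zs[-1])
    return zs

eta = 0.05                                # constant step for simplicity
for _ in range(3000):
    zs = forward(Ws, v)
    resid = A @ zs[-1] - y
    delta = A.T @ np.sign(resid) / m      # subgradient of the robust l1 loss
    for k in range(len(Ws) - 1, -1, -1):  # backpropagate through the factors
        Gk = np.outer(delta, zs[k])
        delta = Ws[k].T @ delta           # uses the pre-update factor
        Ws[k] -= eta * Gk
    v -= eta * delta

print("recovery error:", np.linalg.norm(forward(Ws, v)[-1] - w_true))
```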

Preconditioned Gradient Descent for Overparameterized Nonconvex Burer–Monteiro Factorization with Global Optimality Certification

We consider using gradient descent to minimize the nonconvex function $f(X)=\phi(XX^{T})$ over an $n\times r$ factor matrix $X$, in which $\phi$ is an underlying smooth convex cost function defined over $n\times n$ matrices. While only a second-order stationary point $X$ can be provably found in reasonable time, if $X$ is additionally \emph{rank deficient}, then its …
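As a rough illustration of the preconditioning idea, the sketch below runs gradient descent on $f(X)=\phi(XX^{T})$ with the search direction right-scaled by $(X^{T}X+\lambda I)^{-1}$. The choice $\phi(M)=\tfrac{1}{2}\|M-M^{\star}\|_F^2$, the residual-driven damping $\lambda$, and the step size are assumptions made for this example, not the paper's prescription.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed instance: phi(M) = 0.5 * ||M - M_star||_F^2; the setting allows any
# smooth convex phi over n x n matrices. r > true rank, i.e. over-parameterized.
n, r_true, r = 30, 2, 5
U = rng.standard_normal((n, r_true))
M_star = U @ U.T

X = 0.1 * rng.standard_normal((n, r))
eta = 0.2                                 # step size (assumed value)

for _ in range(1000):
    R = X @ X.T - M_star                  # grad of phi at M = X X^T
    G = 2 * R @ X                         # gradient of f(X) = phi(X X^T)
    lam = np.sqrt(np.linalg.norm(R))      # damping tied to the residual (one heuristic)
    # Right-preconditioning rescales the flat directions created by the
    # redundant columns of an over-parameterized X.
    X -= eta * G @ np.linalg.inv(X.T @ X + lam * np.eye(r))

print("error:", np.linalg.norm(X @ X.T - M_star))
```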

Global Convergence of Sub-gradient Method for Robust Matrix Recovery: Small Initialization, Noisy Measurements, and Over-parameterization

In this work, we study the performance of the sub-gradient method (SubGM) on a natural nonconvex and nonsmooth formulation of low-rank matrix recovery with $\ell_1$-loss, where the goal is to recover a low-rank matrix from a limited number of measurements, a subset of which may be grossly corrupted with noise. We study a scenario where the …
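Here is a minimal sketch of SubGM on this $\ell_1$ formulation: Gaussian measurements of a low-rank PSD matrix, a corrupted subset, small initialization, and an over-parameterized factor ($r$ larger than the true rank). All instance parameters and the geometric step schedule are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed instance: Gaussian measurements of a PSD rank-2 matrix, 30% of them
# grossly corrupted; sizes, corruption level, and step schedule are illustrative.
n, r_true, r, m = 20, 2, 5, 600           # over-parameterized: r > r_true
U = rng.standard_normal((n, r_true))
M_star = U @ U.T
As = rng.standard_normal((m, n, n))
y = np.einsum('kij,ij->k', As, M_star)
bad = rng.random(m) < 0.3
y[bad] += 50 * rng.standard_normal(bad.sum())

X = 1e-3 * rng.standard_normal((n, r))    # small initialization

eta = 0.1
for _ in range(2000):
    resid = np.einsum('kij,ij->k', As, X @ X.T) - y
    S = np.einsum('k,kij->ij', np.sign(resid), As) / m
    X -= eta * (S + S.T) @ X              # subgradient of (1/m) ||A(X X^T) - y||_1
    eta *= 0.998                          # geometrically decaying step size

print("recovery error:", np.linalg.norm(X @ X.T - M_star))
```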

A Graph-based Decomposition Method for Convex Quadratic Optimization with Indicators

In this paper, we consider convex quadratic optimization problems with indicator variables when the matrix $Q$ defining the quadratic term in the objective is sparse. We use a graphical representation of the support of $Q$, and show that if this graph is a path, then we can solve the associated problem in polynomial time. This …
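For concreteness, this problem class is commonly written as a mixed-integer program in which each indicator $z_i$ switches the continuous variable $x_i$ on or off (a standard formulation; the paper's exact objective and constraints may differ in details):
\[
\min_{x \in \mathbb{R}^n,\; z \in \{0,1\}^n} \; x^{\top} Q x + c^{\top} x + d^{\top} z
\quad \text{s.t.} \quad x_i (1 - z_i) = 0, \quad i = 1, \dots, n.
\]
The support graph has a vertex per variable and an edge $\{i,j\}$ whenever $Q_{ij} \neq 0$; when this graph is a path, $Q$ is tridiagonal after reordering, which is what makes a sequential, dynamic-programming-style solution in polynomial time plausible.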

Implicit Regularization of Sub-Gradient Method in Robust Matrix Recovery: Don’t be Afraid of Outliers

It is well-known that simple short-sighted algorithms, such as gradient descent, generalize well in over-parameterized learning tasks due to their implicit regularization. However, it is unknown whether the implicit regularization of these algorithms extends to robust learning tasks, where a subset of the samples may be grossly corrupted with noise. In this work, …
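One common formalization of the robust task in question (assumed here for illustration; the notation is not necessarily the paper's): recover a low-rank PSD matrix $M^{\star}$ from corrupted linear measurements by running the sub-gradient method on
\[
\min_{X \in \mathbb{R}^{n \times r}} \; \frac{1}{m} \sum_{i=1}^{m} \bigl| \langle A_i, X X^{\top} \rangle - y_i \bigr|,
\qquad y_i = \langle A_i, M^{\star} \rangle + s_i,
\]
where $s_i \neq 0$ only on the corrupted subset and $r$ may exceed $\mathrm{rank}(M^{\star})$. Implicit regularization then means that, without any explicit rank or norm penalty, the iterates converge to a solution of (near-)minimal rank.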

Scalable Inference of Sparsely-changing Markov Random Fields with Strong Statistical Guarantees

In this paper, we study the problem of inferring time-varying Markov random fields (MRFs), where the underlying graphical model is both sparse and changes sparsely over time. Most existing methods for the inference of time-varying MRFs rely on regularized maximum likelihood estimation (MLE), which typically suffers from weak statistical guarantees and high …
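In the Gaussian case, the regularized-MLE baseline referred to here is typically a fused graphical-lasso program over precision matrices $\Theta_1, \dots, \Theta_T$ (written from general knowledge of that literature, not from this paper):
\[
\min_{\Theta_t \succ 0} \; \sum_{t=1}^{T} \Bigl( \langle \widehat{\Sigma}_t, \Theta_t \rangle - \log\det \Theta_t \Bigr)
+ \lambda_1 \sum_{t=1}^{T} \|\Theta_t\|_{1,\mathrm{off}}
+ \lambda_2 \sum_{t=2}^{T} \|\Theta_t - \Theta_{t-1}\|_1,
\]
where $\widehat{\Sigma}_t$ is the sample covariance at time $t$, the first penalty promotes sparse graphs, and the second promotes sparse changes between consecutive graphs.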