Block Coordinate Descent Almost Surely Converges to a Stationary Point Satisfying the Second-order Necessary Condition

Given a nonconvex, twice continuously differentiable cost function with Lipschitz continuous gradient, we prove that block coordinate gradient descent, block mirror descent, and proximal block coordinate descent all converge, almost surely under random initialization, to stationary points satisfying the second-order necessary condition. All our results rest on the center-stable manifold theorem … Read more
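To make the setting concrete, here is a minimal sketch (ours, not the paper's analysis) of cyclic block coordinate gradient descent with scalar blocks, applied to a smooth nonconvex cost whose origin is a strict saddle; the cost function, step size, and iteration budget are illustrative assumptions.

```python
import numpy as np

def block_coordinate_gradient_descent(partial_grads, x0, step, n_iters=2000):
    """Cyclic block coordinate gradient descent with scalar blocks:
    partial_grads[i](x) returns the partial gradient of the cost w.r.t. x[i]."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        for i, g_i in enumerate(partial_grads):
            x[i] -= step * g_i(x)  # update block i with the other blocks fixed
    return x

# f(x, y) = (x*y - 1)^2 has a strict saddle at the origin; under random
# initialization the iterates almost surely avoid it and reach the minima x*y = 1.
rng = np.random.default_rng(0)
grads = [lambda z: 2.0 * (z[0] * z[1] - 1.0) * z[1],   # df/dx
         lambda z: 2.0 * (z[0] * z[1] - 1.0) * z[0]]   # df/dy
x_star = block_coordinate_gradient_descent(grads, rng.standard_normal(2), step=0.05)
print(x_star, x_star[0] * x_star[1])   # the product should be close to 1
```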

Run-and-Inspect Method for Nonconvex Optimization and Global Optimality Bounds for R-Local Minimizers

Many optimization algorithms converge to stationary points. When the underlying problem is nonconvex, they may get trapped at local minimizers and occasionally stagnate near saddle points. We propose the Run-and-Inspect Method, which adds an “inspect” phase to existing algorithms to help them escape from non-global stationary points. The inspection samples a set of points in a … Read more
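A minimal sketch of the idea as we read it (the uniform sampling scheme and all constants are our assumptions, not the authors' exact design): alternate a gradient-descent "run" with an "inspect" that probes a ball around the current point and restarts the run from any strictly better sample.

```python
import numpy as np

def run_and_inspect(f, grad, x0, radius=1.0, n_samples=100, step=1e-2,
                    run_iters=500, max_rounds=20, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_rounds):
        for _ in range(run_iters):          # "run": descend toward a stationary point
            x -= step * grad(x)
        samples = x + radius * rng.uniform(-1.0, 1.0, (n_samples, x.size))
        best = min(samples, key=f)          # "inspect": probe a ball around x
        if f(best) < f(x) - 1e-12:
            x = best                        # escape: restart the run from the better point
        else:
            return x                        # no better point found: accept x
    return x
```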

Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations

Machine learning (ML) problems are often posed as highly nonlinear and nonconvex unconstrained optimization problems. Methods based on stochastic gradient descent scale easily to very large problems but may require fine-tuning many hyper-parameters. Quasi-Newton approaches based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) update typically do not require manual tuning of hyper-parameters but … Read more
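For contrast with the SGD side of this trade-off, here is the off-the-shelf positive-definite L-BFGS baseline (not the paper's indefinite-Hessian trust-region method; the data are synthetic): a logistic-regression fit that needs no learning-rate tuning.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))                       # synthetic features
y = np.sign(A @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200))

def loss(w):                                             # mean logistic loss
    return np.mean(np.logaddexp(0.0, -y * (A @ w)))

def grad(w):
    m = y * (A @ w)
    return -(A.T @ (y / (1.0 + np.exp(m)))) / len(y)

res = minimize(loss, np.zeros(10), jac=grad, method="L-BFGS-B")
print(res.fun, res.nit)                                  # no step size was tuned
```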

Globally Solving the Trust Region Subproblem Using Simple First-Order Methods

We consider the trust region subproblem: the minimization of a quadratic, not necessarily convex, function over the Euclidean ball. Based on the well-known second-order necessary and sufficient optimality conditions for this problem, we present two sufficient optimality conditions defined solely in terms of the primal variables. Each of these conditions corresponds … Read more
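The flavor of simple first-order method at stake can be sketched as projected gradient on the ball (a generic baseline under assumed parameters, not the paper's specific schemes or their optimality certificates):

```python
import numpy as np

def projected_gradient_trs(A, b, r, n_iters=5000, seed=0):
    """min 0.5 x'Ax + b'x over ||x|| <= r, with A symmetric, possibly indefinite."""
    step = 1.0 / np.max(np.abs(np.linalg.eigvalsh(A)))   # 1 / Lipschitz constant
    x = np.random.default_rng(seed).standard_normal(len(b))
    x *= r / np.linalg.norm(x)                           # random start on the sphere
    for _ in range(n_iters):
        x = x - step * (A @ x + b)                       # gradient step
        nrm = np.linalg.norm(x)
        if nrm > r:
            x *= r / nrm                                 # project back onto the ball
    return x
```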

An Inexact Regularized Newton Framework with a Worst-Case Iteration Complexity of $\mathcal{O}(\epsilon^{-3/2})$ for Nonconvex Optimization

An algorithm for solving smooth nonconvex optimization problems is proposed that, in the worst case, takes $\mathcal{O}(\epsilon^{-3/2})$ iterations to drive the norm of the gradient of the objective function below a prescribed positive real number $\epsilon$ and can take $\mathcal{O}(\epsilon^{-3})$ iterations to drive the leftmost eigenvalue of the Hessian of the objective above $-\epsilon$. The proposed … Read more
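Schematically, the subproblem at the heart of such frameworks is a shifted Newton system; the sketch below solves it exactly with a hypothetical shift that scales like $\|g\|^{1/2}$ (the scaling typically behind $\mathcal{O}(\epsilon^{-3/2})$ rates), whereas the proposed framework allows inexact solves and adapts the regularization.

```python
import numpy as np

def regularized_newton_step(g, H):
    """One exact regularized Newton step: solve (H + lam*I) d = -g, with lam
    chosen to make the shifted Hessian positive definite (illustrative choice)."""
    lam = max(0.0, -np.linalg.eigvalsh(H)[0]) + np.sqrt(np.linalg.norm(g))
    return np.linalg.solve(H + lam * np.eye(len(g)), -g)
```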

Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization

We establish or refute the optimality of inexact second-order methods for unconstrained nonconvex optimization from the point of view of worst-case evaluation complexity, improving and generalizing the results of Cartis, Gould and Toint (2010, 2011). To this end, we consider a new general class of inexact second-order algorithms for unconstrained optimization that includes regularization and trust-region … Read more

Behavior of accelerated gradient methods near critical points of nonconvex functions

We examine the behavior of accelerated gradient methods in smooth nonconvex unconstrained optimization, focusing in particular on their behavior near strict saddle points. Accelerated methods are iterative methods whose step is typically a linear combination of the previous step and the gradient of the function evaluated at a point at or … Read more
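The step structure in question can be written down directly; a minimal constant-momentum sketch (a simplification of the accelerated schemes analyzed, with illustrative parameter names):

```python
import numpy as np

def accelerated_gradient(grad, x0, step, momentum, n_iters=1000):
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)                    # the previous step
    for _ in range(n_iters):
        y = x + momentum * v                # extrapolated ("look-ahead") point
        v = momentum * v - step * grad(y)   # new step: old step plus a gradient term
        x = x + v
    return x
```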

Iteration-complexity of a Jacobi-type non-Euclidean ADMM for multi-block linearly constrained nonconvex programs

This paper establishes the iteration-complexity of a Jacobi-type non-Euclidean proximal alternating direction method of multipliers (ADMM) for solving multi-block linearly constrained nonconvex programs. The subproblems of this ADMM variant can be solved in parallel, and hence the method has great potential for solving such programs at large scale. Moreover, our analysis allows the … Read more
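To fix ideas, here is a hypothetical linearized Jacobi-type proximal ADMM sketch for $\min \sum_i f_i(x_i)$ subject to $\sum_i A_i x_i = b$ (Euclidean, unlike the paper's non-Euclidean setting, and with an assumed prox interface): every block update reads only the previous iterate, so the blocks can be processed in parallel.

```python
import numpy as np

def jacobi_proximal_admm(proxs, As, b, rho=1.0, tau=2.0, n_iters=200):
    """proxs[i](v, t) is assumed to return argmin_x f_i(x) + ||x - v||^2 / (2t)."""
    xs = [np.zeros(A.shape[1]) for A in As]
    lam = np.zeros(len(b))                                # constraint multiplier
    for _ in range(n_iters):
        r = sum(A @ x for A, x in zip(As, xs)) - b        # residual at old iterate
        new_xs = []
        for A, x, prox in zip(As, xs, proxs):
            g = A.T @ (lam + rho * r)                     # linearized coupling term
            t = 1.0 / (tau * rho * np.linalg.norm(A, 2) ** 2)
            new_xs.append(prox(x - t * g, t))             # proximal gradient step
        xs = new_xs                                       # Jacobi (parallel) update
        lam += rho * (sum(A @ x for A, x in zip(As, xs)) - b)
    return xs
```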

Iteration-Complexity of a Linearized Proximal Multiblock ADMM Class for Linearly Constrained Nonconvex Optimization Problems

This paper analyzes the iteration-complexity of a class of linearized proximal multiblock alternating direction methods of multipliers (ADMM) for solving linearly constrained nonconvex optimization problems. The subproblems of the linearized ADMM are obtained by partially or fully linearizing the augmented Lagrangian with respect to the corresponding minimizing block variable. The derived complexity bounds do not … Read more

A Novel Approach for Solving Convex Problems with Cardinality Constraints

In this paper we consider the problem of minimizing a convex differentiable function subject to sparsity constraints. Such constraints are nonconvex, and the resulting optimization problem is known to be hard to solve. We propose a novel generalization of this problem and demonstrate that it is equivalent to the original sparsity-constrained problem if a certain … Read more
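For context, the standard baseline for this problem class is iterative hard thresholding (not the paper's generalization): projected gradient, where the Euclidean projection onto the sparsity constraint keeps the $s$ largest-magnitude entries.

```python
import numpy as np

def iterative_hard_thresholding(grad, x0, s, step, n_iters=500):
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        x -= step * grad(x)                  # gradient step on the smooth objective
        keep = np.argsort(np.abs(x))[-s:]    # indices of the s largest entries
        mask = np.zeros(x.size, dtype=bool)
        mask[keep] = True
        x[~mask] = 0.0                       # project onto ||x||_0 <= s
    return x
```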