Consistency Bounds and Support Recovery of D-stationary Solutions of Sparse Sample Average Approximations

This paper studies properties of the d(irectional)-stationary solutions of sparse sample average approximation (SAA) problems involving difference-of-convex (dc) sparsity functions under a deterministic setting. Such properties are investigated with respect to a vector which satisfies a verifiable assumption to relate the empirical SAA problem to the expectation minimization problem defined by an underlying data distribution. … Read more

A Single Time-Scale Stochastic Approximation Method for Nested Stochastic Optimization

We study constrained nested stochastic optimization problems in which the objective function is a composition of two smooth functions whose exact values and derivatives are not available. We propose a single time-scale stochastic approximation algorithm, which we call the Nested Averaged Stochastic Approximation (NASA), to find an approximate stationary point of the problem. The algorithm … Read more

An efficient adaptive accelerated inexact proximal point method for solving linearly constrained nonconvex composite problems

This paper proposes an efficient adaptive variant of a quadratic penalty accelerated inexact proximal point (QP-AIPP) method proposed earlier by the authors. Both the QP-AIPP method and its variant solve linearly constrained nonconvex composite optimization problems using a quadratic penalty approach where the generated penalized subproblems are solved by a variant of the underlying AIPP … Read more

A Doubly Accelerated Inexact Proximal Point Method for Nonconvex Composite Optimization Problems

This paper describes and establishes the iteration-complexity of a doubly accelerated inexact proximal point (D-AIPP) method for solving the nonconvex composite minimization problem whose objective function is of the form f+h where f is a (possibly nonconvex) differentiable function whose gradient is Lipschitz continuous and h is a closed convex function with bounded domain. D-AIPP … Read more

A gradient type algorithm with backward inertial steps for a nonconvex minimization

We investigate an algorithm of gradient type with a backward inertial step in connection with the minimization of a nonconvex differentiable function. We show that the generated sequences converge to a critical point of the objective function, if a regularization of the objective function satis es the Kurdyka-Lojasiewicz property. Further, we provide convergence rates for the … Read more

Over-Parameterized Deep Neural Networks Have No Strict Local Minima For Any Continuous Activations

In this paper, we study the loss surface of the over-parameterized fully connected deep neural networks. We prove that for any continuous activation functions, the loss function has no bad strict local minimum, both in the regular sense and in the sense of sets. This result holds for any convex and differentiable loss function, and … Read more

Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints

We provide sharp worst-case evaluation complexity bounds for nonconvex minimization problems with general inexpensive constraints, i.e.\ problems where the cost of evaluating/enforcing of the (possibly nonconvex or even disconnected) constraints, if any, is negligible compared to that of evaluating the objective function. These bounds unify, extend or improve all known upper and lower complexity bounds … Read more

A Subsampling Line-Search Method with Second-Order Results

In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees obtained through classical globalization techniques in optimization such as a trust … Read more

Global Solutions of Nonconvex Standard Quadratic Programs via Mixed Integer Linear Programming Reformulations

A standard quadratic program is an optimization problem that consists of minimizing a (nonconvex) quadratic form over the unit simplex. We focus on reformulating a standard quadratic program as a mixed integer linear programming problem. We propose two alternative mixed integer linear programming formulations. Our first formulation is based on casting a standard quadratic program … Read more

Second-order Guarantees of Distributed gradient Algorithms

We consider distributed smooth nonconvex unconstrained optimization over networks, modeled as a connected graph. We examine the behavior of distributed gradient-based algorithms near strict saddle points. Specifically, we establish that (i) the renowned Distributed Gradient Descent (DGD) algorithm likely converges to a neighborhood of a Second-order Stationary (SoS) solution; and (ii) the more recent class … Read more