Spurious Local Minima Exist for Almost All Over-parameterized Neural Networks

A popular belief for explaining the efficiency in training deep neural networks is that over-paramenterized neural networks have nice landscape. However, it still remains unclear whether over-parameterized neural networks contain spurious local minima in general, since all current positive results cannot prove non-existence of bad local minima, and all current negative results have strong restrictions … Read more

Stochastic mesh adaptive direct search for blackbox optimization using probabilistic estimates

We present a stochastic extension of the mesh adaptive direct search (MADS) algorithm originally developed for deterministic blackbox optimization. The algorithm, called StoMADS, considers the unconstrained optimization of an objective function f whose values can be computed only through a blackbox corrupted by some random noise following an unknown distribution. The proposed method is based … Read more

Dynamic Optimization with Complementarity Constraints: Smoothing for Direct Shooting

We consider optimization of differential-algebraic equations (DAEs) with complementarity constraints (CCs) of algebraic state pairs. Formulating the CCs as smoothed nonlinear complementarity problem (NCP) functions leads to a smooth DAE, allowing for the solution in direct shooting. We provide sufficient conditions for well-posedness. Thus, we can prove that with the smoothing parameter going to zero, … Read more

Continuous selections of solutions for locally Lipschitzian equations

This paper answers in affirmative the long-standing question of nonlinear analysis, concerning the existence of a continuous single-valued local selection of the right inverse to a locally Lipschitzian mapping. Moreover, we develop a much more general result, providing conditions for the existence of a continuous single-valued selection not only locally, but rather on any given … Read more

Stochastic generalized gradient methods for training nonconvex nonsmooth neural networks

The paper observes a similarity between the stochastic optimal control of discrete dynamical systems and the learning multilayer neural networks. It focuses on contemporary deep networks with nonconvex nonsmooth loss and activation functions. The machine learning problems are treated as nonconvex nonsmooth stochastic optimization problems. As a model of nonsmooth nonconvex dependences, the so-called generalized … Read more

An analysis of the superiorization method via the principle of concentration of measure

The superiorization methodology is intended to work with input data of constrained minimization problems, i.e., a target function and a constraints set. However, it is based on an antipodal way of thinking to the thinking that leads constrained minimization methods. Instead of adapting unconstrained minimization algorithms to handling constraints, it adapts feasibility-seeking algorithms to reduce … Read more

An Infeasible Interior-point Arc-search Algorithm for Nonlinear Constrained Optimization

In this paper, we propose an infeasible arc-search interior-point algorithm for solving nonlinear programming problems. Most algorithms based on interior-point methods are categorized as line search in the sense that they compute a next iterate on a straight line determined by a search direction which approximates the central path. The proposed arc-search interior-point algorithm uses … Read more

Worst-case Complexity Bounds of Directional Direct-search Methods for Multiobjective Optimization

Direct Multisearch is a well-established class of algorithms, suited for multiobjective derivative-free optimization. In this work, we analyze the worst-case complexity of this class of methods in its most general formulation for unconstrained optimization. Considering nonconvex smooth functions, we show that to drive a given criticality measure below a specific positive threshold, Direct Multisearch takes … Read more

Substantiation of the Backpropagation Technique via the Hamilton-Pontryagin Formalism for Training Nonconvex Nonsmooth Neural Networks

The paper observes the similarity between the stochastic optimal control of discrete dynamical systems and the training multilayer neural networks. It focuses on contemporary deep networks with nonconvex nonsmooth loss and activation functions. In the paper, the machine learning problems are treated as nonconvex nonsmooth stochastic optimization problems. As a model of nonsmooth nonconvex dependences, … Read more

On the Convergence to Stationary Points of Deterministic and Randomized Feasible Descent Directions Methods

This paper studies the class of nonsmooth nonconvex problems in which the difference between a continuously differentiable function and a convex nonsmooth function is minimized over linear constraints. Our goal is to attain a point satisfying the stationarity necessary optimality condition, defined as the lack of feasible descent directions. Although elementary in smooth optimization, this … Read more