stochastic generalized gradient – Optimization Online

Stochastic generalized gradient methods for training nonconvex nonsmooth neural networks

Published: 2019/09/29

The paper observes a similarity between the stochastic optimal control of discrete dynamical systems and the training of multilayer neural networks. It focuses on contemporary deep networks with nonconvex nonsmooth loss and activation functions. The machine learning problems are treated as nonconvex nonsmooth stochastic optimization problems. As a model of nonsmooth nonconvex dependences, the so-called generalized …

Substantiation of the Backpropagation Technique via the Hamilton-Pontryagin Formalism for Training Nonconvex Nonsmooth Neural Networks

Published: 2019/09/19

Vladimir I. Norkin

Categories: Convex and Nonsmooth Optimization, Nonlinear Optimization, Stochastic Programming

Tags: deep learning, machine learning, multilayer neural networks, nonsmooth nonconvex optimization, stochastic generalized gradient, stochastic optimization

The paper observes the similarity between the stochastic optimal control of discrete dynamical systems and the training of multilayer neural networks. It focuses on contemporary deep networks with nonconvex nonsmooth loss and activation functions. In the paper, the machine learning problems are treated as nonconvex nonsmooth stochastic optimization problems. As a model of nonsmooth nonconvex dependences, …

Generalized Gradients in Problems of Dynamic Optimization, Optimal Control, and Machine Learning

Published: 2019/09/18, Updated: 2019/10/10

Vladimir I. Norkin

Categories: Nonsmooth Optimization, Stochastic Programming

Tags: deep learning, dynamic programming, machine learning, multilayer neural networks, nonconvex nonsmooth optimization, optimal control, stochastic generalized gradient, stochastic optimization

In this work, nonconvex nonsmooth problems of dynamic optimization, optimal control in discrete time (including feedback control), and machine learning are considered from a common point of view. An analogy is observed between the tasks of controlling discrete dynamic systems and training multilayer neural networks with a nonsmooth objective function and nonsmooth connections. Methods for calculating generalized gradients …