Stochastic Aspects of Dynamical Low-Rank Approximation in the Context of Machine Learning

The central challenges of today’s neural network architectures are the prohibitive memory footprint and the training costs associated with determining optimal weights and biases. A large portion of research in machine learning is therefore dedicated to constructing memory-efficient training methods. One promising approach is dynamical low-rank training (DLRT), which represents and trains parameters as a …
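As a rough illustration of the low-rank idea (a sketch, not the authors' algorithm), the snippet below stores a weight matrix in factored form W = U S Vᵀ and updates only the small coefficient matrix S with the bases held fixed; the sizes, rank, and learning rate are arbitrary assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical sizes: a 256 x 256 weight held at rank r = 8 stores
    # 2*256*8 + 8*8 numbers instead of 256*256.
    n, m, r = 256, 256, 8
    U = np.linalg.qr(rng.standard_normal((n, r)))[0]  # orthonormal column basis
    V = np.linalg.qr(rng.standard_normal((m, r)))[0]  # orthonormal row basis
    S = np.diag(rng.standard_normal(r))               # small coefficient matrix

    def forward(x):
        # Apply W x with W = U S V^T, never forming W explicitly.
        return U @ (S @ (V.T @ x))

    # One factored gradient step: if G were the loss gradient w.r.t. W,
    # updating S with fixed bases only needs the r x r matrix U^T G V.
    G = rng.standard_normal((n, m))  # stand-in for a backpropagated gradient
    S -= 0.01 * (U.T @ G @ V)
    print(forward(rng.standard_normal(m)).shape)  # (256,)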

Using Neural Networks to Guide Data-Driven Operational Decisions

We propose to use deep neural networks to solve data-driven stochastic optimization problems. Given historical data on the observed covariates, the decisions taken, and the realized costs in past periods, we train a neural network to predict the objective value as a function of the decision and the covariate. Once trained, for a given covariate, …
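A minimal sketch of this predict-then-optimize pipeline, under invented data and with scikit-learn's MLPRegressor as a stand-in for the deep network: fit a cost predictor on (covariate, decision) pairs, then minimize the predicted cost over a grid of candidate decisions for a new covariate.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    # Hypothetical historical data: covariate z, decision d, realized cost c.
    N = 2000
    z = rng.uniform(0, 1, size=(N, 1))
    d = rng.uniform(0, 10, size=(N, 1))
    c = (d - 5 * z) ** 2 + 0.1 * rng.standard_normal((N, 1))  # unknown cost model

    # Train a network to predict cost from (covariate, decision).
    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0)
    net.fit(np.hstack([z, d]), c.ravel())

    # For a new covariate, pick the decision minimizing the predicted cost
    # (grid enumeration as a stand-in for a proper optimizer).
    z_new = 0.7
    grid = np.linspace(0, 10, 401).reshape(-1, 1)
    preds = net.predict(np.hstack([np.full_like(grid, z_new), grid]))
    print("chosen decision:", grid[np.argmin(preds), 0])  # near 5 * 0.7 = 3.5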

A Fully Stochastic Second-Order Trust Region Method

A stochastic second-order trust region method is proposed, which can be viewed as a second-order extension of the trust-region-ish (TRish) algorithm proposed by Curtis et al. [INFORMS J. Optim. 1(3):200–220, 2019]. In each iteration, a search direction is computed by (approximately) solving a trust region subproblem defined by stochastic gradient and Hessian estimates. The …
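The toy loop below illustrates the general pattern (not the paper's method): minibatch gradient and Hessian estimates define a quadratic model, and the subproblem is solved approximately by taking the Newton step when it fits inside the region and a Cauchy-point step otherwise. The problem and all constants are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy finite-sum problem: f(w) = mean_i (a_i^T w - b_i)^2.
    A = rng.standard_normal((1000, 10))
    w_true = rng.standard_normal(10)
    b = A @ w_true + 0.1 * rng.standard_normal(1000)

    def sampled_grad_hess(w, batch):
        Ab, r = A[batch], A[batch] @ w - b[batch]
        return 2 * Ab.T @ r / len(batch), 2 * Ab.T @ Ab / len(batch)

    w, delta = np.zeros(10), 1.0
    for k in range(200):
        batch = rng.choice(1000, size=32, replace=False)
        g, H = sampled_grad_hess(w, batch)
        # Approximate subproblem solve: try the Newton step, fall back to the
        # Cauchy point along -g if the step leaves the region or H is singular.
        try:
            s = -np.linalg.solve(H, g)
        except np.linalg.LinAlgError:
            s = -g
        if np.linalg.norm(s) > delta:
            gHg = g @ H @ g
            t = (g @ g) / gHg if gHg > 0 else 1.0
            s = -min(t, delta / np.linalg.norm(g)) * g  # Cauchy point
        w = w + s  # (a full method would also update delta via a ratio test)
    print("final error:", np.linalg.norm(w - w_true))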

HyperNOMAD: Hyperparameter optimization of deep neural networks using mesh adaptive direct search

The performance of deep neural networks is highly sensitive to the choice of the hyperparameters that define the structure of the network and the learning process. When facing a new application, tuning a deep neural network is a tedious and time-consuming process that is often described as a “dark art”. This explains the necessity …
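HyperNOMAD itself wraps the NOMAD implementation of mesh adaptive direct search; as a toy stand-in for that machinery, the pattern search below polls coordinate directions on an expanding/shrinking mesh around the incumbent, with an invented two-dimensional validation-loss surface in place of a real training run.

    import numpy as np

    rng = np.random.default_rng(0)

    # Invented stand-in for a validation-loss oracle over two hyperparameters
    # (say log10 learning rate and log2 width); in HyperNOMAD each evaluation
    # would be a full network training run.
    def val_loss(x):
        lr, width = x
        return (lr + 3.0) ** 2 + 0.5 * (width - 7.0) ** 2 \
            + 0.01 * rng.standard_normal()

    x = np.array([-1.0, 4.0])  # incumbent hyperparameters
    f = val_loss(x)
    step = 1.0                 # mesh size
    for it in range(60):
        # Poll the 2d coordinate directions on the current mesh.
        polls = [x + step * d for d in
                 (np.array([1, 0]), np.array([-1, 0]),
                  np.array([0, 1]), np.array([0, -1]))]
        vals = [val_loss(p) for p in polls]
        i = int(np.argmin(vals))
        if vals[i] < f:        # success: move the incumbent and expand the mesh
            x, f = polls[i], vals[i]
            step *= 2.0
        else:                  # failure: refine the mesh
            step *= 0.5
    print("best hyperparameters:", x, "loss:", f)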

A Stochastic Trust Region Algorithm Based on Careful Step Normalization

An algorithm is proposed for solving stochastic and finite-sum minimization problems. Based on a trust region methodology, the algorithm employs normalized steps, at least as long as the norms of the stochastic gradient estimates are within a specified interval. The complete algorithm, which dynamically chooses whether or not to employ normalized steps, is proved to have …
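A sketch of the three-regime normalized-step rule described above, applied to an invented quadratic with noisy gradients; the constants gamma1, gamma2 and the diminishing stepsize are illustrative assumptions, not the paper's tuned values.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy problem: noisy gradients of f(w) = 0.5 * ||w - w_true||^2.
    w_true = rng.standard_normal(20)
    def stoch_grad(w):
        return (w - w_true) + 0.3 * rng.standard_normal(20)

    gamma1, gamma2 = 4.0, 0.5  # illustrative constants, 1/gamma1 < 1/gamma2
    w = np.zeros(20)
    for k in range(300):
        alpha = 0.5 / (1 + k)  # diminishing stepsize
        g = stoch_grad(w)
        ng = np.linalg.norm(g)
        if ng < 1.0 / gamma1:      # small gradient: conventional scaled step
            w -= alpha * gamma1 * g
        elif ng <= 1.0 / gamma2:   # moderate gradient: normalized step
            w -= alpha * g / ng
        else:                      # large gradient: damped step
            w -= alpha * gamma2 * g
    print("distance to minimizer:", np.linalg.norm(w - w_true))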

Exploiting Negative Curvature in Deterministic and Stochastic Optimization

This paper addresses the question of whether it can be beneficial for an optimization algorithm to follow directions of negative curvature. Although some prior work has established convergence results for algorithms that integrate both descent and negative curvature directions, there has not yet been numerical evidence showing that such methods offer significant performance improvements. In …
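As a toy illustration of why negative curvature can help (a sketch, not the paper's algorithm), the loop below follows the gradient on a saddle-shaped function and, whenever the gradient is nearly zero, steps along the Hessian's most negative eigenvector to escape the saddle; the function and thresholds are invented.

    import numpy as np

    # Nonconvex toy: f(x, y) = x^2 - y^2 has a saddle at the origin, where
    # the gradient vanishes but curvature is negative along the y-axis.
    def grad(w):
        return np.array([2 * w[0], -2 * w[1]])
    def hess(w):
        return np.array([[2.0, 0.0], [0.0, -2.0]])

    w = np.array([1.0, 1e-8])  # start almost on the saddle's stable manifold
    for k in range(25):
        g, H = grad(w), hess(w)
        lam, V = np.linalg.eigh(H)
        if lam[0] < -1e-8 and np.linalg.norm(g) < 0.1:
            d = V[:, 0]                 # most negative curvature direction
            d = -d if d @ g > 0 else d  # orient it downhill
            w += 0.5 * d
        else:
            w -= 0.1 * g                # plain descent step
    print("final point:", w)  # has escaped the saddle along the y-axis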