An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias

Structured non-convex learning problems, for which critical points have favorable statistical properties, arise frequently in statistical machine learning. Algorithmic convergence and statistical estimation rates are well understood for such problems. However, quantifying the uncertainty associated with the underlying training algorithm is not well studied in the non-convex setting. In order to address this shortcoming, in this work, …
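As a rough illustration only, the sketch below implements the constant step-size SGD iteration referred to in the title; the oracle grad_sample, the toy objective, and all parameter values are assumptions made for the example, and the paper's uncertainty-quantification analysis is not reproduced.

```python
import numpy as np

def constant_step_sgd(grad_sample, x0, step=0.01, iters=10000, seed=0):
    """Run SGD with a fixed step size and return the full iterate trajectory.

    grad_sample(x, rng) is a placeholder oracle returning an unbiased
    stochastic gradient of the objective at x.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(iters):
        g = grad_sample(x, rng)
        x = x - step * g              # constant step size: no decay schedule
        trajectory.append(x.copy())
    return np.asarray(trajectory)

# Toy non-convex objective f(x) = (||x||^2 - 1)^2 / 2 with additive gradient noise.
grad = lambda x, rng: 2.0 * x * (x @ x - 1.0) + 0.1 * rng.standard_normal(x.shape)
traj = constant_step_sgd(grad, x0=[2.0, -1.5])
print(traj[-1], traj[-1000:].mean(axis=0))   # last iterate vs. tail average
```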

A Framework for Adaptive Open-pit Mining Planning under Geological Uncertainty

Mine planning optimization aims at maximizing the profit obtained from extracting valuable ore. Beyond its theoretical complexity (the open-pit mining problem with capacity constraints reduces to a knapsack problem with precedence constraints, which is NP-hard), practical instances of the problem usually involve a large to very large number of decision variables, typically of the order …
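For concreteness, the precedence-constrained knapsack structure mentioned above can be written, in illustrative notation that is not the paper's, with binary extraction variables x_b, block values v_b, tonnages w_b, capacity C, and precedence pairs (b, b') requiring block b' to be removed before block b:

```latex
\max_{x \in \{0,1\}^{B}} \; \sum_{b=1}^{B} v_b \, x_b
\qquad \text{s.t.} \qquad
\sum_{b=1}^{B} w_b \, x_b \le C,
\qquad
x_b \le x_{b'} \quad \forall\, (b, b') \in \mathcal{P}
```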

Expected complexity analysis of stochastic direct-search

This work presents the convergence rate analysis of stochastic variants of the broad class of direct-search methods of directional type. It introduces an algorithm designed to optimize differentiable objective functions $f$ whose values can only be computed through a stochastically noisy blackbox. The proposed stochastic directional direct-search (SDDS) algorithm accepts new iterates by imposing a …
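A minimal sketch of a directional direct-search loop on a noisy blackbox is given below; the positive spanning set, the forcing term c*delta**2, and the step-size update are generic choices made for illustration, since the exact SDDS acceptance condition is truncated in the abstract.

```python
import numpy as np

def directional_direct_search(f_noisy, x0, delta=1.0, iters=200, c=1e-4):
    """Generic directional direct search driven by noisy blackbox values.

    A trial point x + delta*d is accepted when its (noisy) value improves on
    the incumbent by at least the forcing term c*delta**2; this is not the
    SDDS-specific acceptance rule, only a stand-in for it.
    """
    x = np.asarray(x0, dtype=float)
    fx = f_noisy(x)
    n = x.size
    directions = np.vstack([np.eye(n), -np.eye(n)])   # positive spanning set
    for _ in range(iters):
        success = False
        for d in directions:
            trial = x + delta * d
            ft = f_noisy(trial)
            if ft < fx - c * delta**2:                 # sufficient decrease
                x, fx, success = trial, ft, True
                break
        delta = 2.0 * delta if success else 0.5 * delta
    return x

f = lambda x: (x[0] - 1.0)**2 + 2.0 * (x[1] + 0.5)**2 + 0.01 * np.random.randn()
print(directional_direct_search(f, np.zeros(2)))
```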

Zero Order Stochastic Weakly Convex Composite Optimization

In this paper we consider stochastic weakly convex composite problems, but without assuming access to a stochastic subgradient oracle. We present a derivative-free algorithm that uses a two-point approximation to compute a gradient estimate of the smoothed function. We prove convergence at a rate similar to that of state-of-the-art methods, however with …
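The two-point gradient estimator mentioned above can be sketched as follows; the function names and the toy nonsmooth objective are assumptions made for the example, while the estimator itself is the standard Gaussian-smoothing construction.

```python
import numpy as np

def two_point_gradient(F, x, xi, mu=1e-2, rng=None):
    """Two-point estimate of the gradient of the Gaussian-smoothed function.

    F(x, xi) is a noisy function value at x for sample xi; the same sample xi
    is used at both points.  With u ~ N(0, I), the estimator
        g = (F(x + mu*u, xi) - F(x, xi)) / mu * u
    is, in expectation over u, the gradient of the smoothed function
    f_mu(x) = E_u[f(x + mu*u)].
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.shape)
    return (F(x + mu * u, xi) - F(x, xi)) / mu * u

# Toy usage on a nonsmooth objective |a'x - b| with additive sample noise.
a, b = np.array([1.0, -2.0]), 0.5
F = lambda x, xi: abs(a @ x - b) + 0.01 * xi
rng = np.random.default_rng(0)
x = np.zeros(2)
for _ in range(500):
    xi = rng.standard_normal()
    x -= 0.05 * two_point_gradient(F, x, xi, rng=rng)
print(x)
```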

A Fully Stochastic Second-Order Trust Region Method

A stochastic second-order trust region method is proposed, which can be viewed as a second-order extension of the trust-region-ish (TRish) algorithm proposed by Curtis et al. [INFORMS J. Optim. 1(3) 200–220, 2019]. In each iteration, a search direction is computed by (approximately) solving a trust region subproblem defined by stochastic gradient and Hessian estimates. The …
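Below is a minimal sketch of one trust-region step built from stochastic gradient and Hessian estimates; the Cauchy-point subproblem solve and the example data are generic illustrations and do not reproduce the TRish-specific step normalization or acceptance rules.

```python
import numpy as np

def stochastic_tr_step(g, H, radius):
    """Approximately solve min_s g's + 0.5 s'Hs  s.t. ||s|| <= radius
    using the Cauchy point (steepest descent within the trust region)."""
    gHg = g @ H @ g
    gnorm = np.linalg.norm(g)
    if gHg <= 0:
        tau = 1.0                                  # negative curvature: step to the boundary
    else:
        tau = min(gnorm**3 / (radius * gHg), 1.0)
    return -tau * (radius / gnorm) * g

# One illustrative iteration with stochastic gradient/Hessian estimates.
rng = np.random.default_rng(0)
x = np.array([1.0, -2.0])
g = 2 * x + 0.1 * rng.standard_normal(2)           # noisy gradient of ||x||^2
H = 2 * np.eye(2) + 0.1 * rng.standard_normal((2, 2))
H = 0.5 * (H + H.T)                                # symmetrize the Hessian estimate
s = stochastic_tr_step(g, H, radius=0.5)
print(s, x + s)
```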

Adaptive Sampling Quasi-Newton Methods for Derivative-Free Stochastic Optimization

We consider stochastic zero-order optimization problems, which arise in settings from simulation optimization to reinforcement learning. We propose an adaptive sampling quasi-Newton method where we estimate the gradients of a stochastic function using finite differences within a common random number framework. We employ modified versions of a norm test and an inner product quasi-Newton test …
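A sketch of a finite-difference gradient estimate under common random numbers, the construction referenced above; the helper name and toy objective are assumptions made for the example.

```python
import numpy as np

def crn_fd_gradient(F, x, sample, h=1e-4):
    """Forward-difference gradient estimate using common random numbers.

    F(x, sample) is a stochastic function evaluation; the SAME random sample
    is reused at x and at each perturbed point x + h*e_i, so the noise largely
    cancels in the difference (the CRN idea referenced in the abstract).
    """
    f0 = F(x, sample)
    g = np.empty_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (F(x + e, sample) - f0) / h
    return g

# Toy usage: the noise enters through `sample`, identical across evaluations.
F = lambda x, s: (x[0] - 1)**2 + (x[1] + 2)**2 + 0.05 * s
rng = np.random.default_rng(1)
print(crn_fd_gradient(F, np.zeros(2), sample=rng.standard_normal()))
```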

Admissibility of solution estimators for stochastic optimization

We look at stochastic optimization problems through the lens of statistical decision theory. In particular, we address admissibility, in the statistical decision theory sense, of the natural sample average estimator for a stochastic optimization problem (which is also known as the empirical risk minimization (ERM) rule in the learning literature). It is well known that for …
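For reference, the sample average (ERM) estimator and the admissibility notion being invoked can be written, in generic notation that may differ from the paper's, as:

```latex
\hat{x}_n \in \arg\min_{x \in X} \frac{1}{n} \sum_{i=1}^{n} F(x, \xi_i)
\quad \text{(sample average / ERM rule)};
\qquad
\hat{x}_n \text{ is admissible if there is no estimator } \tilde{x}_n \text{ with }
R(\theta, \tilde{x}_n) \le R(\theta, \hat{x}_n) \;\; \forall \theta
\text{ and strict inequality for some } \theta .
```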

Stochastic mesh adaptive direct search for blackbox optimization using probabilistic estimates

We present a stochastic extension of the mesh adaptive direct search (MADS) algorithm originally developed for deterministic blackbox optimization. The algorithm, called StoMADS, considers the unconstrained optimization of an objective function f whose values can be computed only through a blackbox corrupted by some random noise following an unknown distribution. The proposed method is based …
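One simple way to realize the probabilistic estimates such a method works with is to average repeated blackbox evaluations; the sketch below (with an assumed variance bound and accuracy/confidence parameters) uses Chebyshev's inequality to size the sample and does not reproduce the StoMADS algorithm itself.

```python
import numpy as np

def probabilistic_estimate(f_noisy, x, eps, var_bound, delta=0.05, rng=None):
    """Monte Carlo estimate of f(x) from repeated noisy blackbox calls.

    By Chebyshev's inequality, averaging n >= var_bound / (delta * eps**2)
    evaluations makes the estimate eps-accurate with probability >= 1 - delta,
    one elementary way to obtain estimates of the kind StoMADS relies on.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = int(np.ceil(var_bound / (delta * eps**2)))
    return np.mean([f_noisy(x, rng) for _ in range(n)])

f = lambda x, rng: np.sum(x**2) + 0.1 * rng.standard_normal()   # noise variance 0.01
print(probabilistic_estimate(f, np.array([1.0, 2.0]), eps=0.05, var_bound=0.01))
```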

Stochastic generalized gradient methods for training nonconvex nonsmooth neural networks

The paper observes a similarity between the stochastic optimal control of discrete dynamical systems and the training of multilayer neural networks. It focuses on contemporary deep networks with nonconvex nonsmooth loss and activation functions. The machine learning problems are treated as nonconvex nonsmooth stochastic optimization problems. As a model of nonsmooth nonconvex dependencies, the so-called generalized …
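As a toy illustration of a stochastic generalized-gradient (subgradient) step on a nonsmooth model, the sketch below uses a one-layer ReLU network with an absolute-error loss; the model, loss, and learning rate are assumptions made for the example, not the paper's setup.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def subgrad_step(W, x, y, lr=0.1):
    """One stochastic generalized-gradient (subgradient) step for a one-layer
    ReLU model with absolute-error loss sum_j |relu(W @ x)_j - y_j|.

    Both relu and |.| are nonsmooth; a valid element of the generalized
    gradient is selected at the kinks (0 there), the usual convention in practice."""
    z = W @ x
    r = relu(z) - y
    dloss_da = np.sign(r)                  # subgradient of |.| (0 at the kink)
    da_dz = (z > 0).astype(float)          # subgradient of relu (0 at the kink)
    g = np.outer(dloss_da * da_dz, x)      # chain rule with generalized gradients
    return W - lr * g

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 3))
x, y = rng.standard_normal(3), np.array([1.0, 0.0])
print(subgrad_step(W, x, y))
```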

Substantiation of the Backpropagation Technique via the Hamilton-Pontryagin Formalism for Training Nonconvex Nonsmooth Neural Networks

The paper observes the similarity between the stochastic optimal control of discrete dynamical systems and the training of multilayer neural networks. It focuses on contemporary deep networks with nonconvex nonsmooth loss and activation functions. In the paper, the machine learning problems are treated as nonconvex nonsmooth stochastic optimization problems. As a model of nonsmooth nonconvex dependencies, …