stochastic gradient descent – Optimization Online

Stochastic Aspects of Dynamical Low-Rank Approximation in the Context of Machine Learning

Published: 2024/03/23, Updated: 2024/05/15

Data Science Theory, Nonlinear Optimization, Optimization in Data Science deep neural networks, Dynamical Low-Rank Approximation (DLRA), Dynamical Low-Rank Training (DLRT), machine learning, stochastic gradient descent

The central challenges of today’s neural network architectures are the prohibitive memory footprint and the training costs associated with determining optimal weights and biases. A large portion of research in machine learning is therefore dedicated to constructing memory-efficient training methods. One promising approach is dynamical low-rank training (DLRT), which represents and trains parameters as a … Read more

Stability of Markovian Stochastic Programming

Published: 2023/07/31

David Wozabal

Stochastic Programming Fortet-Mourier distance, stochastic dual dynamic programming, stochastic gradient descent, wasserstein distance

Multi-stage stochastic programming is notoriously hard, since solution methods suffer from the curse of dimensionality. Recently, stochastic dual dynamic programming has shown promising results for Markovian problems with many stages and a moderately large state space. In order to numerically solve these problems, simple discrete representations of Markov processes are required, but a convincing theoretical … Read more

Asynchronous Iterations in Optimization: New Sequence Results and Sharper Algorithmic Guarantees

Published: 2023/05/18

Hamid Reza Feyzmahdavian

Mikael Johansson

Convex Optimization, Global Optimization, Parallel Algorithms asynchronous algorithms, coordinate descent, parallel methods, stochastic gradient descent, stochastic optimization

We introduce novel convergence results for asynchronous iterations that appear in the analysis of parallel and distributed optimization algorithms. The results are simple to apply and give explicit estimates for how the degree of asynchrony impacts the convergence rates of the iterates. Our results shorten, streamline and strengthen existing convergence proofs for several asynchronous optimization … Read more

Optimized convergence of stochastic gradient descent by weighted averaging

Published: 2022/09/23, Updated: 2022/10/05

Melinda Hagedorn

Florian Jarre

Convex Optimization, Data Science Theory, Stochastic Approaches convex optimization, noise, optimal step lengths, optimal weights, stochastic gradient descent, weighted averaging

Under mild assumptions, stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as an approximate optimal solution. However, in the absence of stochastic noise, the arithmetic mean of all iterates converges considerably more slowly to the optimal solution than the iterates themselves. And also in the … Read more
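The noise-free effect described in this abstract can be illustrated with a small sketch. It is not taken from the paper: the objective f(x) = 0.5 * x**2, the starting point, the step length, and the function name gradient_descent_with_average are all assumptions chosen for illustration. With exact gradients the last iterate shrinks geometrically, while the plain arithmetic mean of all iterates shrinks only at a rate of roughly 1/k.

```python
def gradient_descent_with_average(x0=1.0, step=0.5, iterations=50):
    """Noise-free gradient descent on f(x) = 0.5 * x**2 (the gradient is x).

    Returns the last iterate and the arithmetic mean of all iterates,
    including the starting point. The minimizer is x = 0.
    """
    x = x0
    running_sum = x0
    for _ in range(iterations):
        x -= step * x          # exact gradient step, no stochastic noise
        running_sum += x
    mean = running_sum / (iterations + 1)
    return x, mean


last, mean = gradient_descent_with_average()
print(f"last iterate: {last:.3e}, mean of all iterates: {mean:.3e}")
```

In this toy setting the mean stays dominated by the early iterates, which is one way to see why a weighting scheme other than the uniform average may be attractive when noise is small.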

Bolstering Stochastic Gradient Descent with Model Building

Published: 2021/11/13, Updated: 2023/02/16

Unconstrained Optimization convergence analysis, model building, second-order information, stochastic gradient descent

The stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates for solving machine learning problems. These rates are obtained especially when these algorithms are fine-tuned for the application at hand. Although this tuning process can require large computational costs, recent work has shown that these costs can be … Read more

Inexact bilevel stochastic gradient methods for constrained and unconstrained lower-level problems

Published: 2021/10/01, Updated: 2022/12/07

Tommaso Giovannelli

Griffin D. Kent

Luis Nunes Vicente

Data-Mining, Nonlinear Optimization, Stochastic Programming bilevel optimization, DARTS, machine learning, stochastic gradient descent

Two-level stochastic optimization formulations have become instrumental in a number of machine learning contexts such as continual learning, neural architecture search, adversarial learning, and hyperparameter tuning. Practical stochastic bilevel optimization problems become challenging in optimization or learning scenarios where the number of variables is high or there are constraints. In this paper, we introduce a bilevel stochastic gradient method … Read more

The stochastic multi-gradient algorithm for multi-objective optimization and its application to supervised machine learning

Published: 2019/07/09

Suyun Liu

Luis Nunes Vicente

Convex Optimization, Multi-Criteria Optimization, Stochastic Programming multi-objective optimization, pareto front, stochastic gradient descent, supervised machine learning

Optimization of conflicting functions is of paramount importance in decision making, and real world applications frequently involve data that is uncertain or unknown, resulting in multi-objective optimization (MOO) problems of stochastic type. We study the stochastic multi-gradient (SMG) method, seen as an extension of the classical stochastic gradient method for single-objective optimization. At each iteration … Read more

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron

Published: 2019/02/25

Francis Bach

Mark Schmidt

Sharan Vaswani

Convex Optimization, Generalized Convexity/Monotonicity interpolation, nesterov acceleration, over-parametrization, stochastic gradient descent

Modern machine learning focuses on highly expressive models that are able to fit or interpolate the data completely, resulting in zero training loss. For such models, we show that the stochastic gradients of common loss functions satisfy a strong growth condition. Under this condition, we prove that constant step-size stochastic gradient descent (SGD) with Nesterov … Read more
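As a rough illustration of the interpolation setting mentioned in this abstract, the sketch below runs plain constant step-size SGD on an over-parameterized least-squares problem whose targets can be fitted exactly. It is not the paper's accelerated method and uses no Nesterov momentum. The problem sizes, the random data, the step-size choice, and all function names are assumptions made for this example.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def training_loss(w, features, targets):
    """Average of 0.5 * (a_i . w - b_i)**2 over all samples."""
    total = sum(0.5 * (dot(a, w) - b) ** 2 for a, b in zip(features, targets))
    return total / len(targets)


def constant_step_sgd(features, targets, step, iterations, rng):
    """Plain SGD with a fixed step size, one random sample per iteration."""
    w = [0.0] * len(features[0])
    for _ in range(iterations):
        i = rng.randrange(len(targets))
        residual = dot(features[i], w) - targets[i]
        # Gradient of 0.5 * (a_i . w - b_i)**2 with respect to w is residual * a_i.
        w = [wj - step * residual * aj for wj, aj in zip(w, features[i])]
    return w


rng = random.Random(0)
n_samples, n_features = 5, 10            # more parameters than samples
features = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for _ in range(n_samples)]
w_true = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
targets = [dot(a, w_true) for a in features]   # exactly fittable targets

# Assumed step size: inverse of the largest per-sample smoothness constant.
step = 1.0 / max(dot(a, a) for a in features)

initial_loss = training_loss([0.0] * n_features, features, targets)
w = constant_step_sgd(features, targets, step, 20000, rng)
final_loss = training_loss(w, features, targets)
print(f"initial loss: {initial_loss:.3e}, final loss: {final_loss:.3e}")
```

Because every sample can be fitted exactly, each stochastic gradient vanishes at the solution, so the fixed step size does not need to decay for the training loss to approach zero in this example.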

Machine learning approach to chance-constrained problems: An algorithm based on the stochastic gradient descent

Published: 2018/12/11, Updated: 2019/05/27

Lukáš Adam

Martin Branda

Stochastic Programming chance constraints, large-scale optimization, machine learning, quantile, stochastic gradient descent, stochastic programming

We consider chance-constrained problems with discrete random distribution. We aim for problems with a large number of scenarios. We propose a novel method based on the stochastic gradient descent method which performs updates of the decision variable based only on looking at a few scenarios. We modify it to handle the non-separable objective. A complexity … Read more

Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods

Published: 2018/10/19

Robert M. Freund

Rahul Mazumder

Paul Grigas

Convex Optimization, Data-Mining, Statistics condition numbers, logistic regression, steepest descent, stochastic gradient descent

Logistic regression is one of the most popular methods in binary classification, wherein estimation of model parameters is carried out by solving the maximum likelihood (ML) optimization problem, and the ML estimator is defined to be the optimal solution of this problem. It is well known that the ML estimator exists when the data is … Read more
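For reference, the maximum likelihood problem mentioned in this abstract is commonly written as the minimization of the average logistic loss. The notation below is an assumption and is not taken from the listing: n observations with features x_i in R^p, labels y_i in {-1, +1}, and coefficient vector beta.

```latex
\min_{\beta \in \mathbb{R}^p} \; L_n(\beta)
  := \frac{1}{n} \sum_{i=1}^{n} \ln\!\bigl(1 + \exp(-y_i \, \beta^{\top} x_i)\bigr)
```

Under this convention, any minimizer of L_n, when one exists, is the ML estimator the abstract refers to.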