A Study of First-Order Methods with a Probabilistic Relative-Error Gradient Oracle

This paper investigates the problem of minimizing a smooth function over a compact set with a probabilistic relative-error gradient oracle. The oracle succeeds with some probability, in which case it provides a relative-error approximation of the true gradient, or fails and returns an arbitrary vector, while the optimizer cannot distinguish between successful and failed queries … Read more
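
As a rough illustration of the oracle model described in this abstract, the following Python sketch mocks up a probabilistic relative-error gradient oracle; the success probability `p`, the relative-error level `delta`, and the quadratic test objective are illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

def relative_error_oracle(grad_fn, x, p=0.7, delta=0.3,
                          rng=np.random.default_rng(0)):
    """Mock probabilistic relative-error gradient oracle (illustrative).

    With probability p (a successful query) it returns g such that
    ||g - grad_fn(x)|| <= delta * ||grad_fn(x)||; otherwise it returns an
    arbitrary vector. The caller is never told which case occurred.
    """
    g_true = grad_fn(x)
    if rng.random() < p:
        # successful query: bounded relative-error perturbation of the true gradient
        noise = rng.standard_normal(x.shape)
        noise *= delta * np.linalg.norm(g_true) / max(np.linalg.norm(noise), 1e-12)
        return g_true + rng.random() * noise
    # failed query: an arbitrary vector of unrelated scale
    return 10.0 * rng.standard_normal(x.shape)

# toy usage on f(x) = 0.5*||x||^2, whose gradient is x
x = np.ones(5)
print(relative_error_oracle(lambda z: z, x))
```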

A Sound Local Regret Methodology for Online Nonconvex Composite Optimization

Online nonconvex optimization addresses dynamic and complex decision-making problems arising in real-world tasks where the optimizer’s objective evolves with the changing nature of the underlying system. This paper studies an online nonconvex composite optimization model with limited first-order access, encompassing a wide range of practical scenarios. We define local regret using a … Read more
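
The excerpt cuts off exactly where the paper’s local-regret measure is introduced, so nothing below should be read as that definition. As general background only, this sketch computes the classical sliding-window (time-smoothed) local regret; the window length `w` and the shifting quadratic losses are illustrative assumptions.

```python
import numpy as np

def sliding_window_local_regret(grad_fns, iterates, w):
    """Classical time-smoothed local regret (background only; NOT the measure
    defined in the paper, whose definition is truncated above):
        sum_t || (1/|W_t|) * sum_{s in W_t} grad f_s(x_t) ||^2,
    where W_t holds the indices of the last w losses up to round t.
    """
    total = 0.0
    for t, x_t in enumerate(iterates):
        window = grad_fns[max(0, t - w + 1): t + 1]
        avg_grad = sum(g(x_t) for g in window) / len(window)  # time-smoothed gradient
        total += float(np.linalg.norm(avg_grad) ** 2)
    return total

# toy usage: shifting quadratic losses f_t(x) = 0.5*||x - c_t||^2
rng = np.random.default_rng(0)
centers = rng.standard_normal((50, 3))
grad_fns = [(lambda x, c=c: x - c) for c in centers]
iterates = rng.standard_normal((50, 3))
print(sliding_window_local_regret(grad_fns, iterates, w=5))
```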

Alternate Training of Shared and Task-Specific Parameters for Multi-Task Neural Networks

This paper introduces novel alternate training procedures for hard-parameter sharing Multi-Task Neural Networks (MTNNs). Traditional MTNN training faces challenges in managing conflicting loss gradients, often yielding sub-optimal performance. The proposed alternate training method updates shared and task-specific weights alternately, exploiting the multi-head architecture of the model. This approach reduces computational costs, enhances training regularization, and … Read more
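
To make the alternating scheme concrete, here is a minimal numpy sketch of hard-parameter sharing with one shared linear layer and two linear task heads, where each epoch first updates the task-specific heads with the shared weights frozen and then updates the shared weights with the heads frozen. The toy data, linear model, and step size are assumptions for illustration and do not reproduce the paper’s architecture or experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy two-task regression data (hypothetical; not the paper's benchmarks)
X = rng.standard_normal((200, 8))
y1 = X @ rng.standard_normal(8)           # task-1 targets
y2 = X @ rng.standard_normal(8)           # task-2 targets

W  = 0.1 * rng.standard_normal((8, 4))    # shared (hard-shared) parameters
h1 = 0.1 * rng.standard_normal(4)         # task-1 head
h2 = 0.1 * rng.standard_normal(4)         # task-2 head
lr = 0.05

def forward():
    Z = X @ W                             # shared representation
    return Z, Z @ h1 - y1, Z @ h2 - y2    # residuals of the two heads

for epoch in range(300):
    # Phase A: update task-specific heads, shared weights frozen
    Z, r1, r2 = forward()
    h1 -= lr * (Z.T @ r1) / len(X)
    h2 -= lr * (Z.T @ r2) / len(X)

    # Phase B: update shared weights, task heads frozen (sum of task losses)
    Z, r1, r2 = forward()
    W -= lr * (X.T @ (np.outer(r1, h1) + np.outer(r2, h2))) / len(X)

Z, r1, r2 = forward()
print("task losses:", 0.5 * np.mean(r1**2), 0.5 * np.mean(r2**2))
```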

A Stochastic-Gradient-based Interior-Point Algorithm for Solving Smooth Bound-Constrained Optimization Problems

A stochastic-gradient-based interior-point algorithm for minimizing a continuously differentiable objective function (that may be nonconvex) subject to bound constraints is presented, analyzed, and demonstrated through experimental results. The algorithm differs from other interior-point methods for solving smooth (nonconvex) optimization problems in that the search directions are computed using stochastic gradient estimates. It is also unique … Read more
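
As a loose sketch of the idea (not the algorithm or analysis in the paper), the loop below combines a stochastic gradient estimate of the objective with the exact gradient of a log-barrier for the bounds and uses a fraction-to-the-boundary rule to keep iterates strictly interior; the barrier schedule, step size, and toy problem are assumptions.

```python
import numpy as np

def stochastic_barrier_ip(grad_est, l, u, x0, mu0=1.0, steps=500, lr=1e-2,
                          tau=0.995, rng=np.random.default_rng(0)):
    """Illustrative interior-point-style loop for bounds l <= x <= u (NOT the
    algorithm analyzed in the paper): the search direction combines a
    stochastic gradient estimate of the objective with the exact gradient of
    a log-barrier, and a fraction-to-the-boundary rule keeps the iterates
    strictly interior while the barrier parameter mu is driven to zero.
    """
    x, mu = x0.copy(), mu0
    for _ in range(steps):
        g = grad_est(x, rng)                         # stochastic objective gradient
        g_barrier = -mu / (x - l) + mu / (u - x)     # gradient of the log-barrier
        d = -(g + g_barrier)                         # search direction
        # largest step (capped at lr) keeping x strictly inside the bounds
        alpha = lr
        for di, xi, li, ui in zip(d, x, l, u):
            if di < 0:
                alpha = min(alpha, tau * (xi - li) / -di)
            elif di > 0:
                alpha = min(alpha, tau * (ui - xi) / di)
        x = x + alpha * d
        mu *= 0.995                                  # shrink the barrier parameter
    return x

# toy usage: f(x) = 0.5*||x - 2||^2 with noisy gradients, bounds [0, 1]^3
grad_est = lambda x, rng: (x - 2.0) + 0.1 * rng.standard_normal(x.shape)
x = stochastic_barrier_ip(grad_est, np.zeros(3), np.ones(3), 0.5 * np.ones(3))
print(x)   # expected to approach the upper bound x = 1
```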

A momentum-based linearized augmented Lagrangian method for nonconvex constrained stochastic optimization

Nonconvex constrained stochastic optimization has emerged in many important application areas. It minimizes the sum of an expectation function and a nonsmooth regularizer subject to general functional constraints. The main challenges arise from the stochasticity in the random integrand and the possibly nonconvex functional constraints. To address these issues, we propose a momentum-based linearized augmented … Read more
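
The following is only a schematic simplification of what a momentum-based linearized augmented Lagrangian iteration can look like, with a single linear equality constraint, an l1 regularizer, a moving-average (momentum) gradient estimator, and hand-picked step sizes; none of these choices are taken from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def momentum_linearized_al(grad_est, c, Jc, prox_r, x0, lam0, rho=1.0,
                           eta=0.05, sigma=0.1, beta=0.2, steps=3000,
                           rng=np.random.default_rng(0)):
    """Schematic momentum-based linearized augmented-Lagrangian loop
    (hypothetical simplification, not the paper's method): the stochastic
    gradient of the smooth term is smoothed by a moving-average (momentum)
    estimator, the composite step is a prox step on the linearized augmented
    Lagrangian, and the multipliers follow a gradient-ascent update.
    """
    x, lam = x0.copy(), lam0.copy()
    v = np.zeros_like(x)                          # momentum gradient estimate
    for _ in range(steps):
        g = grad_est(x, rng)                      # stochastic gradient of E[f]
        v = (1.0 - beta) * v + beta * g           # momentum averaging
        # gradient of lam^T c(x) + (rho/2) * ||c(x)||^2 with respect to x
        g_al = Jc(x).T @ (lam + rho * c(x))
        x = prox_r(x - eta * (v + g_al), eta)     # linearized proximal step
        lam = lam + sigma * c(x)                  # multiplier (ascent) update
    return x, lam

# toy usage: f(x) = 0.5*||x - 1||^2 (noisy), r(x) = 0.1*||x||_1, sum(x) = 1
n = 5
grad_est = lambda x, rng: (x - 1.0) + 0.1 * rng.standard_normal(n)
c  = lambda x: np.array([x.sum() - 1.0])          # single linear equality constraint
Jc = lambda x: np.ones((1, n))                    # its Jacobian
prox_r = lambda v, t: soft_threshold(v, 0.1 * t)  # prox of t * 0.1*||.||_1
x, lam = momentum_linearized_al(grad_est, c, Jc, prox_r, np.zeros(n), np.zeros(1))
print(x, "constraint residual:", c(x))
```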

New Penalized Stochastic Gradient Methods for Linearly Constrained Strongly Convex Optimization

For minimizing a strongly convex objective function subject to linear inequality constraints, we consider a penalty approach that allows one to utilize stochastic methods for problems with a large number of constraints and/or objective function terms. We provide upper bounds on the distance between the solutions to the original constrained problem and the penalty reformulations, … Read more
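
As a hedged illustration of the penalty idea (the paper’s exact reformulation and distance bounds are not reproduced here), the sketch below replaces the linear inequality constraints by a smooth quadratic penalty and runs SGD that samples one objective term and one constraint row per iteration.

```python
import numpy as np

def penalized_sgd(grad_f_i, n_terms, A, b, x0, pen=50.0, steps=20000,
                  rng=np.random.default_rng(0)):
    """Penalty-plus-SGD sketch (assumed smooth quadratic penalty; the paper's
    exact reformulation and bounds are not reproduced): replace Ax <= b by
    pen * sum_j max(0, a_j^T x - b_j)^2 and run SGD that samples one
    objective term and one constraint row per iteration.
    """
    x = x0.copy()
    m = A.shape[0]
    for k in range(1, steps + 1):
        lr = 1.0 / (k + 1000)                      # decaying step (strong convexity)
        i = rng.integers(n_terms)                  # random objective term
        j = rng.integers(m)                        # random constraint row
        viol = max(0.0, A[j] @ x - b[j])
        # unbiased estimate: penalty part rescaled by m because one row is sampled
        g = grad_f_i(x, i) + pen * m * 2.0 * viol * A[j]
        x = x - lr * g
    return x

# toy usage: f(x) = (1/n) * sum_i 0.5*||x - c_i||^2, constraint x <= 0 componentwise
rng = np.random.default_rng(1)
C = rng.standard_normal((10, 3)) + 1.0             # term centers (pull x toward ~1)
grad_f_i = lambda x, i: x - C[i]
x = penalized_sgd(grad_f_i, len(C), np.eye(3), np.zeros(3), np.zeros(3))
print(x)   # ends up close to the boundary x = 0; the penalty permits a small violation
```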

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used to formulate these often ill-conditioned optimization tasks, there is a need for new efficient algorithms able to … Read more

Inexact proximal stochastic second-order methods for nonconvex composite optimization

In this paper, we propose a framework of Inexact Proximal Stochastic Second-order (IPSS) methods for solving nonconvex optimization problems, whose objective function consists of an average of finitely many, possibly weakly smooth, functions and a convex but possibly nonsmooth function. At each iteration, IPSS inexactly solves a proximal subproblem constructed by using some positive … Read more
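
A rough sketch of an iteration in this spirit is given below: a quadratic model built from a stochastic gradient and a diagonal stochastic Hessian estimate, plus an l1 term, is solved inexactly by a few proximal-gradient steps; the subproblem form, inexactness rule, and parameters are assumptions, not the ones specified by IPSS.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ipss_sketch(grad_est, hess_diag_est, lam_l1, x0, steps=200, inner_iters=5,
                rng=np.random.default_rng(0)):
    """Rough proximal stochastic second-order sketch (the actual IPSS
    subproblem, inexactness criterion, and parameters are assumptions here):
    at x_k, build the model
        m(d) = g_k^T d + 0.5 * d^T B_k d + lam * ||x_k + d||_1
    with a stochastic gradient g_k and a diagonal stochastic Hessian estimate
    B_k, solve it inexactly with a few proximal-gradient steps, and move to
    x_{k+1} = x_k + d.
    """
    x = x0.copy()
    for _ in range(steps):
        g = grad_est(x, rng)                 # stochastic gradient
        B = hess_diag_est(x, rng)            # positive diagonal Hessian estimate
        step = 1.0 / B.max()                 # prox-gradient step size for the model
        d = np.zeros_like(x)
        for _ in range(inner_iters):         # inexact subproblem solve
            grad_model = g + B * d
            d = soft_threshold(x + d - step * grad_model, step * lam_l1) - x
        x = x + d
    return x

# toy usage: f(x) = 0.5*x^T diag(h) x - c^T x with noisy oracles, r = 0.05*||x||_1
h, c = np.array([1.0, 4.0, 9.0]), np.array([1.0, 1.0, 1.0])
grad_est = lambda x, rng: h * x - c + 0.05 * rng.standard_normal(3)
hess_diag_est = lambda x, rng: h * (1.0 + 0.05 * rng.standard_normal(3)).clip(0.5, 1.5)
print(ipss_sketch(grad_est, hess_diag_est, lam_l1=0.05, x0=np.zeros(3)))
```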

A linearly convergent stochastic recursive gradient method for convex optimization

The stochastic recursive gradient algorithm (SARAH) [8] has attracted much interest recently. It admits a simple recursive framework for updating stochastic gradient estimates. Motivated by this, in this paper we propose a SARAH-I method incorporating importance sampling, for which a linear convergence rate of the sequence of distances between the iterates and the optimal set is proven under … Read more
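
The SARAH recursion itself can be stated in a few lines; the sketch below implements it with an importance-sampling distribution proportional to per-term smoothness constants, where the step size, loop lengths, and sampling rule are illustrative choices rather than the parameters analyzed for SARAH-I.

```python
import numpy as np

def sarah_importance(grads, L, x0, eta=0.1, outer=20, inner=50,
                     rng=np.random.default_rng(0)):
    """SARAH recursion with importance sampling (step size, loop lengths, and
    sampling rule are illustrative choices, not the SARAH-I parameters).
    grads[i](x) returns grad f_i(x); L[i] is a smoothness constant used to
    build the sampling distribution p_i proportional to L_i.
    """
    n = len(grads)
    p = np.asarray(L, dtype=float)
    p = p / p.sum()                                # sample proportionally to L_i
    x = x0.copy()
    for _ in range(outer):
        x_prev = x
        v = sum(g(x_prev) for g in grads) / n      # full-gradient restart
        x = x_prev - eta * v
        for _ in range(inner):
            i = rng.choice(n, p=p)
            # recursive estimate; 1/(n*p_i) reweights the sampled difference
            v = (grads[i](x) - grads[i](x_prev)) / (n * p[i]) + v
            x_prev, x = x, x - eta * v
    return x

# toy usage: f_i(x) = 0.5 * h_i * ||x - c_i||^2, so grad f_i(x) = h_i * (x - c_i)
rng = np.random.default_rng(1)
h = rng.uniform(0.5, 5.0, size=20)                 # per-term curvatures (L_i = h_i)
C = rng.standard_normal((20, 3))
grads = [(lambda x, hi=hi, ci=ci: hi * (x - ci)) for hi, ci in zip(h, C)]
x = sarah_importance(grads, L=h, x0=np.zeros(3))
print(x, "vs weighted mean", (h[:, None] * C).sum(axis=0) / h.sum())
```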

Generalized Stochastic Frank-Wolfe Algorithm with Stochastic “Substitute” Gradient for Structured Convex Optimization

The stochastic Frank-Wolfe method has recently attracted considerable interest in the context of optimization for statistical and machine learning due to its ability to work with more general feasible regions. However, there has been a complexity gap in the guaranteed convergence rate for stochastic Frank-Wolfe compared to its deterministic counterpart. In this work, … Read more
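
For reference, a plain stochastic Frank-Wolfe loop over the probability simplex looks as follows; the averaged gradient estimate used here is a generic variance-reduction device for illustration and is not the paper’s “substitute gradient” construction.

```python
import numpy as np

def stochastic_frank_wolfe(grad_est, lmo, x0, steps=2000,
                           rng=np.random.default_rng(0)):
    """Plain stochastic Frank-Wolfe sketch over a structured feasible region
    (here the probability simplex). The averaged gradient estimate d_bar is a
    generic variance-reduction device for illustration only; it is NOT the
    paper's "substitute gradient" construction.
    """
    x = x0.copy()
    d_bar = np.zeros_like(x)                   # running average of gradient estimates
    for k in range(1, steps + 1):
        rho = 2.0 / (k + 1)
        d_bar = (1.0 - rho) * d_bar + rho * grad_est(x, rng)
        s = lmo(d_bar)                         # linear minimization oracle (a vertex)
        gamma = 2.0 / (k + 2)
        x = (1.0 - gamma) * x + gamma * s      # convex combination keeps x feasible
    return x

# toy usage: minimize 0.5*||x - c||^2 over the simplex, with noisy gradients
c = np.array([0.7, 0.2, 0.1, -0.3])
grad_est = lambda x, rng: (x - c) + 0.1 * rng.standard_normal(x.shape)
lmo = lambda g: np.eye(len(g))[np.argmin(g)]   # best simplex vertex for a linear objective
print(stochastic_frank_wolfe(grad_est, lmo, x0=np.ones(4) / 4))
```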