Bolstering Stochastic Gradient Descent with Model Building

Stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates for solving machine learning problems. These rates are obtained especially when these algorithms are fine-tuned for the application at hand. Although this tuning process can require large computational costs, recent work has shown that these costs can be … Read more

Constrained Optimization in the Presence of Noise

The problem of interest is the minimization of a nonlinear function subject to nonlinear equality constraints using a sequential quadratic programming (SQP) method. The minimization must be performed while observing only noisy evaluations of the objective and constraint functions. In order to obtain stability, the classical SQP method is modified by relaxing the standard Armijo … Read more

On the Numerical Performance of Derivative-Free Optimization Methods Based on Finite-Difference Approximations

The goal of this paper is to investigate an approach for derivative-free optimization that has not received sufficient attention in the literature and is yet one of the simplest to implement and parallelize. It consists of computing gradients of a smoothed approximation of the objective function (and constraints), and employing them within established codes. These … Read more

An Inexact Successive Quadratic Approximation Method for Convex L-1 Regularized Optimization

We study a Newton-like method for the minimization of an objective function $\phi$ that is the sum of a smooth convex function and an $\ell_1$ regularization term. This method, which is sometimes referred to in the literature as a proximal Newton method, computes a step by minimizing a piecewise quadratic model $q_k$ of the objective … Read more

A Family of Second-Order Methods for Convex L1-Regularized Optimization

This paper is concerned with the minimization of an objective that is the sum of a convex function $f$ and an $\ell_1$ regularization term. Our interest is in methods that incorporate second-order information about the function $f$ to accelerate convergence. We describe a semi-smooth Newton framework that can be used to generate a variety of … Read more

Newton-Like Methods for Sparse Inverse Covariance Estimation

We propose two classes of second-order optimization methods for solving the sparse inverse covariance estimation problem. The first approach, which we call the Newton-LASSO method, minimizes a piecewise quadratic model of the objective function at every iteration to generate a step. We employ the fast iterative shrinkage thresholding method (FISTA) to solve this subproblem. The … Read more