A Robust Multi-Batch L-BFGS Method for Machine Learning

This paper describes an implementation of the L-BFGS method designed to deal with two adversarial situations. The first occurs in distributed computing environments where some of the computational nodes devoted to the evaluation of the function and gradient are unable to return results on time. A similar challenge occurs in a multi-batch approach in which …

Robust Stochastic Optimization Made Easy with RSOME

We present a new distributionally robust optimization model called robust stochastic optimization (RSO), which unifies scenario-tree-based stochastic linear optimization and distributionally robust optimization in a practicable framework that can be solved using state-of-the-art commercial optimization solvers. We also develop a new algebraic modeling package, RSOME, to facilitate the implementation of RSO models. …

An Investigation of Newton-Sketch and Subsampled Newton Methods

Sketching, a dimensionality reduction technique, has received much attention in the statistics community. In this paper, we study sketching in the context of Newton’s method for solving finite-sum optimization problems in which the number of variables and the number of data points are both large. We study two forms of sketching that perform dimensionality reduction in data space: …

Random Sampling and Machine Learning to Understand Good Decompositions

Motivated by its implications for the development of general-purpose solvers for decomposable Mixed Integer Programs (MIP), we address a fundamental research question: whether good decomposition patterns can be consistently found by looking only at static properties of MIP input instances. We adopt a data-driven approach, devising a …

RSG: Beating Subgradient Method without Smoothness and Strong Convexity

In this paper, we study the efficiency of a Restarted SubGradient (RSG) method that periodically restarts the standard subgradient method (SG). We show that, when applied to a broad class of convex optimization problems, the RSG method can find an $\epsilon$-optimal solution with lower complexity than the SG method. In particular, we first …
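The restart idea in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the abstract only states that SG is restarted periodically, so halving the step size at each restart, averaging the iterates within a stage, and the names `subgradient_stage`, `rsg`, `step0`, `iters_per_stage`, and `stages` are all assumptions made for the example.

```python
import numpy as np

def subgradient_stage(subgrad, w0, step, iters):
    """Run `iters` fixed-step subgradient steps from w0 and return the averaged iterate."""
    w = w0.copy()
    total = np.zeros_like(w0)
    for _ in range(iters):
        total += w                      # accumulate the current iterate
        w = w - step * subgrad(w)       # standard subgradient step
    return total / iters

def rsg(subgrad, w0, step0, iters_per_stage, stages):
    """Restarted subgradient sketch: each stage warm-starts from the previous
    stage's averaged iterate. Halving the step per stage is an assumption."""
    w = np.asarray(w0, dtype=float)
    step = step0
    for _ in range(stages):
        w = subgradient_stage(subgrad, w, step, iters_per_stage)
        step /= 2.0
    return w

# Toy nonsmooth convex problem (hypothetical): f(w) = ||w - c||_1
c = np.array([1.0, -2.0, 3.0])
f = lambda w: np.abs(w - c).sum()
g = lambda w: np.sign(w - c)            # a valid subgradient of f

w_start = np.zeros(3)
w_final = rsg(g, w_start, step0=1.0, iters_per_stage=50, stages=8)
```

On this toy problem each stage ends within roughly one step length of the minimizer, so shrinking the step at every restart tightens the final accuracy without lengthening the stages.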

Exact and Inexact Subsampled Newton Methods for Optimization

The paper studies the solution of stochastic optimization problems in which approximations to the gradient and Hessian are obtained through subsampling. We first consider Newton-like methods that employ these approximations and discuss how to coordinate the accuracy in the gradient and Hessian to yield a superlinear rate of convergence in expectation. The second part of …

Max-Norm Optimization for Robust Matrix Recovery

This paper studies the matrix completion problem under arbitrary sampling schemes. We propose a new estimator incorporating both max-norm and nuclear-norm regularization, based on which we can conduct efficient low-rank matrix recovery using a random subset of entries observed with additive noise under general non-uniform and unknown sampling distributions. This method significantly relaxes the uniform …

A SMART Stochastic Algorithm for Nonconvex Optimization with Applications to Robust Machine Learning

Machine learning theory typically assumes that training data is unbiased and not adversarially generated. When real training data deviates from these assumptions, trained models make erroneous predictions, sometimes with disastrous effects. Robust losses, such as the Huber norm, are designed to mitigate the effects of such contaminated data, but they are limited to the regression …

Second-order optimality and beyond: characterization and evaluation complexity in convexly-constrained nonlinear optimization

High-order optimality conditions for convexly-constrained nonlinear optimization problems are analyzed. A corresponding (expensive) measure of criticality for arbitrary order is proposed and extended to define high-order $\epsilon$-approximate critical points. This new measure is then used within a conceptual trust-region algorithm to show that, if derivatives of the objective function up to order $q \geq 1$ …

Convex Variational Formulations for Learning Problems

In this article, we introduce new techniques to solve the nonlinear regression problem and the nonlinear classification problem. Our benchmarks suggest that our method for regression is significantly more effective than classical methods and that our method for classification is competitive. Our list of classical methods includes least squares, random forests, decision trees, boosted …