Tensor Methods for Minimizing Convex Functions with Hölder Continuous Higher-Order Derivatives

In this paper we study p-order methods for unconstrained minimization of convex functions that are p-times differentiable with $\nu$-Hölder continuous pth derivatives. We propose tensor schemes with and without acceleration. For the schemes without acceleration, we establish iteration complexity bounds of $\mathcal{O}\left(\epsilon^{-1/(p+\nu-1)}\right)$ for reducing the functional residual below a given $\epsilon\in (0,1)$. Assuming that $\nu$

Non-Stationary First-Order Primal-Dual Algorithms with Fast Convergence Rates

In this paper, we propose two novel non-stationary first-order primal-dual algorithms to solve nonsmooth composite convex optimization problems. Unlike existing primal-dual schemes where the parameters are often fixed, our methods use pre-defined and dynamic sequences for parameters. We prove that our first algorithm can achieve an O(1/k) convergence rate on the primal-dual gap, and primal and

An analysis of noise folding for low-rank matrix recovery

Previous work on low-rank matrix recovery has concentrated on scenarios in which the matrix is noise-free and the measurements are corrupted by noise. However, in practical applications, the matrix itself is usually perturbed by random noise prior to measurement. This paper concisely investigates this scenario and shows that, for most measurement schemes utilized in

General risk measures for robust machine learning

A wide array of machine learning problems are formulated as the minimization of the expectation of a convex loss function on some parameter space. Since the probability distribution of the data of interest is usually unknown, it is often estimated from training sets, which may lead to poor out-of-sample performance. In this work, we

A linearly convergent stochastic recursive gradient method for convex optimization

The stochastic recursive gradient algorithm (SARAH) [8] has attracted much interest recently. It admits a simple recursive framework for updating stochastic gradient estimates. Motivated by this, in this paper, we propose a SARAH-I method incorporating importance sampling, for which the linear convergence rate of the sequence of distances between the iterates and the optimal set is proven under

Relative-error inertial-relaxed inexact versions of Douglas-Rachford and ADMM splitting algorithms

This paper derives new inexact variants of the Douglas-Rachford splitting method for maximal monotone operators and the alternating direction method of multipliers (ADMM) for convex optimization. The analysis is based on a new inexact version of the proximal point algorithm that includes both an inertial step and overrelaxation. We apply our new inexact ADMM method

Low-rank matrix recovery with composite optimization: good conditioning and rapid convergence

The task of recovering a low-rank matrix from its noisy linear measurements plays a central role in computational science. Smooth formulations of the problem often exhibit an undesirable phenomenon: the condition number, classically defined, scales poorly with the dimension of the ambient space. In contrast, we here show that in a variety of concrete circumstances,

Trust-region methods for the derivative-free optimization of nonsmooth black-box functions

In this paper we study the minimization of a nonsmooth black-box type function, without assuming any access to derivatives or generalized derivatives and without any knowledge about the analytical origin of the function nonsmoothness. Directional methods have been derived for such problems but to our knowledge no model-based method like a trust-region one has yet

A Class of Stochastic Variance Reduced Methods with an Adaptive Stepsize

Stochastic variance reduced methods have recently surged into prominence for solving large-scale optimization problems in the context of machine learning. Tan, Ma and Dai et al. first proposed the new stochastic variance reduced gradient (SVRG) method with the Barzilai-Borwein (BB) method to compute step sizes automatically, which performs well in practice. On this basis,

Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization

Backtracking line-search is an old yet powerful strategy for finding a better step size to be used in proximal gradient algorithms. The main principle is to locally find a simple convex upper bound of the objective function, which in turn controls the step size that is used. In the case of inertial proximal gradient algorithms, the situation