Accelerated Gradient Descent via Long Steps

Recently, Grimmer [1] showed that, for smooth convex optimization, periodically taking longer steps allows gradient descent’s state-of-the-art O(1/T) convergence guarantee to be improved by constant factors, and conjectured that an accelerated rate strictly faster than O(1/T) might be possible. Here we prove such a big-O gain, establishing gradient descent’s first accelerated convergence rate in this setting. Namely, we …
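
To make the long-step idea concrete, the sketch below interleaves an occasional longer step into plain gradient descent on an L-smooth convex function. It is illustrative only: the certified stepsize patterns of [1] are more intricate, and the schedule and names (`long_step_gd`, `long_every`, `long_factor`) are assumptions made for this example.

```python
import numpy as np

def long_step_gd(grad, x0, L, T, long_every=8, long_factor=4.0):
    """Gradient descent on an L-smooth convex function that occasionally
    takes a longer step.  Purely illustrative: the certified stepsize
    patterns of [1] are more intricate than this fixed schedule."""
    x = np.asarray(x0, dtype=float)
    for t in range(T):
        step = 1.0 / L
        if (t + 1) % long_every == 0:
            step *= long_factor            # periodic long step
        x = x - step * grad(x)
    return x

# Toy usage on the convex quadratic f(x) = 0.5 * ||A x - b||^2
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.norm(A.T @ A, 2)             # Lipschitz constant of the gradient
x_final = long_step_gd(grad, np.zeros(2), L, T=200)
```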

Using orthogonally structured positive bases for constructing positive k-spanning sets with cosine measure guarantees

Positive spanning sets span a given vector space by nonnegative linear combinations of their elements. These have attracted significant attention in recent years, owing to their extensive use in derivative-free optimization. In this setting, the quality of a positive spanning set is assessed through its cosine measure, a geometric quantity that expresses how well …
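
As a concrete illustration of the cosine measure \(\mathrm{cm}(D)=\min_{\|u\|=1}\max_{d\in D} u^\top d/\|d\|\), here is a small Monte Carlo estimator; since only a finite sample of unit vectors is checked, it returns an upper bound on the true minimum. The function name and the example positive basis are chosen purely for illustration.

```python
import numpy as np

def cosine_measure_mc(D, n_samples=200_000, seed=0):
    """Monte Carlo estimate of cm(D) = min_{||u||=1} max_{d in D} u^T d / ||d||.
    D is an (n, k) array whose columns are the (nonzero) directions.
    Sampling only a finite set of unit vectors yields an UPPER bound on
    the true minimum over the whole unit sphere."""
    rng = np.random.default_rng(seed)
    Dn = D / np.linalg.norm(D, axis=0)       # normalize the columns
    U = rng.standard_normal((D.shape[0], n_samples))
    U /= np.linalg.norm(U, axis=0)           # random unit vectors
    return (U.T @ Dn).max(axis=1).min()

# Example: the minimal positive basis {e1, e2, -e1 - e2} in R^2
D = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])
print(cosine_measure_mc(D))                  # prints an estimate of cm(D)
```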

Limited memory gradient methods for unconstrained optimization

The limited memory steepest descent method (Fletcher, 2012) for unconstrained optimization problems stores a few past gradients to compute multiple stepsizes at once. We review this method and propose new variants. For strictly convex quadratic objective functions, we study the numerical behavior of different techniques to compute new stepsizes. In particular, we introduce a method …
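
A minimal sketch of the sweep structure behind limited memory steepest descent on a strictly convex quadratic: the last m gradients are stored, Ritz values of the Hessian on their span are computed, and their reciprocals serve as the next m stepsizes. Fletcher's method recovers the Ritz values without forming the Hessian explicitly; here the matrix A is used directly for brevity, and all names are illustrative.

```python
import numpy as np

def lmsd_quadratic(A, b, x0, m=3, sweeps=20):
    """Ritz-sweep sketch for f(x) = 0.5 x^T A x - b^T x with A symmetric
    positive definite: each sweep reuses the last m gradients to build
    the next m stepsizes from Ritz values of A on their span."""
    x = np.asarray(x0, dtype=float)
    steps = [1.0 / np.linalg.norm(A, 2)] * m        # conservative first sweep
    for _ in range(sweeps):
        G = []
        for alpha in steps:
            g = A @ x - b
            if np.linalg.norm(g) < 1e-12:           # already at the minimizer
                return x
            G.append(g)
            x = x - alpha * g
        Q, _ = np.linalg.qr(np.column_stack(G))     # orthonormal basis of gradient span
        ritz = np.linalg.eigvalsh(Q.T @ A @ Q)      # Ritz values of A (all positive)
        steps = list(1.0 / ritz)                    # reciprocals as next stepsizes
    return x
```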

Expected decrease for derivative-free algorithms using random subspaces

Derivative-free algorithms seek the minimum of a given function based only on function values queried at appropriate points. Although these methods are widely used in practice, their performance is known to worsen as the problem dimension increases. Recent advances in developing randomized derivative-free techniques have tackled this issue by working in low-dimensional subspaces that are …
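
The sketch below shows the random-subspace flavour of such methods in its simplest direct-search form: at every iteration only directions drawn from a random low-dimensional subspace are polled. All names and the acceptance/stepsize rules are illustrative assumptions, not the algorithms studied in the paper.

```python
import numpy as np

def random_subspace_ds(f, x0, p=2, alpha=1.0, max_iter=500, seed=0):
    """Toy direct-search scheme that polls only inside a random
    p-dimensional subspace at each iteration (p << n)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    n = x.size
    for _ in range(max_iter):
        P = rng.standard_normal((n, p)) / np.sqrt(p)   # random subspace basis
        improved = False
        for d in np.hstack([P, -P]).T:                 # poll +/- subspace directions
            trial = x + alpha * d
            ft = f(trial)
            if ft < fx:                                 # simple decrease accepted
                x, fx, improved = trial, ft, True
                break
        alpha = 2.0 * alpha if improved else 0.5 * alpha
    return x, fx

# Usage on a 100-dimensional quadratic
x_best, f_best = random_subspace_ds(lambda x: np.sum((x - 1.0) ** 2), np.zeros(100))
```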

On Common-Random-Numbers and the Complexity of Adaptive Sampling Trust-Region Methods

In the context of simulation optimization (SO), Common Random Numbers (CRN) is the practice of querying the simulation-based oracle with the same random number stream at each point visited by an SO algorithm. This practice is widely believed to facilitate SO algorithm efficiency by preserving structure inherent to the objective function and gradient sample-paths. …
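
The following toy experiment illustrates the CRN idea: the difference f(x) − f(y) is estimated either with the same random stream at both points or with independent streams, and the spread of the CRN estimate is markedly smaller. The oracle and the names are invented for this example.

```python
import numpy as np

def noisy_oracle(x, rng):
    """Toy stochastic oracle: a quadratic perturbed by simulation noise."""
    xi = rng.standard_normal(x.shape)
    return np.sum((x - xi) ** 2)

def estimate_difference(x, y, n_rep=1000, crn=True, seed=0):
    """Estimate f(x) - f(y); under CRN both points see the same stream."""
    diffs = []
    for r in range(n_rep):
        rng_x = np.random.default_rng(seed + r)
        rng_y = np.random.default_rng(seed + r) if crn else np.random.default_rng(10**6 + r)
        diffs.append(noisy_oracle(x, rng_x) - noisy_oracle(y, rng_y))
    return np.mean(diffs), np.std(diffs)

x, y = np.array([1.0, 2.0]), np.array([1.1, 2.0])
print(estimate_difference(x, y, crn=True))    # much smaller spread...
print(estimate_difference(x, y, crn=False))   # ...than with independent streams
```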

Regularized methods via cubic subspace minimization for nonconvex optimization

The main computational cost per iteration of adaptive cubic regularization methods for solving large-scale nonconvex problems is the computation of the step \(s_k\), which requires an approximate minimizer of the cubic model. We propose a new approach in which this minimizer is sought in a low dimensional subspace that, in contrast to classical approaches, …
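
A minimal sketch of the subspace idea: the cubic model \(m(s) = g^\top s + \tfrac{1}{2} s^\top H s + \tfrac{\sigma}{3}\|s\|^3\) is minimized only over \(s = Vz\) for a matrix \(V\) with a few orthonormal columns, so the inner problem is low dimensional. The subspace choice below (gradient plus a random direction) and the inner solver are assumptions made for illustration, not the construction proposed in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def cubic_step_in_subspace(g, H, sigma, V):
    """Approximately minimize m(s) = g^T s + 0.5 s^T H s + (sigma/3)||s||^3
    over s = V z, where V has orthonormal columns, by solving the reduced
    low-dimensional problem in z."""
    Vg, VHV = V.T @ g, V.T @ H @ V
    def m_red(z):
        nz = np.linalg.norm(z)              # ||V z|| = ||z|| for orthonormal V
        return Vg @ z + 0.5 * z @ VHV @ z + sigma / 3.0 * nz ** 3
    res = minimize(m_red, np.zeros(V.shape[1]), method="BFGS")
    return V @ res.x

# Example: subspace spanned by the gradient and one random direction
rng = np.random.default_rng(0)
n = 50
H = rng.standard_normal((n, n)); H = 0.5 * (H + H.T)   # possibly indefinite Hessian
g = rng.standard_normal(n)
V, _ = np.linalg.qr(np.column_stack([g, rng.standard_normal(n)]))
s = cubic_step_in_subspace(g, H, sigma=1.0, V=V)
```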

Non-asymptotic superlinear convergence of Nesterov accelerated BFGS

In this paper, we derive the explicit finite-time local convergence of the Nesterov accelerated Broyden-Fletcher-Goldfarb-Shanno (NA-BFGS) method under the assumptions that the objective function is strongly convex, its gradient is Lipschitz continuous, and its Hessian is Lipschitz continuous at the optimal point. We show that the rate of convergence of the NA-BFGS method is …
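
As a rough illustration of the kind of scheme being analyzed, the sketch below combines a Nesterov-style extrapolation step with the standard BFGS inverse-Hessian update. It is a generic combination under assumed names and a fixed steplength (a line search would normally be used); the precise NA-BFGS recursion studied in the paper may differ in its details.

```python
import numpy as np

def na_bfgs(f_grad, x0, n_iter=100, step=1.0, beta=0.9):
    """Sketch of a Nesterov-accelerated BFGS iteration: an extrapolated
    point feeds the quasi-Newton step, and the inverse Hessian estimate is
    updated with the standard BFGS formula.  A fixed steplength is used
    here only to keep the example short."""
    x_prev = x = np.asarray(x0, dtype=float)
    Hinv = np.eye(x.size)
    for _ in range(n_iter):
        z = x + beta * (x - x_prev)               # Nesterov extrapolation
        g = f_grad(z)
        x_new = z - step * Hinv @ g
        s, y = x_new - z, f_grad(x_new) - g       # curvature pair at the extrapolated point
        sy = s @ y
        if sy > 1e-12:                            # keep Hinv positive definite
            rho = 1.0 / sy
            I = np.eye(x.size)
            Hinv = (I - rho * np.outer(s, y)) @ Hinv @ (I - rho * np.outer(y, s)) \
                   + rho * np.outer(s, s)
        x_prev, x = x, x_new
    return x
```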

An Explicit Three-Term Polak-Ribière-Polyak Conjugate Gradient Method for Bicriteria Optimization

We propose in this paper a Polak-Ribière-Polyak conjugate gradient type method for solving bicriteria optimization problems without resorting to scalarization techniques. Two particular advantages of this contribution are worth noting. First, the proposed descent direction, common to both criteria, can be computed directly by an explicit formula without solving any intermediate subproblem. Second, the descent …
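
For reference, the classical single-objective three-term PRP direction (of Zhang-Zhou-Li type) is given by an explicit formula and guarantees sufficient descent regardless of the line search; the sketch below shows that template. The common descent direction for both criteria proposed in the paper is a different, bicriteria formula and is not reproduced here.

```python
import numpy as np

def three_term_prp_direction(g_new, g_old, d_old):
    """Classical single-objective three-term Polak-Ribiere-Polyak direction:
        d+ = -g+ + beta * d - theta * y,   y = g+ - g,
        beta = g+^T y / ||g||^2,  theta = g+^T d / ||g||^2,
    which satisfies g+^T d+ = -||g+||^2 independently of the line search."""
    y = g_new - g_old
    gg = g_old @ g_old
    beta = (g_new @ y) / gg
    theta = (g_new @ d_old) / gg
    return -g_new + beta * d_old - theta * y
```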

Yet another fast variant of Newton’s method for nonconvex optimization

A second-order algorithm that alternates between regularized Newton and negative-curvature steps is proposed for minimizing smooth nonconvex functions. In most cases, the Hessian matrix is regularized with the square root of the current gradient norm and an additional term taking moderate negative curvature into account, a negative curvature step being taken only exceptionally. As …
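
A bare-bones sketch of one such step, under assumed thresholds and names: the Hessian shift combines the square root of the gradient norm with the moderate negative curvature present, and a pure negative-curvature direction is used only when the leftmost eigenvalue is strongly negative. The paper's actual rule involves additional constants and safeguards not shown here.

```python
import numpy as np

def regularized_newton_step(g, H, curvature_tol=1e-2):
    """One step of a regularized-Newton / negative-curvature scheme (sketch)."""
    eigval, eigvec = np.linalg.eigh(H)
    lam_min, v_min = eigval[0], eigvec[:, 0]
    if lam_min < -curvature_tol:                         # exceptional negative-curvature step
        sign = 1.0 if v_min @ g <= 0 else -1.0           # ensure a descent orientation
        return sign * abs(lam_min) * v_min
    shift = np.sqrt(np.linalg.norm(g)) + max(0.0, -lam_min)
    return np.linalg.solve(H + shift * np.eye(len(g)), -g)
```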

Inexact reduced gradient methods in nonconvex optimization

This paper proposes and develops new linesearch methods with inexact gradient information for finding stationary points of nonconvex continuously differentiable functions on finite-dimensional spaces. Some abstract convergence results for a broad class of linesearch methods are established. A general scheme for inexact reduced gradient (IRG) methods is proposed, where the errors in the gradient approximation …
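
A generic sketch of a linesearch driven by inexact gradients, assuming the oracle error is bounded by a known \(\varepsilon\): backtracking is performed along the negative inexact gradient, and the iteration stops once that gradient is of the order of the error bound, since beyond that point the direction carries no reliable information. Names and constants are illustrative; this is not the specific IRG schemes developed in the paper.

```python
import numpy as np

def irg_backtracking(f, approx_grad, x0, eps=1e-3, c=1e-4, max_iter=200):
    """Backtracking linesearch along the negative *inexact* gradient,
    assuming ||approx_grad(x) - grad f(x)|| <= eps at every x."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = approx_grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm <= 2.0 * eps:                   # direction no longer trustworthy
            break
        t = 1.0
        while f(x - t * g) > f(x) - c * t * gnorm ** 2:
            t *= 0.5                              # Armijo backtracking on the inexact model
            if t < 1e-12:
                break
        x = x - t * g
    return x
```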