An approximation algorithm for optimal piecewise linear approximations of bounded variable products

We investigate the optimal piecewise linear interpolation of the bivariate product xy over rectangular domains. More precisely, our aim is to minimize the number of simplices in the triangulation underlying the interpolation, while respecting a prescribed approximation error. First, we show how to construct optimal triangulations consisting of up to five simplices. Using these as … Read more
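
Below is a minimal Python sketch of the setting: a piecewise linear interpolant of $xy$ on a uniform grid whose cells are split along a diagonal, together with a sampled estimate of the interpolation error. The grid, the diagonal split, and the sampling are illustrative choices only, not the optimal triangulations constructed in the paper.

```python
import numpy as np

def interp_xy_on_grid(x, y, n=4):
    """Evaluate the piecewise linear interpolant of f(x,y) = x*y on a uniform
    n-by-n grid over [0,1]^2, with each cell split into two triangles along
    its main diagonal.  Illustrative only, not an optimal triangulation."""
    h = 1.0 / n
    i, j = min(int(x / h), n - 1), min(int(y / h), n - 1)
    # local coordinates of (x, y) inside the cell
    s, t = (x - i * h) / h, (y - j * h) / h
    # values of f at the four cell corners
    f00, f10 = (i * h) * (j * h), ((i + 1) * h) * (j * h)
    f01, f11 = (i * h) * ((j + 1) * h), ((i + 1) * h) * ((j + 1) * h)
    if s >= t:   # lower triangle with vertices 00, 10, 11
        return f00 + s * (f10 - f00) + t * (f11 - f10)
    else:        # upper triangle with vertices 00, 01, 11
        return f00 + t * (f01 - f00) + s * (f11 - f01)

# crude sampled estimate of the maximum interpolation error
pts = np.linspace(0.0, 1.0, 201)
err = max(abs(interp_xy_on_grid(x, y) - x * y) for x in pts for y in pts)
print(f"max sampled error with n=4: {err:.4f}")   # shrinks as n grows
```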

Convergence properties of an Objective-Function-Free Optimization regularization algorithm, including an $\mathcal{O}(\epsilon^{-3/2})$ complexity bound

An adaptive regularization algorithm for unconstrained nonconvex optimization is presented in which the objective function is never evaluated, but only derivatives are used. This algorithm belongs to the class of adaptive regularization methods, for which optimal worst-case complexity results are known for the standard framework where the objective function is evaluated. It is shown in … Read more
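
As a rough illustration of an adaptive-regularization iteration that only touches derivatives, here is a toy sketch: the cubic-model subproblem is solved by a naive fixed-point iteration, and the update of the regularization parameter sigma is a simple placeholder, not the mechanism analyzed in the paper.

```python
import numpy as np

def arc_step(grad, hess, sigma, iters=20):
    """Approximately minimize the cubic model
    m(s) = g's + 0.5 s'Hs + (sigma/3)||s||^3
    via the fixed point (H + sigma*||s||*I) s = -g.  Illustrative sketch."""
    n = grad.size
    s = np.zeros(n)
    for _ in range(iters):
        s = np.linalg.solve(hess + sigma * np.linalg.norm(s) * np.eye(n), -grad)
    return s

# toy objective f(x) = 0.5 x'Ax - b'x, accessed only through its derivatives
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x, sigma = np.zeros(2), 1.0
for k in range(15):
    g, H = A @ x - b, A                 # derivatives only; f(x) is never formed
    x = x + arc_step(g, H, sigma)
    sigma = max(0.5 * sigma, 1e-3)      # placeholder update, not the paper's rule
print("final gradient norm:", np.linalg.norm(A @ x - b))
```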

Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold

Riemannian optimization has drawn a lot of attention due to its wide applications in practice. Riemannian stochastic first-order algorithms have been studied in the literature to solve large-scale machine learning problems over Riemannian manifolds. However, most of the existing Riemannian stochastic algorithms require the objective function to be differentiable, and they do not apply to … Read more
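
For context, here is a minimal sketch of a plain Riemannian stochastic gradient step on the Stiefel manifold (tangent-space projection followed by a QR retraction) on a toy stochastic PCA-type loss; the proximal handling of the nonsmooth term, which is the paper's contribution, is deliberately omitted.

```python
import numpy as np

def proj_tangent(X, G):
    """Project a Euclidean gradient G onto the tangent space of the Stiefel
    manifold {X : X'X = I} at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

def qr_retraction(X, V):
    """Retract X + V back onto the Stiefel manifold via a QR factorization."""
    Q, R = np.linalg.qr(X + V)
    return Q * np.sign(np.diag(R))        # fix the sign ambiguity of QR

# toy smooth loss: f(X) = -E[ ||a'X||^2 ], one random sample per iteration
rng = np.random.default_rng(0)
n, p, step = 20, 3, 0.1
X = np.linalg.qr(rng.standard_normal((n, p)))[0]
for k in range(200):
    a = rng.standard_normal(n)
    euclid_grad = -2.0 * np.outer(a, a @ X)
    X = qr_retraction(X, -step * proj_tangent(X, euclid_grad))
print("feasibility ||X'X - I||:", np.linalg.norm(X.T @ X - np.eye(p)))
```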

The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks

Neural networks tend to achieve better accuracy with training if they are larger, even if the resulting models are overparameterized. Nevertheless, carefully removing such excess parameters before, during, or after training may also produce models with similar or even improved accuracy. In many cases, this can, curiously, be achieved by heuristics as simple as … Read more
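
One of the simple heuristics alluded to above is global magnitude pruning; a minimal sketch (not the combinatorial criterion proposed in the paper):

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of entries across all layers.
    A simple baseline heuristic, not the paper's combinatorial criterion."""
    flat = np.concatenate([w.ravel() for w in weights])
    threshold = np.quantile(np.abs(flat), sparsity)
    return [np.where(np.abs(w) < threshold, 0.0, w) for w in weights]

layers = [np.random.randn(64, 32), np.random.randn(32, 10)]
pruned = magnitude_prune(layers, sparsity=0.7)
kept = sum((w != 0).sum() for w in pruned) / sum(w.size for w in layers)
print(f"fraction of weights kept: {kept:.2f}")
```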

Limits of eventual families of sets with application to algorithms for the common fixed point problem

We present an abstract framework for asymptotic analysis of convergence based on the notions of eventual families of sets that we define. A family of subsets of a given set is called here an “eventual family” if it is upper hereditary with respect to inclusion. We define accumulation points of eventual families in a Hausdorff … Read more
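
To make the quoted definition concrete (the notation here is ours, not necessarily the paper's): a family $\mathcal{F} \subseteq 2^{X}$ of subsets of a set $X$ is an eventual family if it is upper hereditary with respect to inclusion, that is,

$$ A \in \mathcal{F} \ \text{ and } \ A \subseteq B \subseteq X \;\Longrightarrow\; B \in \mathcal{F}. $$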

Shortest Path Network Interdiction with Asymmetric Uncertainty

This paper considers an extension of the shortest path network interdiction problem that incorporates robustness to account for parameter uncertainty. The shortest path interdiction problem is a game of two players with conflicting agendas and capabilities: an evader, who traverses the arcs of a network from a source node to a sink node using the … Read more
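
To fix the two-player setting, here is a minimal sketch of the basic deterministic interdiction game, without the uncertainty treated in the paper: the interdictor delays a budgeted set of arcs and the evader then takes a shortest path. The instance, the delay model, and the brute-force enumeration are illustrative only.

```python
import heapq
from itertools import combinations

def shortest_path_length(arcs, lengths, source, sink):
    """Dijkstra on a directed graph given as a list of arcs with nonnegative lengths."""
    adj = {}
    for (u, v), c in zip(arcs, lengths):
        adj.setdefault(u, []).append((v, c))
    dist, heap = {source: 0.0}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, c in adj.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                heapq.heappush(heap, (d + c, v))
    return dist.get(sink, float("inf"))

# toy instance: the interdictor delays at most `budget` arcs, the evader
# then travels on a shortest s-t path; brute force over interdiction plans
arcs   = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
length = [1.0, 2.0, 2.0, 1.0, 0.5]
delay, budget = 3.0, 1
best = max(
    (shortest_path_length(arcs,
                          [c + delay * (i in plan) for i, c in enumerate(length)],
                          "s", "t"), plan)
    for k in range(budget + 1)
    for plan in combinations(range(len(arcs)), k)
)
print("interdictor's best plan:", best[1], "evader's path length:", best[0])
```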

OFFO minimization algorithms for second-order optimality and their complexity

An Adagrad-inspired class of algorithms for smooth unconstrained optimization is presented in which the objective function is never evaluated and yet the gradient norms decrease at least as fast as $\mathcal{O}(1/\sqrt{k+1})$ while second-order optimality measures converge to zero at least as fast as $\mathcal{O}(1/(k+1)^{1/3})$. This latter rate of convergence is shown to be essentially sharp … Read more
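
For reference, here is a sketch of the deterministic Adagrad-style update that this class generalizes, in which only gradients are ever used; the second-order mechanisms behind the $\mathcal{O}(1/(k+1)^{1/3})$ rate are in the paper, not in this snippet.

```python
import numpy as np

def adagrad_offo(grad, x0, steps=500, eta=1.0, eps=1e-8):
    """Deterministic Adagrad-style iteration: only gradients are used, the
    objective value is never computed.  A generic sketch, not the specific
    second-order OFFO class analyzed in the paper."""
    x = x0.astype(float).copy()
    accum = np.full_like(x, eps)
    for k in range(steps):
        g = grad(x)
        accum += g * g                       # running sum of squared gradients
        x -= eta * g / np.sqrt(accum)        # componentwise scaled step
    return x

# toy quadratic f(x) = 0.5 x'Dx, accessed only through its gradient Dx
D = np.array([10.0, 1.0])
x = adagrad_offo(lambda x: D * x, np.array([1.0, -2.0]))
print("final gradient norm:", np.linalg.norm(D * x))
```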

Branch-and-Bound Performance Estimation Programming: A Unified Methodology for Constructing Optimal Optimization Methods

We present the Branch-and-Bound Performance Estimation Programming (BnB-PEP), a unified methodology for constructing optimal first-order methods for convex and nonconvex optimization. BnB-PEP poses the problem of finding the optimal optimization method as a nonconvex but practically tractable quadratically constrained quadratic optimization problem and solves it to certifiable global optimality using a customized branch-and-bound algorithm. By … Read more
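
For orientation, the problem BnB-PEP globalizes can be written schematically (our notation, simplified) as a min-max: choose the stepsize coefficients $h_{k,j}$ of an $N$-step first-order method so as to minimize its worst-case performance over a function class $\mathcal{F}$,

$$ \min_{h}\ \max_{f \in \mathcal{F},\, x_0} \Big\{\, \mathcal{E}(x_N) \ :\ x_{k+1} = x_k - \textstyle\sum_{j=0}^{k} h_{k,j}\, \nabla f(x_j),\ \ k = 0,\dots,N-1 \Big\}, $$

where $\mathcal{E}$ is a performance measure such as $f(x_N) - f_\star$; the inner maximization is the performance-estimation problem, and the outer minimization over $h$ is the nonconvex part handled by the customized branch-and-bound procedure.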

Parametric complexity analysis for a class of first-order Adagrad-like algorithms

A class of algorithms for optimization in the presence of noise is presented that does not require the evaluation of the objective function. This class generalizes the well-known Adagrad method. The complexity of this class is then analyzed as a function of its parameters, and it is shown that some methods of the class enjoy … Read more
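
As an illustration of what "parameters of the class" can mean, here is one common Adagrad-like parametrization (illustrative, not necessarily the paper's exact class): for a fixed $\varsigma > 0$ and an exponent $\mu \in (0,1]$, the $i$-th component is updated by

$$ x_{k+1}^{(i)} \;=\; x_k^{(i)} \;-\; \frac{g_k^{(i)}}{\big(\varsigma + \sum_{j=0}^{k} (g_j^{(i)})^2\big)^{\mu}}, $$

where $g_j$ denotes the (possibly noisy) gradient at $x_j$; taking $\mu = \tfrac{1}{2}$ recovers the standard Adagrad scaling, and the attainable complexity then depends on the choice of $\mu$ and $\varsigma$.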

First-Order Objective-Function-Free Optimization Algorithms and Their Complexity

A class of algorithms for unconstrained nonconvex optimization is considered where the value of the objective function is never computed. The class contains a deterministic version of the first-order Adagrad method typically used for the minimization of noisy functions, but also allows the use of second-order information when available. The rate of convergence of methods in … Read more