Block cubic Newton with greedy selection

A second-order block coordinate descent method is proposed for the unconstrained minimization of an objective function with Lipschitz continuous Hessian. At each iteration, a block of variables is selected by means of a greedy (Gauss-Southwell) rule that measures its first-order stationarity violation, and then an approximate minimizer of a cubic model is computed … Read more
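A minimal sketch of the idea, assuming a toy quartic objective: the block whose gradient has the largest norm (its first-order stationarity violation) is selected greedily, and a cubic model restricted to that block is minimized approximately. The block partition, the fixed regularization constant M, and the generic inner solver are illustrative assumptions, not the paper's algorithm.

```python
# Greedy (Gauss-Southwell) block cubic Newton sketch; the constant M stands
# in for an estimate of the Hessian's Lipschitz constant.
import numpy as np
from scipy.optimize import minimize

def greedy_block_cubic_newton(grad, hess, x0, blocks, M=10.0, tol=1e-6, max_iter=100):
    x = x0.copy()
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g, np.inf) < tol:
            break
        # Greedy rule: pick the block with the largest gradient norm
        # (its first-order stationarity violation).
        b = max(blocks, key=lambda idx: np.linalg.norm(g[idx]))
        gb, H = g[b], hess(x)[np.ix_(b, b)]
        # Cubic model over the selected block:
        # m(s) = g_b^T s + 0.5 s^T H s + (M/6) ||s||^3
        model = lambda s: gb @ s + 0.5 * s @ H @ s + (M / 6) * np.linalg.norm(s) ** 3
        s = minimize(model, np.zeros(len(b))).x  # approximate model minimizer
        x[b] += s
    return x

# Example: a convex quartic in 6 variables, split into two blocks of 3.
A = np.diag(np.arange(1.0, 7.0))
grad = lambda x: A @ x + x ** 3
hess = lambda x: A + np.diag(3 * x ** 2)
x = greedy_block_cubic_newton(grad, hess, np.ones(6), [list(range(3)), list(range(3, 6))])
print(np.linalg.norm(grad(x)))  # should be close to zero
```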

An Adaptive Proximal ADMM for Nonconvex Linearly-Constrained Composite Programs

This paper develops an adaptive Proximal Alternating Direction Method of Multipliers (P-ADMM) for solving linearly-constrained, weakly convex, composite optimization problems. This method is adaptive to all problem parameters, including smoothness and weak convexity constants. It is assumed that the smooth component of the objective is weakly convex and possibly nonseparable, while the nonsmooth component is … Read more
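As a rough illustration of the template (not the paper's method), the sketch below runs a linearized proximal iteration for $\min_x f(x) + h(x)$ s.t. $Ax = b$, with fixed stepsize and penalty constants in place of the paper's adaptive rules, and an $\ell_1$ term standing in for the nonsmooth component:

```python
# Linearized proximal-ADMM-style sketch with fixed constants (lam, beta);
# the paper's adaptivity to smoothness/weak-convexity constants is omitted.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def linearized_prox_admm(grad_f, prox_h, A, b, x0, lam, beta=1.0, iters=500):
    x, p = x0.copy(), np.zeros(A.shape[0])
    for _ in range(iters):
        # Linearize f and the augmented-Lagrangian penalty at x, then take
        # a proximal step on the nonsmooth term h.
        v = x - (grad_f(x) + A.T @ (p + beta * (A @ x - b))) / lam
        x = prox_h(v, 1.0 / lam)
        p = p + beta * (A @ x - b)  # multiplier (dual) update
    return x

rng = np.random.default_rng(0)
C, d = rng.standard_normal((30, 10)), rng.standard_normal(30)
A = rng.standard_normal((3, 10))
b = A @ rng.standard_normal(10)
mu = 0.1
grad_f = lambda x: C.T @ (C @ x - d)              # f(x) = 0.5*||Cx - d||^2
prox_h = lambda v, t: soft_threshold(v, mu * t)   # h(x) = mu*||x||_1
lam = np.linalg.norm(C, 2) ** 2 + np.linalg.norm(A, 2) ** 2  # crude curvature bound
x = linearized_prox_admm(grad_f, prox_h, A, b, np.zeros(10), lam)
print(np.linalg.norm(A @ x - b))  # feasibility residual
```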

Black-box Optimization Algorithms for Regularized Least-squares Problems

We consider the problem of optimizing the sum of a smooth, nonconvex function for which derivatives are unavailable, and a convex, nonsmooth function with an easy-to-evaluate proximal operator. Of particular focus is the case where the smooth part has a nonlinear least-squares structure. We adapt two existing approaches for derivative-free optimization of nonsmooth compositions of smooth … Read more
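The paper adapts model-based derivative-free methods; as a much simpler stand-in for the same problem class $\min_x \tfrac{1}{2}\|r(x)\|^2 + \mu\|x\|_1$, the sketch below combines forward-difference gradient estimates of the smooth least-squares part with the cheap $\ell_1$ proximal step:

```python
# Derivative-free proximal-gradient baseline: finite differences replace the
# unavailable derivatives of the smooth part, and the l1 prox handles the
# nonsmooth part. This is only a toy analogue of the paper's methods.
import numpy as np

def fd_grad(f, x, h=1e-6):
    g, fx = np.zeros_like(x), f(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - fx) / h   # forward-difference estimate
    return g

def prox_l1(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def dfo_prox_gradient(r, x0, mu=0.1, step=0.1, iters=200):
    f = lambda x: 0.5 * np.dot(r(x), r(x))   # smooth nonlinear least-squares part
    x = x0.copy()
    for _ in range(iters):
        x = prox_l1(x - step * fd_grad(f, x), step * mu)
    return x

# Example: a small nonlinear residual whose minimizer is sparse.
r = lambda x: np.array([x[0] ** 2 + x[1] - 1.0, x[0] - 0.5, x[2]])
print(dfo_prox_gradient(r, np.ones(3)))
```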

On the strength of Burer’s lifted convex relaxation to quadratic programming with ball constraints

We study quadratic programs with $m$ ball constraints and the strength of a lifted convex relaxation for them recently proposed by Burer (2024). Burer shows this relaxation is exact when $m=2$. For general $m$, Burer (2024) provides numerical evidence that this lifted relaxation is tighter than the Kronecker-product-based Reformulation-Linearization Technique (RLT) inequalities … Read more
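Schematically, in generic notation that is an assumption rather than Burer's exact formulation, the problem and a standard lifting read
\[
\min_{x \in \mathbb{R}^n} \; x^\top Q x + 2 c^\top x
\quad \text{s.t.} \quad \|x - a_i\|^2 \le r_i^2, \quad i = 1, \dots, m.
\]
Introducing $X = xx^\top$ and relaxing this equality to
\[
\begin{pmatrix} 1 & x^\top \\ x & X \end{pmatrix} \succeq 0,
\qquad
\mathrm{tr}(X) - 2\, a_i^\top x + \|a_i\|^2 \le r_i^2, \quad i = 1, \dots, m,
\]
makes the ball constraints linear in $(x, X)$; multiplying pairs of ball constraints and linearizing the products gives the Kronecker-type RLT inequalities that the lifted relaxation is compared against.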

Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks

In this work, we instantiate a regularized form of the gradient clipping algorithm and prove that it can converge to the global minima of deep neural network loss functions provided the network is sufficiently wide. We present empirical evidence that our theoretically grounded regularized gradient clipping algorithm is also competitive with the state-of-the-art … Read more
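One plausible form of such a regularized update (an illustrative assumption, not necessarily the paper's exact rule): plain clipping scales the step by $\min(1, \gamma/\|g\|)$, and the regularization lower-bounds that scale by some $\delta > 0$ so the step never vanishes on large gradients:

```python
# Regularized gradient-clipping step; the max(..., delta) lower bound is the
# illustrative "regularization" here.
import numpy as np

def regularized_clip_step(x, g, eta=0.1, gamma=1.0, delta=1e-2):
    scale = max(min(1.0, gamma / (np.linalg.norm(g) + 1e-12)), delta)
    return x - eta * scale * g

# Example: a few hundred steps on f(x) = 0.5*||x||^2, whose gradient is x.
x = np.array([10.0, -4.0])
for _ in range(200):
    x = regularized_clip_step(x, x)
print(x)  # converges toward the global minimum at the origin
```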

Predictive Low Rank Matrix Learning under Partial Observations: Mixed-Projection ADMM

We study the problem of learning a partially observed matrix under a low-rank assumption, in the presence of fully observed side information that depends linearly on the true underlying matrix. This problem is an important generalization of the Matrix Completion problem, a central problem in Statistics, Operations Research and Machine Learning, that … Read more
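A toy analogue of the ADMM structure, assuming plain matrix completion with the side information omitted: one subproblem fits the observed entries in closed form, the other enforces the rank constraint by projection via a truncated SVD (a simple stand-in for the paper's mixed-projection machinery):

```python
# ADMM-style low-rank completion sketch: least-squares fit on observed
# entries alternated with projection onto the rank-k set.
import numpy as np

def lowrank_completion_admm(M, mask, k, rho=1.0, iters=100):
    X = np.zeros_like(M); Z = np.zeros_like(M); U = np.zeros_like(M)
    for _ in range(iters):
        # X-update: entrywise closed form for
        # min_X ||mask*(X - M)||_F^2 + (rho/2)*||X - Z + U||_F^2
        X = (mask * M + (rho / 2) * (Z - U)) / (mask + rho / 2)
        # Z-update: projection onto matrices of rank <= k (truncated SVD).
        W, s, Vt = np.linalg.svd(X + U, full_matrices=False)
        Z = (W[:, :k] * s[:k]) @ Vt[:k]
        U += X - Z   # scaled dual update
    return Z

rng = np.random.default_rng(1)
truth = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))  # rank 3
mask = (rng.random(truth.shape) < 0.5).astype(float)                 # ~50% observed
Z = lowrank_completion_admm(mask * truth, mask, k=3)
print(np.linalg.norm(Z - truth) / np.linalg.norm(truth))  # relative recovery error
```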

Complexity results and active-set identification of a derivative-free method for bound-constrained problems

In this paper, we analyze a derivative-free line-search method designed for bound-constrained problems. Our analysis demonstrates that this method has a worst-case complexity comparable to that of other derivative-free methods for unconstrained and linearly constrained problems. In particular, when minimizing a function of $n$ variables, we prove that at most ${\cal O}(n\epsilon^{-2})$ iterations are needed to … Read more
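A minimal sketch of a derivative-free coordinate line search over a box: each coordinate is probed in both directions with a sufficient-decrease test, trial points are projected onto the bounds, and stepsizes expand on success and shrink on failure. The paper's method and its ${\cal O}(n\epsilon^{-2})$ analysis are more refined; this loop only mirrors the idea (note how the solution below lands on active bounds):

```python
# Derivative-free line search with bound constraints; the sufficient-decrease
# test is f(y) <= f(x) - gamma * alpha^2.
import numpy as np

def df_linesearch_box(f, x0, lo, up, alpha0=1.0, gamma=1e-4, iters=100):
    x = np.clip(x0, lo, up)
    fx = f(x)
    alpha = np.full(len(x), alpha0)
    for _ in range(iters):
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                y = x.copy()
                y[i] = np.clip(x[i] + sign * alpha[i], lo[i], up[i])
                fy = f(y)
                if fy <= fx - gamma * alpha[i] ** 2:
                    x, fx = y, fy
                    alpha[i] *= 2.0   # expand the stepsize on success
                    break
            else:
                alpha[i] *= 0.5       # shrink if neither direction decreases f
    return x

# Unconstrained minimizer is (2, -1); the box forces x[0] <= 1, x[1] >= -0.5.
f = lambda x: (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2
print(df_linesearch_box(f, np.zeros(2), lo=np.array([-1.0, -0.5]),
                        up=np.array([1.0, 3.0])))   # -> approx (1.0, -0.5)
```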

Routing a fleet of unmanned aerial vehicles: a trajectory optimisation-based framework

We consider an aerial survey operation in which a fleet of unmanned aerial vehicles (UAVs) is required to visit several locations and then land at one of the available landing sites, while optimising some performance criteria subject to operational constraints and flight dynamics. We aim to minimise the maximum flight time of the UAVs. To … Read more
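In schematic form, with placeholder symbols rather than the paper's notation, the min-max objective couples a routing layer with per-vehicle trajectory optimisation:
\[
\min_{\text{routes},\,\text{trajectories}} \; \max_{k=1,\dots,K} T_k
\quad \text{s.t.} \quad
\dot{z}_k(t) = \phi\bigl(z_k(t), u_k(t)\bigr), \; t \in [0, T_k], \; k = 1,\dots,K,
\]
where $T_k$ is the flight time of UAV $k$, $z_k$ and $u_k$ are its state and control trajectories, and combinatorial constraints (each survey location visited exactly once, each UAV terminating at a landing site) link the two layers.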

A Stochastic Objective-Function-Free Adaptive Regularization Method with Optimal Complexity

A fully stochastic second-order adaptive-regularization method for unconstrained nonconvex optimization is presented which never computes the objective-function value, yet achieves the optimal $\mathcal{O}(\epsilon^{-3/2})$ complexity bound for finding first-order critical points. The method is noise-tolerant, and the inexactness conditions required for convergence depend on the history of past steps. Applications to cases where derivative … Read more
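A minimal sketch of an objective-function-free cubic-regularization iteration: the model $g^\top s + \tfrac{1}{2} s^\top H s + \tfrac{\sigma}{3}\|s\|^3$ is minimized approximately, and $\sigma$ is driven by the history of step norms rather than by function values, which are never evaluated. The specific $\sigma$ update below is an illustrative stand-in, not the paper's rule:

```python
# Objective-function-free adaptive cubic regularization sketch: no f(x)
# evaluations; sigma grows with the accumulated squared step norms.
import numpy as np
from scipy.optimize import minimize

def offo_ar2(grad, hess, x0, sigma0=1.0, iters=50, tol=1e-8):
    x, sigma, hist = x0.copy(), sigma0, 0.0
    for _ in range(iters):
        g, H = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        model = lambda s: g @ s + 0.5 * s @ H @ s + (sigma / 3) * np.linalg.norm(s) ** 3
        s = minimize(model, np.zeros_like(x)).x   # approximate model minimizer
        x = x + s
        hist += np.linalg.norm(s) ** 2
        sigma = sigma0 * np.sqrt(1.0 + hist)      # history-based regularization
    return x

# Example: f(x) = (x0^2 - 1)^2 + x1^2, supplied only through its derivatives.
grad = lambda x: np.array([4 * x[0] ** 3 - 4 * x[0], 2 * x[1]])
hess = lambda x: np.array([[12 * x[0] ** 2 - 4, 0.0], [0.0, 2.0]])
print(offo_ar2(grad, hess, np.array([2.0, 1.0])))  # -> near a critical point (±1, 0)
```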