Exploiting Prior Function Evaluations in Derivative-Free Optimization

A derivative-free optimization (DFO) algorithm is presented. The distinguishing feature of the algorithm is that it can exploit function values made available through prior runs of a DFO algorithm on related optimization problems. Applications in which sequences of related optimization problems are solved such that the proposed … Read more

A New Face Algorithm Using LU Factorization for Linear Programming

The unique feature of the face algorithm \cite{pan14} is that it moves from face to face, rather than from vertex to vertex as the simplex algorithm does. It uses the orthogonal projection of the negative objective gradient onto the related null space as its search direction. Nevertheless, the algorithm is based on QR factorization, which would … Read more
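
For orientation, and in my own notation rather than the paper's: for an LP in standard form $\min\{c^{\top}x : Ax = b,\ x \ge 0\}$, if $A_F$ denotes the constraint matrix defining the current face, this search direction is $d = -\bigl(I - A_F^{\top}(A_F A_F^{\top})^{-1}A_F\bigr)c$, the orthogonal projection of $-c$ onto $\mathcal{N}(A_F)$; computing it amounts to solving the normal system $(A_F A_F^{\top})\,y = A_F c$, which is where the matrix factorization (QR in the original algorithm, LU in the one proposed here) comes in.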

A New Face Method for Linear Programming

An attractive feature of the face method \cite{pan14} for solving LP problems is that it uses the orthogonal projection of the negative objective gradient onto the related null space as the search direction. However, the method is not well suited to large sparse problems, since it handles the normal system involved by orthogonal transformations. … Read more

A derivative-free Gauss-Newton method

We present DFO-GN, a derivative-free version of the Gauss-Newton method for solving nonlinear least-squares problems. As is common in derivative-free optimization, DFO-GN uses interpolation of function values to build a model of the objective, which is then used within a trust-region framework to give a globally convergent algorithm requiring $O(\epsilon^{-2})$ iterations to reach approximate first-order criticality … Read more
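
For orientation, in my notation rather than the paper's: for $\min_x f(x) = \tfrac{1}{2}\|r(x)\|_2^2$, a derivative-free Gauss-Newton method of this kind replaces the Jacobian with a matrix $J_k$ recovered by interpolating residual values at a set of sample points, and each iteration solves the trust-region subproblem $\min_{\|s\| \le \Delta_k} \tfrac{1}{2}\|r(x_k) + J_k s\|_2^2$.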

A semi-analytical approach for the positive semidefinite Procrustes problem

The positive semidefinite Procrustes (PSDP) problem is the following: given rectangular matrices $X$ and $B$, find the symmetric positive semidefinite matrix $A$ that minimizes the Frobenius norm of $AX-B$. No general procedure is known that gives an exact solution. In this paper, we present a semi-analytical approach to solve the PSDP problem. First, we characterize … Read more
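
In symbols, the problem just stated is $\min_{A = A^{\top},\, A \succeq 0} \|AX - B\|_F$.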

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Lojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Lojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Lojasiewicz (PL) … Read more
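
For reference, the PL condition is commonly stated as follows: a differentiable function $f$ with minimum value $f^*$ satisfies PL with constant $\mu > 0$ if $\tfrac{1}{2}\|\nabla f(x)\|^2 \ge \mu\,(f(x) - f^*)$ for all $x$. Under $L$-smoothness, gradient descent with step size $1/L$ then satisfies $f(x_k) - f^* \le (1 - \mu/L)^k\,(f(x_0) - f^*)$, the global linear rate the abstract refers to.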

Sparse optimization for inverse problems in atmospheric modelling

We consider inverse problems in atmospheric modelling. Instead of using ordinary least squares, we add a weighting matrix based on the topology of measurement points and show the connection with Bayesian modelling. Since the source–receptor sensitivity matrix is usually ill-conditioned, the problem is often regularized, either by perturbing the objective function or by modifying … Read more
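
A generic form of the weighted, regularized fit described above, with all symbols being my own placeholders rather than the paper's, is $\min_x \tfrac{1}{2}(Mx - y)^{\top} W (Mx - y) + \alpha\, R(x)$, where $M$ is the source–receptor sensitivity matrix, $y$ the measurement vector, $W$ the weighting matrix built from the topology of the measurement points, and $R$ a regularization term.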

A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem

We consider solving the $\ell_1$-regularized least-squares ($\ell_1$-LS) problem in the context of sparse recovery, for applications such as compressed sensing. The standard proximal gradient method, also known as iterative soft-thresholding when applied to this problem, has low computational cost per iteration but a rather slow convergence rate. Nevertheless, when the solution is sparse, it often … Read more
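
As a point of reference, here is a minimal sketch of the standard proximal gradient (iterative soft-thresholding) baseline the abstract mentions; it is not the paper's homotopy scheme, and the function and parameter names are my own.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, step=None, iters=500):
    """Proximal gradient (ISTA) for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    `step` should be at most 1 / ||A||_2^2 (the reciprocal of the Lipschitz
    constant of the smooth part); if omitted it is set from the spectral norm.
    """
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                          # gradient of the smooth term
        x = soft_threshold(x - step * grad, step * lam)   # proximal (shrinkage) step
    return x
```

Homotopy strategies for this problem typically run such iterations over a decreasing sequence of regularization parameters $\lambda$, warm-starting each stage from the previous solution; see the paper for the specific scheme and its analysis.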

On the evaluation complexity of cubic regularization methods for potentially rank-deficient nonlinear least-squares problems and its relevance to constrained nonlinear optimization

We propose a new termination criterion suitable for potentially singular, zero or non-zero residual, least-squares problems, with which cubic regularization variants take at most $\mathcal{O}(\epsilon^{-3/2})$ residual and Jacobian evaluations to drive either the Euclidean norm of the residual or its gradient below $\epsilon$; this is the best-known bound for potentially singular nonlinear least-squares problems. We then … Read more

Models and Algorithms for Distributionally Robust Least Squares Problems

We present different robust frameworks using probabilistic ambiguity descriptions of the input data in least squares problems. The three probability ambiguity descriptions are given by: (1) a confidence interval over the first two moments; (2) bounds on the probability measure with moment constraints; (3) a confidence interval over the probability measure using the Kantorovich probability … Read more
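
Schematically, and in placeholder notation rather than the paper's, such frameworks solve $\min_x \max_{P \in \mathcal{P}} \mathbb{E}_P\bigl[\|Ax - b\|_2^2\bigr]$, where the ambiguity set $\mathcal{P}$ of distributions for the uncertain data is specified by one of the three descriptions listed above.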