Balancing Communication and Computation in Gradient Tracking Algorithms for Decentralized Optimization

Gradient tracking methods have emerged as one of the most popular approaches for solving decentralized optimization problems over networks. In this setting, each node in the network has a portion of the global objective function, and the goal is to collectively optimize this function. At every iteration, gradient tracking methods perform two operations (steps): (1) … Read more

The Online Shortest Path Problem: Learning Travel Times Using A Multi-Armed Bandit Framework

In the age of e-commerce, many logistic companies must operate in large road networks without accurate knowledge of travel times for their specific fleet of vehicles. Moreover, millions of dollars are spent on routing services that do not accurately capture the specific characteristics of the companies’ drivers and the types of vehicles they must use. … Read more

Mixed-Integer Quadratic Optimization and Iterative Clustering Techniques for Semi-Supervised Support Vector Machines

Among the most famous algorithms for solving classification problems are support vector machines (SVMs), which find a separating hyperplane for a set of labeled data points. In some applications, however, labels are only available for a subset of points. Furthermore, this subset can be non-representative, e.g., due to self-selection in a survey. Semi-supervised SVMs tackle … Read more

On the Optimization Landscape of Burer-Monteiro Factorization: When do Global Solutions Correspond to Ground Truth?

In low-rank matrix recovery, the goal is to recover a low-rank matrix, given a limited number of linear and possibly noisy measurements. Low-rank matrix recovery is typically solved via a nonconvex method called Burer-Monteiro factorization (BM). If the rank of the ground truth is known, BM is free of sub-optimal local solutions, and its true solutions … Read more

Variable Selection for Kernel Two-Sample Tests

We consider the variable selection problem for two-sample tests, aiming to select the most informative features to best distinguish samples from two groups. We propose a kernel maximum mean discrepancy (MMD) framework to solve this problem and further derive its equivalent mixed-integer programming formulations for linear, quadratic, and Gaussian types of kernel functions. Our proposed … Read more

Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree

Article Download View Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree

Multilevel Objective-Function-Free Optimization with an Application to Neural Networks Training

A class of multi-level algorithms for unconstrained nonlinear optimization is presented which does not require the evaluation of the objective function. The class contains the momentum-less AdaGrad method as a particular (single-level) instance. The choice of avoiding the evaluation of the objective function is intended to make the algorithms of the class less sensitive to … Read more

A classification method based on a cloud of spheres

\(\) In this article we propose a binary classification model to distinguish a specific class that corresponds to a characteristic that we intend to identify (fraud, spam, disease). The classification model is based on a cloud of spheres that circumscribe the points of the class to be identified. It is intended to build a model … Read more

Analyzing Inexact Hypergradients for Bilevel Learning

Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters cannot be feasibly computed and approximate strategies are required. We introduce a unified framework for computing hypergradients that … Read more

A Levenberg-Marquardt Method for Nonsmooth Regularized Least Squares

\(\) We develop a Levenberg-Marquardt method for minimizing the sum of a smooth nonlinear least-squares term \(f(x) = \frac{1}{2} \|F(x)\|_2^2\) and a nonsmooth term \(h\). Both \(f\) and \(h\) may be nonconvex. Steps are computed by minimizing the sum of a regularized linear least-squares model and a model of \(h\) using a first-order method such … Read more