A Proximal-Gradient Method for Constrained Optimization

We present a new algorithm for solving optimization problems whose objective is the sum of a smooth function and a (potentially) nonsmooth regularization function, subject to nonlinear equality constraints. The algorithm may be viewed as an extension of the well-known proximal-gradient method, which is applicable when such constraints are not present. To account for nonlinear … Read more
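
For orientation, a minimal sketch of the classical unconstrained proximal-gradient iteration the abstract refers to, applied to an illustrative lasso-type problem. The quadratic f, soft-thresholding prox, and 1/L step size are assumptions for the example; the paper's setting additionally has nonlinear equality constraints.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(grad_f, prox_g, x0, step, n_iter=500):
    """Classical proximal-gradient iteration:
    x_{k+1} = prox_{step*g}(x_k - step * grad_f(x_k))."""
    x = x0
    for _ in range(n_iter):
        x = prox_g(x - step * grad_f(x), step)
    return x

# Illustrative composite problem: min 0.5*||Ax - b||^2 + lam*||x||_1
rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((20, 10)), rng.standard_normal(20), 0.1
grad_f = lambda x: A.T @ (A @ x - b)
step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1/L with L = ||A||_2^2
x_star = proximal_gradient(grad_f, lambda v, t: soft_threshold(v, lam * t),
                           np.zeros(10), step)
```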

Learning-to-Optimize with PAC-Bayesian Guarantees: Theoretical Considerations and Practical Implementation

We apply PAC-Bayesian theory to the setting of learning-to-optimize. To the best of our knowledge, we present the first framework to learn optimization algorithms with provable generalization guarantees (PAC-Bayesian bounds) and an explicit trade-off between convergence guarantees and convergence speed, which contrasts with the typical worst-case analysis. Our learned optimization algorithms provably outperform related ones … Read more
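
For context, the generic flavor of a PAC-Bayesian generalization bound (the Maurer/McAllester form, stated for losses in [0,1]); the paper's bounds for learned optimizers are specialized statements of this type, not this exact inequality.

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for all posteriors \rho over hypotheses h (here: learned
% optimizer parameters), given a data-independent prior \pi:
\mathbb{E}_{h \sim \rho}\big[L(h)\big]
  \;\le\; \mathbb{E}_{h \sim \rho}\big[\hat{L}_n(h)\big]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\!\big(2\sqrt{n}/\delta\big)}{2n}}
```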

Stochastic Aspects of Dynamical Low-Rank Approximation in the Context of Machine Learning

The central challenges of today’s neural network architectures are the prohibitive memory footprint and training costs associated with determining optimal weights and biases. A large portion of research in machine learning is therefore dedicated to constructing memory-efficient training methods. One promising approach is dynamical low-rank training (DLRT), which represents and trains parameters as a low-rank … Read more
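
As a rough illustration of why low-rank parameterizations save memory; this only shows the factorized storage and matrix-vector product, not DLRT's dynamical training of the factors or its rank adaptation.

```python
import numpy as np

class LowRankLinear:
    """Dense layer with weights stored as a rank-r factorization W ~= U @ V,
    so an m x n weight matrix costs r*(m + n) parameters instead of m*n."""
    def __init__(self, n_in, n_out, rank, seed=0):
        rng = np.random.default_rng(seed)
        self.U = rng.standard_normal((n_out, rank)) / np.sqrt(n_in)
        self.V = rng.standard_normal((rank, n_in)) / np.sqrt(n_in)
        self.b = np.zeros(n_out)

    def __call__(self, x):
        # Never materialize the full n_out x n_in weight matrix.
        return self.U @ (self.V @ x) + self.b

layer = LowRankLinear(n_in=1024, n_out=1024, rank=16)
y = layer(np.ones(1024))   # 2*16*1024 + 1024 params vs. 1024*1024
```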

Data Collaboration Analysis with Orthonormal Basis Selection and Alignment

Data Collaboration (DC) enables multiple parties to jointly train a model by sharing only linear projections of their private datasets. The core challenge in DC is to align the bases of these projections without revealing each party’s secret basis. While existing theory suggests that any target basis spanning the common subspace should suffice, in practice, … Read more
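
A hedged toy sketch of the alignment problem: parties reveal projections of shared anchor data under secret orthonormal bases, and an orthogonal Procrustes step aligns them to a common target. The anchor data, party setup, and Procrustes alignment are illustrative assumptions, not the paper's selection-and-alignment procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
anchor = rng.standard_normal((50, 30))        # shared anchor data (hypothetical)

def secret_basis(dim_in, dim_out):
    """Each party keeps a secret orthonormal basis (via QR) and shares only
    projections of its data and the anchor, never the basis itself."""
    F, _ = np.linalg.qr(rng.standard_normal((dim_in, dim_out)))
    return F

parties = [secret_basis(30, 10) for _ in range(3)]
anchor_proj = [anchor @ F for F in parties]   # what each party reveals

def procrustes(src, dst):
    """Orthogonal map G minimizing ||src @ G - dst||_F (SVD solution)."""
    U, _, Vt = np.linalg.svd(src.T @ dst)
    return U @ Vt

# Align every party onto party 0's projected anchor as the common target.
target = anchor_proj[0]
aligners = [procrustes(P, target) for P in anchor_proj]
aligned = [P @ G for P, G in zip(anchor_proj, aligners)]
```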

Robust support vector machines via conic optimization

We consider the problem of learning support vector machines robust to uncertainty. It has been established in the literature that typical loss functions, including the hinge loss, are sensitive to data perturbations and outliers, thus performing poorly in the setting considered. In contrast, using the 0-1 loss or a suitable non-convex approximation results in robust … Read more
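
A small numerical illustration of the sensitivity claim: the hinge loss grows without bound on a badly misclassified outlier, while the 0-1 loss counts it only once.

```python
import numpy as np

def hinge(margin):     # convex surrogate: unbounded on misclassified points
    return np.maximum(0.0, 1.0 - margin)

def zero_one(margin):  # robust but non-convex: flat penalty of 1
    return (margin <= 0).astype(float)

margins = np.array([2.0, 0.5, -0.5, -10.0])   # last entry: a gross outlier
print(hinge(margins))     # [ 0.   0.5  1.5 11. ] -> outlier dominates the fit
print(zero_one(margins))  # [ 0.   0.   1.   1. ] -> outlier counts once
```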

Data-Driven Reliable Facility Location Design

We study the reliable (uncapacitated) facility location (RFL) problem in a data-driven environment where historical observations of random demands and disruptions are available. Owing to the combinatorial nature of the RFL problem and the mixed-binary randomness of its parameters, the state-of-the-art RFL models applied to the data-driven setting either suggest overly conservative solutions, or … Read more
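
For orientation, a plain sample-average sketch of a reliable facility location model over historical (demand, disruption) scenarios, using PuLP with made-up numbers; the paper's data-driven model is different and aims precisely to improve on this kind of formulation.

```python
from itertools import product
from pulp import LpProblem, LpMinimize, LpVariable, lpSum

sites, custs = range(3), range(2)
fixed = [10, 12, 8]                 # facility opening costs (hypothetical)
dist = [[1, 4], [2, 1], [3, 3]]     # dist[j][i]: site j to customer i
scen = [                            # historical (demands, site availability)
    ([5, 3], [1, 1, 0]),
    ([4, 6], [1, 0, 1]),
    ([6, 2], [0, 1, 1]),
]
penalty = 20                        # cost per unit of unmet demand

m = LpProblem("data_driven_RFL", LpMinimize)
y = [LpVariable(f"y{j}", cat="Binary") for j in sites]
x = {(s, i, j): LpVariable(f"x{s}_{i}_{j}", lowBound=0)
     for s, i, j in product(range(len(scen)), custs, sites)}
u = {(s, i): LpVariable(f"u{s}_{i}", lowBound=0)   # unmet demand
     for s, i in product(range(len(scen)), custs)}

# Fixed cost + average transport cost + average shortfall penalty.
m += (lpSum(fixed[j] * y[j] for j in sites)
      + (1 / len(scen)) * lpSum(dist[j][i] * x[s, i, j] for s, i, j in x)
      + (1 / len(scen)) * lpSum(penalty * u[s, i] for s, i in u))
for s, (dem, avail) in enumerate(scen):
    for i in custs:
        m += lpSum(x[s, i, j] for j in sites) + u[s, i] == dem[i]
        for j in sites:   # serve only from open, non-disrupted sites
            m += x[s, i, j] <= dem[i] * avail[j] * y[j]
m.solve()
```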

Mixed-Integer Linear Optimization for Semi-Supervised Optimal Classification Trees

Decision trees are one of the most popular methods for solving classification problems, largely because of their good interpretability properties. Moreover, due to recent advances in mixed-integer optimization, several models have been proposed to formulate the problem of computing optimal classification trees. The goal is, given a set of labeled points, to split … Read more
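
To make the "optimal tree" objective concrete, a brute-force search for the optimal depth-1 split (a stump); MIO formulations are what make this exact search tractable for deeper trees and, in this paper, for partly unlabeled data.

```python
import numpy as np

def optimal_stump(X, y):
    """Brute-force the depth-1 'optimal classification tree': try every
    feature/threshold split and pick the one minimizing misclassifications,
    with each leaf predicting its majority label."""
    best = (np.inf, None, None)        # (errors, feature, threshold)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            errs = sum(min(np.sum(leaf == 0), np.sum(leaf == 1))
                       for leaf in (left, right) if leaf.size)
            if errs < best[0]:
                best = (errs, j, t)
    return best

X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.array([0, 0, 1, 1])
print(optimal_stump(X, y))   # 0 errors at feature 0, threshold 0.4
```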

Block Majorization Minimization with Extrapolation and Application to $\beta$-NMF

We propose a Block Majorization Minimization method with Extrapolation (BMMe) for solving a class of multi-convex optimization problems. The extrapolation parameters of BMMe are updated using a novel adaptive update rule. By showing that block majorization minimization can be reformulated as a block mirror descent method, with the Bregman divergence adaptively updated at each iteration, … Read more
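
A minimal sketch of multiplicative updates for the Frobenius-norm case (beta = 2) of beta-NMF with a fixed extrapolation step; BMMe's contribution is an adaptive rule for the extrapolation parameters, which this toy version deliberately does not implement.

```python
import numpy as np

def mu_nmf_extrapolated(V, r, n_iter=200, beta_extra=0.5, eps=1e-10):
    """Alternating multiplicative updates for min ||V - WH||_F^2 with
    W, H >= 0, applied at extrapolated (momentum) points."""
    rng = np.random.default_rng(0)
    m, n = V.shape
    W, H = rng.random((m, r)), rng.random((r, n))
    W_prev, H_prev = W.copy(), H.copy()
    for _ in range(n_iter):
        # Extrapolated points, projected back onto the nonnegative orthant.
        We = np.maximum(W + beta_extra * (W - W_prev), eps)
        He = np.maximum(H + beta_extra * (H - H_prev), eps)
        W_prev, H_prev = W, H
        H = He * (We.T @ V) / (We.T @ We @ He + eps)
        W = We * (V @ H.T) / (We @ H @ H.T + eps)
    return W, H

V = np.abs(np.random.default_rng(1).random((30, 20)))
W, H = mu_nmf_extrapolated(V, r=5)
print(np.linalg.norm(V - W @ H))
```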

Optimal counterfactual explanations for k-Nearest Neighbors using Mathematical Optimization and Constraint Programming

Within the topic of explainable AI, counterfactual explanations for classifiers have received significant recent attention. We study counterfactual explanations that try to explain why a data point received an undesirable classification by providing the closest data point that would have received a desirable one. Within the context of one of the simplest and most popular classification … Read more
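
A heuristic stand-in to fix ideas (not the paper's method, which computes provably closest counterfactuals via mathematical optimization and constraint programming): walk from the query toward the nearest training point of the desired class and stop as soon as the k-NN prediction flips.

```python
import numpy as np

def knn_predict(X, y, q, k=3):
    """Plain k-NN vote for query q over training data (X, y)."""
    idx = np.argsort(np.linalg.norm(X - q, axis=1))[:k]
    return np.bincount(y[idx]).argmax()

def counterfactual_line_search(X, y, q, desired, k=3, steps=100):
    """Return the first point on the segment from q to the nearest
    desired-class training point that k-NN classifies as desired."""
    cands = X[y == desired]
    target = cands[np.argmin(np.linalg.norm(cands - q, axis=1))]
    for t in np.linspace(0.0, 1.0, steps):
        p = (1 - t) * q + t * target
        if knn_predict(X, y, p, k) == desired:
            return p
    return target

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
q = np.array([0.5, 0.5])             # classified 0; we want class 1
print(counterfactual_line_search(X, y, q, desired=1))
```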

It’s All in the Mix: Wasserstein Classification and Regression with Mixed Features

Problem definition: A key challenge in supervised learning is data scarcity, which can cause prediction models to overfit to the training data and perform poorly out of sample. A contemporary approach to combat overfitting is offered by distributionally robust problem formulations that consider all data-generating distributions close to the empirical distribution derived from historical samples, … Read more
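
For reference, the generic Wasserstein distributionally robust formulation the abstract alludes to; handling mixed discrete/continuous features in the ambiguity set is the paper's focus.

```latex
% Minimize the worst-case expected loss over all distributions Q within
% Wasserstein distance \varepsilon of the empirical distribution \hat{P}_n:
\min_{\theta} \; \sup_{Q \,:\, W(Q, \hat{P}_n) \le \varepsilon}
  \; \mathbb{E}_{\xi \sim Q}\big[\ell(\theta; \xi)\big]
```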