Single-neuron convexifications for binarized neural networks

Binarized neural networks are an important class of neural network in deep learning due to their computational efficiency. This paper contributes towards a better understanding of the structure of binarized neural networks, specifically, ideal convex representations of the activation functions used. We describe the convex hull of the graph of the signum activation function associated … Read more

Sums of Separable and Quadratic Polynomials

We study separable plus quadratic (SPQ) polynomials, i.e., polynomials that are the sum of univariate polynomials in different variables and a quadratic polynomial. Motivated by the fact that nonnegative separable and nonnegative quadratic polynomials are sums of squares, we study whether nonnegative SPQ polynomials are (i) the sum of a nonnegative separable and a nonnegative … Read more

A Unifying Framework for Sparsity Constrained Optimization

In this paper, we consider the optimization problem of minimizing a continuously differentiable function subject to both convex constraints and sparsity constraints. By exploiting a mixed-integer reformulation from the literature, we define a necessary optimality condition based on a tailored neighborhood that allows to take into account potential changes of the support set. We then … Read more

Branch-and-bound Algorithm for Optimal Sparse Canonical Correlation Analysis

Canonical correlation analysis (CCA) is a family of multivariate statistical methods for extracting mutual information contained in multiple datasets. To improve the interpretability of CCA, here we focus on the mixed-integer optimization (MIO) approach to sparse estimation. This approach was first proposed for sparse linear regression in the 1970s, but it has recently received renewed … Read more

Implicit Regularization of Sub-Gradient Method in Robust Matrix Recovery: Don’t be Afraid of Outliers

It is well-known that simple short-sighted algorithms, such as gradient descent, generalize well in the over-parameterized learning tasks, due to their implicit regularization. However, it is unknown whether the implicit regularization of these algorithms can be extended to robust learning tasks, where a subset of samples may be grossly corrupted with noise. In this work, … Read more

Scalable Inference of Sparsely-changing Markov Random Fields with Strong Statistical Guarantees

In this paper, we study the problem of inferring time-varying Markov random fields (MRF), where the underlying graphical model is both sparse and changes sparsely over time. Most of the existing methods for the inference of time-varying MRFs rely on the regularized maximum likelihood estimation (MLE), that typically suffer from weak statistical guarantees and high … Read more

Strong Optimal Classification Trees

Decision trees are among the most popular machine learning models and are used routinely in applications ranging from revenue management and medicine to bioinformatics. In this paper, we consider the problem of learning optimal binary classification trees with univariate splits. Literature on the topic has burgeoned in recent years, motivated both by the empirical suboptimality … Read more

Kernel Distributionally Robust Optimization

We propose kernel distributionally robust optimization (Kernel DRO) using insights from the robust optimization theory and functional analysis. Our method uses reproducing kernel Hilbert spaces (RKHS) to construct a wide range of convex ambiguity sets, including sets based on integral probability metrics and finite-order moment bounds. This perspective unifies multiple existing robust and stochastic optimization … Read more

An Alternating Method for Cardinality-Constrained Optimization: A Computational Study for the Best Subset Selection and Sparse Portfolio Problems

Cardinality-constrained optimization problems are notoriously hard to solve both in theory and practice. However, as famous examples such as the sparse portfolio optimization and best subset selection problems show, this class is extremely important in real-world applications. In this paper, we apply a penalty alternating direction method to these problems. The key idea is to … Read more

Sparse Poisson regression via mixed-integer optimization

We present a mixed-integer optimization (MIO) approach to sparse Poisson regression. The MIO approach to sparse linear regression was first proposed in the 1970s, but has recently received renewed attention due to advances in optimization algorithms and computer hardware. In contrast to many sparse estimation algorithms, the MIO approach has the advantage of finding the … Read more