Tractable Robust Supervised Learning Models

At the heart of supervised learning is a minimization problem with an objective function that evaluates a set of training data over a loss function that penalizes poor fitting and a regularization function that penalizes over-fitting to the training data. More recently, data-driven robust optimization based learning models provide an intuitive robustness perspective of regularization. … Read more

Ellipsoidal Classification via Semidefinite Programming

Separating two finite sets of points in a Euclidean space is a fundamental problem in classification. Customarily linear separation is used, but nonlinear separators such as spheres have been shown to have better performances in some tasks, such as edge detection in images. We exploit the relationships between the more general version of the spherical … Read more

A Classifier to Decide on the Linearization of Mixed-Integer Quadratic Problems in CPLEX

We translate the algorithmic question of whether to linearize convex Mixed-Integer Quadratic Programming problems (MIQPs) into a classification task, and use machine learning (ML) techniques to tackle it. We represent MIQPs and the linearization decision by careful target and feature engineering. Computational experiments and evaluation metrics are designed to further incorporate the optimization knowledge in … Read more

Convex Variational Formulations for Learning Problems

Abstract—In this article, we introduce new techniques to solve the nonlinear regression problem and the nonlinear classification problem. Our benchmarks suggest that our method for regression is significantly more effective when compared to classical methods and our method for classification is competitive. Our list of classical methods includes least squares, random forests, decision trees, boosted … Read more

A Distributionally-robust Approach for Finding Support Vector Machines

The classical SVM is an optimization problem minimizing the hinge losses of mis-classified samples with the regularization term. When the sample size is small or data has noise, it is possible that the classifier obtained with training data may not generalize well to pop- ulation, since the samples may not accurately represent the true population … Read more

Conic separation of finite sets:The homogeneous case

This work addresses the issue of separating two finite sets in $\mathbb{R}^n $ by means of a suitable revolution cone $$ \Gamma (z,y,s)= \{x \in \mathbb{R}^n : s\,\Vert x-z\Vert – y^T(x-z)=0\}.$$ The specific challenge at hand is to determine the aperture coefficient $s$, the axis $y$, and the apex $z$ of the cone. These parameters … Read more

Conic separation of finite sets: The non-homogeneous case

We address the issue of separating two finite sets in $\mathbb{R}^n $ by means of a suitable revolution cone $$ \Gamma (z,y,s)= \{x \in \mathbb{R}^n :\, s\,\Vert x-z\Vert – y^T(x-z)=0\}.$$ One has to select the aperture coefficient $s$, the axis $y$, and the apex $z$ in such a way as to meet certain optimal separation … Read more

Tightened L0 Relaxation Penalties for Classification

In optimization-based classification model selection, for example when using linear programming formulations, a standard approach is to penalize the L1 norm of some linear functional in order to select sparse models. Instead, we propose a novel integer linear program for sparse classifier selection, generalizing the minimum disagreement hyperplane problem whose complexity has been investigated in … Read more

Support vector machines with the ramp loss and the hard margin loss

In the interest of deriving classifiers that are robust to outlier observations, we present integer programming formulations of Vapnik’s support vector machine (SVM) with the ramp loss and hard margin loss. The ramp loss allows a maximum error of 2 for each training observation, while the hard margin loss calculates error by counting the number … Read more

General algorithmic frameworks for online problems

We study general algorithmic frameworks for online learning tasks. These include binary classification, regression, multiclass problems and cost-sensitive multiclass classification. The theorems that we present give loss bounds on the behavior of our algorithms that depend on general conditions on the iterative step sizes. Citation International Journal of Pure and Applied Mathematics, Vol. 46 (2008), … Read more