Conjecturing-Based Discovery of Patterns in Data

We propose the use of a conjecturing machine that suggests feature relationships in the form of bounds involving nonlinear terms for numerical features and boolean expressions for categorical features. The proposed Conjecturing framework recovers known nonlinear and boolean relationships among features from data. In both settings, true underlying relationships are revealed. We then compare the …

Approximating L1-Norm Best-Fit Lines

Sufficient conditions are provided for a deterministic algorithm for estimating an L1-norm best-fit one-dimensional subspace. To prove the conditions are sufficient, fundamental properties of the L1-norm projection of a point onto a one-dimensional subspace are derived. Also, an equivalence is established between the algorithm, which involves the calculation of several weighted medians, and independently-derived algorithms …
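To illustrate the weighted medians mentioned above, here is a minimal Python sketch. It is not the paper's algorithm; it only shows the standard fact that a weighted median minimizes a weighted sum of absolute deviations, and that a one-variable L1 slope fit through the origin reduces to one such median. The function names `weighted_median` and `l1_slope_through_origin` are hypothetical and chosen for illustration.

```python
def weighted_median(values, weights):
    """Return a minimizer t of sum_i weights[i] * |values[i] - t|.

    This is the lower weighted median: the smallest value whose
    cumulative weight reaches half of the total weight.
    """
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("need at least one value with positive weight")
    half = sum(w for _, w in pairs) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value


def l1_slope_through_origin(x, y):
    """Minimize sum_i |y[i] - b * x[i]| over the scalar b.

    Since |y_i - b*x_i| = |x_i| * |y_i/x_i - b| for x_i != 0, the problem
    is a weighted median of the ratios y_i/x_i with weights |x_i|.
    Points with x_i == 0 do not depend on b and are skipped.
    """
    ratios = [yi / xi for xi, yi in zip(x, y) if xi != 0]
    weights = [abs(xi) for xi in x if xi != 0]
    return weighted_median(ratios, weights)
```

For example, `l1_slope_through_origin([1, 2, 3, 4], [2, 4, 6, 100])` returns the slope 2.0: the outlying last point does not pull the fit away from the other three, which is the robustness that motivates L1-norm fitting.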

Estimating L1-Norm Best-Fit Lines for Data

The general formulation for finding the L1-norm best-fit subspace for a point set in $m$ dimensions is a nonlinear, nonconvex, nonsmooth optimization problem. In this paper we present a procedure to estimate the L1-norm best-fit one-dimensional subspace (a line through the origin) to data in $\Re^m$ based on an optimization criterion involving linear programming but which …

pcaL1: An Implementation in R of Three Methods for L1-Norm Principal Component Analysis

pcaL1 is a package for the R environment for finding principal components using methods based on the L1 norm. The principal components derived using traditional principal component analysis (PCA) can be interpreted as an optimal solution to several optimization problems involving the L2 norm. Using the L1 norm in these problems provides an alternative that …

Coverings and Matchings in r-Partite Hypergraphs

Ryser’s conjecture postulates that, for $r$-partite hypergraphs, $\tau \leq (r-1) \nu$ where $\tau$ is the covering number of the hypergraph and $\nu$ is the matching number. Although this conjecture has been open since the 1960s, researchers have resolved it for special cases such as for intersecting hypergraphs where $r \leq 5$. In this paper, we …
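As a small illustration of the quantities in the inequality above, the following Python sketch computes $\tau$ and $\nu$ by brute force for a tiny hypergraph. It is not taken from the paper. The function names and the example edge set are assumptions made for illustration, and exhaustive search is only practical for very small instances.

```python
from itertools import combinations


def matching_number(edges):
    """nu: the largest number of pairwise disjoint edges (brute force)."""
    edges = [frozenset(e) for e in edges]
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            if all(a.isdisjoint(b) for a, b in combinations(subset, 2)):
                return k
    return 0


def covering_number(edges):
    """tau: the size of a smallest vertex set meeting every edge (brute force)."""
    edges = [frozenset(e) for e in edges]
    vertices = sorted(set().union(*edges))
    for k in range(len(vertices) + 1):
        for cover in combinations(vertices, k):
            chosen = set(cover)
            if all(edge & chosen for edge in edges):
                return k


# Illustrative 3-partite hypergraph with parts {a1, a2}, {b1, b2}, {c1, c2}.
# Every edge takes one vertex from each part, and every two edges intersect.
example_edges = [
    ("a1", "b1", "c1"),
    ("a1", "b2", "c2"),
    ("a2", "b1", "c2"),
    ("a2", "b2", "c1"),
]

r = 3
tau = covering_number(example_edges)
nu = matching_number(example_edges)
print(tau, nu, tau <= (r - 1) * nu)
```

In this example every pair of edges shares a vertex, so $\nu = 1$, while no single vertex lies in all four edges, so $\tau = 2$. The bound $\tau \leq (r-1)\nu$ therefore holds with equality for $r = 3$.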

A Pure L1-norm Principal Component Analysis

The L1 norm has been applied in numerous variations of principal component analysis (PCA). L1-norm PCA is an attractive alternative to traditional L2-based PCA because it can impart robustness in the presence of outliers and is indicated for models where standard Gaussian assumptions about the noise may not apply. Of all the previously-proposed PCA schemes …

The L1-Norm Best-Fit Hyperplane Problem

We formalize an algorithm for solving the L1-norm best-fit hyperplane problem derived using first principles and geometric insights about L1 projection and L1 regression. The procedure follows from a new proof of global optimality and relies on the solution of a small number of linear programs. The procedure is implemented for validation and testing. This …

Support vector machines with the ramp loss and the hard margin loss

In the interest of deriving classifiers that are robust to outlier observations, we present integer programming formulations of Vapnik’s support vector machine (SVM) with the ramp loss and hard margin loss. The ramp loss allows a maximum error of 2 for each training observation, while the hard margin loss calculates error by counting the number …
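The cap of 2 on the ramp loss can be seen in a short Python sketch. It assumes the commonly used form $\min(2, \max(0, 1 - y f(x)))$ with labels $y \in \{-1, +1\}$, which may differ in detail from the paper's formulation. The function name `ramp_loss` is hypothetical, and the sketch evaluates the loss only; it does not reproduce the integer programming formulations.

```python
def ramp_loss(labels, scores):
    """Per-observation ramp loss, assuming min(2, max(0, 1 - y * f(x))).

    labels: iterable of +1 / -1 class labels.
    scores: iterable of real-valued classifier outputs f(x).
    """
    losses = []
    for y, score in zip(labels, scores):
        hinge = max(0.0, 1.0 - y * score)  # ordinary hinge loss
        losses.append(min(2.0, hinge))     # cap the error at 2
    return losses


# The last two observations are badly misclassified; the hinge loss would
# charge them 6 and 4, while the ramp loss caps each at 2.
print(ramp_loss([1, 1, 1, -1], [2.0, 0.5, -5.0, 3.0]))
```

Because the loss is capped, a single extreme outlier can contribute at most 2 to the objective, which is what limits its influence on the fitted classifier.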

Automated Tuning of Optimization Software Parameters

We present a method to tune software parameters using ideas from software testing and machine learning. The method is based on the key observation that for many classes of instances, the software shows improved performance if a few critical parameters have “good” values, although which parameters are critical depends on the class of instances. Our …