Optimal distance separating halfspace

One recently proposed criterion for separating two datasets in discriminant analysis is to use a hyperplane that minimises the sum of distances to it from all the misclassified data points. Here all distances are measured by some fixed norm, and a point is misclassified when it lies on the wrong side of the hyperplane, or rather in the wrong halfspace. In this paper we study the problem of determining such an optimal halfspace. In dimension $d$, we prove that there always exists an optimal separating halfspace passing through $d$ affinely independent data points. This directly shows that the problem is polynomially solvable in fixed dimension by an $O(n^{d+1})$ algorithm. If a different norm or gauge is used for each dataset to measure distances to the hyperplane, or if all distances are measured by a fixed (asymmetric) gauge, then one can still show that there always exists an optimal separating halfspace passing through $d-1$ affinely independent data points. The one-dimensional problem is extremely easy to solve: it suffices to find a balancing separating point, i.e. one yielding an equal number (or weight) of misclassified points for each dataset. It also follows that in any dimension any optimal separating halfspace balances the misclassified points, where the balancing criterion now takes the shape of the gauges used into account.
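The one-dimensional case can be illustrated with a minimal sketch. It is not taken from the paper: it assumes unit weights, the absolute value as distance, and a fixed orientation in which the first dataset should lie to the left of the separating point and the second to the right. The function names `misclassification_cost` and `best_separating_point` are hypothetical. Since the cost is convex and piecewise linear in the separating point, the sketch simply evaluates it at every data point and keeps the cheapest one.

```python
def misclassification_cost(t, left, right):
    """Sum of distances to the point t from all misclassified points.

    Assumption: points of `left` should satisfy x <= t and points of
    `right` should satisfy x >= t; every point has unit weight.
    """
    cost_left = sum(x - t for x in left if x > t)     # left points lying right of t
    cost_right = sum(t - x for x in right if x < t)   # right points lying left of t
    return cost_left + cost_right


def best_separating_point(left, right):
    """Return a data point minimising the misclassification cost.

    The cost is convex and piecewise linear with breakpoints at the data
    points, so a minimiser can always be found among them.
    """
    candidates = sorted(set(left) | set(right))
    return min(candidates, key=lambda t: misclassification_cost(t, left, right))


if __name__ == "__main__":
    left = [1, 2, 3, 6]
    right = [4, 5, 7, 8]
    t = best_separating_point(left, right)
    print(t, misclassification_cost(t, left, right))
```

In this example every point between 4 and 5 is optimal: strictly inside that interval exactly one point of each dataset is misclassified, which is the balancing condition described above. Handling weights or the opposite orientation would need a straightforward extension of the same scan.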

Citation

Working paper BEIF/124, September 2002, 9 pp., Vrije Universiteit Brussel.
