The min-Knapsack Problem with Compactness Constraints and Applications in Statistics

In the min-Knapsack problem, one is given a set of items, each with a cost and a weight. The objective is to select a subset of minimum total cost such that the sum of the weights is not smaller than a given constant. In this paper we introduce an extension of the min-Knapsack problem with additional "compactness constraints" (mKPC), stating that selected items cannot lie too far apart from each other. This extension has applications in algorithms for change-point detection in time series and for variable selection in high-dimensional statistics. We propose three solution methods for the mKPC. The first two use the same Mixed-Integer Programming (MIP) formulation but differ in how they handle it: either passing the complete model, which has a quadratic number of constraints, to a black-box MIP solver, or separating the constraints dynamically within a branch-and-cut algorithm. Numerical experiments highlight the advantages of this dynamic separation. The third approach is a dynamic programming labelling algorithm. Finally, we focus on the special case of the unit-cost mKPC (1c-mKPC), which has a specific interpretation in the context of the statistical applications mentioned above. We prove that the 1c-mKPC is solvable in polynomial time with a different ad-hoc dynamic programming algorithm. Experimental results show that this algorithm vastly outperforms both generic approaches for the mKPC and a simple greedy heuristic from the literature.
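
To make the setting concrete, the sketch below builds one plausible pairwise MIP model of this kind in Python with PuLP. The particular compactness rule shown (any two selected items lying more than delta positions apart must have another selected item strictly between them), the parameter names q and delta, and the toy data are illustrative assumptions consistent with the abstract's description of a model with a quadratic number of constraints; the exact formulation is the one given in the paper.

```python
# Illustrative sketch only: a pairwise-constraint covering knapsack with an
# assumed compactness rule, not necessarily the paper's exact model.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, PULP_CBC_CMD


def solve_mkpc(costs, weights, q, delta):
    """Select items of minimum total cost so that total weight >= q, while
    selected items that are more than `delta` apart have another selected
    item between them (assumed compactness rule)."""
    n = len(costs)
    prob = LpProblem("mKPC", LpMinimize)
    x = [LpVariable(f"x_{i}", cat=LpBinary) for i in range(n)]

    # Objective: minimise the total cost of the selected items.
    prob += lpSum(costs[i] * x[i] for i in range(n))

    # Covering (min-knapsack) constraint: total weight must reach q.
    prob += lpSum(weights[i] * x[i] for i in range(n)) >= q

    # Assumed compactness constraints: if items i and j are both selected and
    # lie more than delta positions apart, at least one item strictly between
    # them must also be selected. This gives O(n^2) constraints, matching the
    # "quadratic number of constraints" mentioned in the abstract.
    for i in range(n):
        for j in range(i + delta + 1, n):
            prob += x[i] + x[j] - 1 <= lpSum(x[k] for k in range(i + 1, j))

    prob.solve(PULP_CBC_CMD(msg=False))
    return [i for i in range(n) if x[i].value() is not None and x[i].value() > 0.5]


if __name__ == "__main__":
    # Tiny made-up instance for illustration.
    costs = [3, 1, 4, 1, 5, 2]
    weights = [2, 2, 3, 1, 4, 2]
    print(solve_mkpc(costs, weights, q=6, delta=2))
```

Passing this full model to a solver corresponds to the first, "black-box" approach; the branch-and-cut variant described in the abstract would instead add the pairwise compactness cuts only when they are violated during the search.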
