Mixed Integer Second-Order Cone Programming Formulations for Variable Selection

This paper addresses the problem of selecting the best subset of explanatory variables in a multiple linear regression model. A subset regression model is generally evaluated with a goodness-of-fit measure such as adjusted R^2, AIC, or BIC. Although variable selection is usually handled via stepwise regression, that method does not always find the subset that is best according to these measures. In this paper, we propose mixed integer second-order cone programming formulations for selecting the best subset of variables. Computational experiments show that, in terms of the goodness-of-fit measures, the proposed formulations yield solutions with a clear advantage over those of common stepwise regression methods.
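
To make "best subset according to adjusted R^2" concrete, the following minimal sketch enumerates every subset of a tiny, made-up data set and keeps the one with the highest adjusted R^2. It is not the mixed integer second-order cone programming formulation proposed in the paper. It is a brute-force baseline that is only feasible for a handful of variables. The data, the function names (`solve`, `rss`, `adjusted_r2`, `best_subset`), and the textbook formula 1 - (RSS/(n-k-1))/(TSS/(n-1)) with an intercept are all assumptions made for illustration.

```python
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def rss(X, y, subset):
    """Residual sum of squares of OLS on the chosen columns plus an intercept."""
    rows = [[1.0] + [row[j] for j in subset] for row in X]
    m = len(rows[0])
    # Normal equations (X^T X) beta = X^T y; adequate for tiny, well-conditioned data.
    XtX = [[sum(r[a] * r[c] for r in rows) for c in range(m)] for a in range(m)]
    Xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(m)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bk * rk for bk, rk in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def adjusted_r2(X, y, subset):
    """Adjusted R^2 = 1 - (RSS / (n - k - 1)) / (TSS / (n - 1))."""
    n, k = len(y), len(subset)
    mean = sum(y) / n
    tss = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - (rss(X, y, subset) / (n - k - 1)) / (tss / (n - 1))

def best_subset(X, y):
    """Exhaustively search all 2^p subsets for the highest adjusted R^2."""
    p = len(X[0])
    candidates = [s for k in range(p + 1) for s in combinations(range(p), k)]
    return max(candidates, key=lambda s: adjusted_r2(X, y, s))

# Hypothetical data: y is roughly 3 + 2*x0 - x1 plus small noise; x2 is unrelated.
X = [[1, 2, 0], [2, 1, 1], [3, 4, 0], [4, 3, 1],
     [5, 6, 1], [6, 5, 0], [7, 8, 1], [8, 7, 0]]
noise = [0.1, -0.1, 0.05, -0.05, -0.1, 0.1, 0.0, 0.0]
y = [3 + 2 * r[0] - r[1] + e for r, e in zip(X, noise)]

best = best_subset(X, y)
print("best subset by adjusted R^2:", best)
print("adjusted R^2:", adjusted_r2(X, y, best))
```

Because the number of subsets doubles with every added variable, this enumeration stops being practical very quickly, which is the gap that an exact optimization formulation or a stepwise heuristic is meant to fill.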

Citation

Published as: R. Miyashiro and Y. Takano, Mixed integer second-order cone programming formulations for variable selection in linear regression. European Journal of Operational Research, 247(3), pp. 721-731, 2015.