Optimal Cross-Validation for Sparse Linear Regression

Given a high-dimensional covariate matrix and a response vector, ridge-regularized sparse linear regression selects a subset of features that explains the relationship between covariates and the response in an interpretable manner. To tune the sparsity and robustness of such regressors, techniques like k-fold cross-validation are commonly used for hyperparameter selection. However, cross-validation substantially increases the computational cost of sparse regression, as it requires solving many mixed-integer optimization problems (MIOs) for each hyperparameter combination. To improve upon this state of affairs, we obtain computationally tractable relaxations of k-fold cross-validation metrics, facilitating hyperparameter selection after solving 50–80% fewer MIOs in practice. These relaxations also yield an efficient cyclic coordinate descent scheme, which achieves 10–30% lower validation errors than traditional methods such as grid search with MCP or GLMNet across a suite of 13 real-world datasets.
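To make the baseline concrete, the minimal sketch below runs grid search with k-fold cross-validation over the sparsity level k and ridge penalty gamma, the workflow whose MIO cost the abstract describes. A brute-force best-subset solver (sparse_ridge) stands in for a true MIO solver, and the synthetic data, function names, and grid are illustrative assumptions, not the paper's implementation.

# A minimal sketch of the cross-validated baseline: grid search with
# k-fold cross-validation over the sparsity level k and ridge penalty
# gamma. A brute-force best-subset solver stands in for the paper's MIO
# solver; the data, names, and grid below are illustrative assumptions.
from itertools import combinations

import numpy as np
from sklearn.model_selection import KFold


def sparse_ridge(X, y, k, gamma):
    """Best k-feature ridge regression by exhaustive search (MIO stand-in)."""
    best_obj, best = np.inf, None
    for subset in combinations(range(X.shape[1]), k):
        cols = list(subset)
        Xs = X[:, cols]
        # Ridge-regularized least squares restricted to the chosen subset.
        beta = np.linalg.solve(Xs.T @ Xs + gamma * np.eye(k), Xs.T @ y)
        obj = np.sum((y - Xs @ beta) ** 2) + gamma * beta @ beta
        if obj < best_obj:
            best_obj, best = obj, (cols, beta)
    return best


def cv_error(X, y, k, gamma, n_folds=5, seed=0):
    """Mean validation MSE over n_folds folds for one (k, gamma) pair."""
    errs = []
    for train, val in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        cols, beta = sparse_ridge(X[train], y[train], k, gamma)
        errs.append(np.mean((y[val] - X[val][:, cols] @ beta) ** 2))
    return float(np.mean(errs))


rng = np.random.default_rng(0)
X = rng.standard_normal((60, 8))
y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.standard_normal(60)

# Grid search solves one subset problem per (fold, k, gamma) triple -- the
# per-combination MIO workload that the paper's relaxations let one mostly skip.
grid = [(k, g) for k in (1, 2, 3, 4) for g in (0.01, 0.1, 1.0)]
best_k, best_gamma = min(grid, key=lambda kg: cv_error(X, y, *kg))
print(f"selected sparsity k={best_k}, ridge penalty gamma={best_gamma}")

Note how the fold loop sits inside the hyperparameter loop: every grid point pays for n_folds subset problems, which is exactly why relaxing the cross-validation metric, so that most grid points never reach the exact solver, cuts the number of MIOs solved.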
