Sparse PCA seeks approximate sparse “eigenvectors” whose projections capture the maximal variance of data. As a cardinality-constrained, non-convex optimization problem, it is NP-hard, yet it arises in a wide range of applied fields, from bioinformatics to finance. Recent progress has focused mainly on continuous approximation and convex relaxation of the hard cardinality constraint. In contrast, we consider an alternative discrete spectral formulation based on variational eigenvalue bounds and provide an effective greedy strategy as well as provably optimal solutions using branch-and-bound search. Moreover, the exact methodology reveals a simple renormalization step that improves approximate solutions obtained by any continuous method. The resulting performance gain of discrete algorithms is demonstrated on real-world benchmark data and in extensive Monte Carlo evaluation trials.
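The two discrete ideas named in the abstract, greedy selection and renormalization, can be illustrated with a minimal Python sketch. It assumes NumPy and a symmetric covariance matrix `A`. The function names `top_eig`, `greedy_forward`, and `renormalize` are hypothetical and are not taken from the paper, whose actual algorithms may differ in detail. The sketch treats sparse PCA as choosing `k` variables whose principal submatrix of `A` has the largest top eigenvalue. The renormalization helper keeps the sparsity pattern of an approximate loading vector and replaces its nonzero entries with the principal eigenvector of the corresponding submatrix.

```python
import numpy as np


def top_eig(A, idx):
    """Largest eigenvalue and unit eigenvector of the principal submatrix A[idx, idx]."""
    sub = A[np.ix_(idx, idx)]
    vals, vecs = np.linalg.eigh(sub)  # eigh returns eigenvalues in ascending order
    return vals[-1], vecs[:, -1]


def greedy_forward(A, k):
    """Assumed forward greedy pass: add one variable at a time, each time
    picking the variable that maximizes the top eigenvalue of the submatrix."""
    n = A.shape[0]
    chosen = []
    for _ in range(k):
        best_val, best_j = -np.inf, None
        for j in range(n):
            if j in chosen:
                continue
            val, _ = top_eig(A, chosen + [j])
            if val > best_val:
                best_val, best_j = val, j
        chosen.append(best_j)
    return sorted(chosen)


def renormalize(A, x_approx, k):
    """Keep the k largest-magnitude loadings of x_approx as the sparsity
    pattern, then recompute the loadings as the principal eigenvector of
    the matching submatrix of A."""
    idx = sorted(np.argsort(-np.abs(x_approx))[:k].tolist())
    val, vec = top_eig(A, idx)
    x = np.zeros(A.shape[0])
    x[idx] = vec
    return x, val
```

For a fixed sparsity pattern, the principal eigenvector of the submatrix maximizes the Rayleigh quotient, so the variance captured by the renormalized vector is never lower than that of the truncated and rescaled input. The exact branch-and-bound search mentioned in the abstract is not sketched here.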
Citation
Advances in Neural Information Processing Systems (NIPS), Vol. 18, pp. 915-922, MIT Press, Cambridge, MA, 2006