D-optimal Data Fusion: Exact and Approximation Algorithms

We study the D-optimal Data Fusion (DDF) problem, which aims to select new data points, given an existing Fisher information matrix, so as to maximize the logarithm of the determinant of the overall Fisher information matrix. We show that the DDF problem is NP-hard and has no constant-factor polynomial-time approximation algorithm unless P = NP. … Read more

Beyond Symmetry: Best Submatrix Selection for the Sparse Truncated SVD

Truncated singular value decomposition (SVD), also known as the best low-rank matrix approximation, has been successfully applied to many domains such as biology, healthcare, and others, where high-dimensional datasets are prevalent. To enhance the interpretability of the truncated SVD, sparse SVD (SSVD) is introduced to select a few rows and columns of the original matrix … Read more

Exact and Approximation Algorithms for Sparse PCA

Sparse PCA (SPCA) is a fundamental model in machine learning and data analytics, which has witnessed a variety of application areas such as finance, manufacturing, biology, healthcare. To select a prespecified-size principal submatrix from a covariance matrix to maximize its largest eigenvalue for the better interpretability purpose, SPCA advances the conventional PCA with both feature … Read more

Best Principal Submatrix Selection for the Maximum Entropy Sampling Problem: Scalable Algorithms and Performance Guarantees

This paper studies a classic maximum entropy sampling problem (MESP), which aims to select the most informative principal submatrix with a given size out of a covariance matrix from a system. MESP has been widely applied to many areas, including healthcare, power system, manufacturing, data science, etc. Investigating its Lagrangian dual and primal characterization, we … Read more