Yongchun Li – Optimization Online

On Sparse Canonical Correlation Analysis

Published: 2023/12/31

Classical Canonical Correlation Analysis (CCA) identifies the correlations between two sets of multivariate variables based on their covariance and has been widely applied in diverse fields such as computer vision, natural language processing, and speech analysis. Despite its popularity, CCA can struggle to explain correlations between two variable sets in high-dimensional data. …

On the Partial Convexification of the Low-Rank Spectral Optimization: Rank Bounds and Algorithms

Published: 2023/05/12, Updated: 2023/06/21

Yongchun Li

Weijun Xie

Integer Programming

column generation, Low-Rank Spectral Optimization, Partial Relaxation, Rank Bounds, rank reduction

A Low-rank Spectral Optimization Problem (LSOP) minimizes a linear objective subject to multiple two-sided linear matrix inequalities intersected with a low-rank and spectral constrained domain set. Although solving LSOP is, in general, NP-hard, its partial convexification (i.e., replacing the domain set by its convex hull), termed “LSOP-R”, is often tractable and yields a high-quality solution. …

On the Exactness of Dantzig-Wolfe Relaxation for Rank Constrained Optimization Problems

Published: 2022/10/28, Updated: 2023/06/14

Yongchun Li

Weijun Xie

Integer Programming

Convex Hull Exactness, Dantzig-Wolfe Relaxation, Extreme point Exactness, Fair Unsupervised Learning, Objective Exactness, qcqp, Rank Constraint

The rank-constrained optimization problem (RCOP) minimizes a linear objective function over a prespecified closed rank-constrained domain set and $m$ generic two-sided linear matrix inequalities. Motivated by the Dantzig-Wolfe (DW) decomposition, a popular approach to solving many nonconvex optimization problems, we investigate the strength of the DW relaxation (DWR) of the RCOP, which admits the …

D-optimal Data Fusion: Exact and Approximation Algorithms

Published: 2022/08/07, Updated: 2023/08/23

Combinatorial Optimization, Integer Programming

approximation algorithm, d-optimality, Data fusion, exact algorithm, Fisher information matrix, maximum-entropy sampling, optimality cut, submodular inequality

We study the D-optimal Data Fusion (DDF) problem, which aims to select new data points, given an existing Fisher information matrix, so as to maximize the logarithm of the determinant of the overall Fisher information matrix. We show that the DDF problem is NP-hard and has no constant-factor polynomial-time approximation algorithm unless P = NP. …
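To make the DDF objective concrete, here is a minimal Python sketch. It assumes that each candidate data point contributes a rank-one term `a a^T` to the Fisher information matrix and that exactly `s` points are selected; the abstract states neither detail, so both are illustrative assumptions. The function names are hypothetical, and the greedy rule is only a simple heuristic, not the exact or approximation algorithm from the paper.

```python
import math

def log_det_spd(M):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    # det(M) = prod(L[i][i])^2, so log det(M) = 2 * sum(log L[i][i])
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))

def add_rank_one(M, a):
    """Return M + a a^T without modifying M."""
    n = len(a)
    return [[M[i][j] + a[i] * a[j] for j in range(n)] for i in range(n)]

def greedy_ddf(C, candidates, s):
    """Greedy heuristic (assumption, not the paper's method): pick s candidate
    vectors one at a time, each time maximizing log det(C + sum of a a^T)."""
    if s > len(candidates):
        raise ValueError("cannot select more points than candidates")
    chosen = []
    current = [row[:] for row in C]  # existing Fisher information matrix
    for _ in range(s):
        best = max(
            (i for i in range(len(candidates)) if i not in chosen),
            key=lambda i: log_det_spd(add_rank_one(current, candidates[i])),
        )
        chosen.append(best)
        current = add_rank_one(current, candidates[best])
    return chosen, log_det_spd(current)
```

For example, `greedy_ddf([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 2.0], [0.1, 0.1]], 2)` starts from an identity information matrix and selects two of the three candidate vectors.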

Beyond Symmetry: Best Submatrix Selection for the Sparse Truncated SVD

Published: 2021/05/07, Updated: 2022/06/01

Yongchun Li

Weijun Xie

Combinatorial Optimization, Data-Mining, Integer Programming

approximation algorithms, misdp, sparse pca, sparse svd, truncated svd

Truncated singular value decomposition (SVD), also known as the best low-rank matrix approximation, has been successfully applied to many domains such as biology, healthcare, and others, where high-dimensional datasets are prevalent. To enhance the interpretability of the truncated SVD, sparse SVD (SSVD) is introduced to select a few rows and columns of the original matrix …

Exact and Approximation Algorithms for Sparse PCA

Published: 2020/05/18, Updated: 2024/01/04

Yongchun Li

Weijun Xie

Integer Programming

approximation algorithms, fairness, mixed-integer programming, semi-definite program, sparse pca, svd

Sparse Principal Component Analysis (SPCA) is designed to enhance the interpretability of traditional Principal Component Analysis (PCA) by optimally selecting a subset of features that comprise the first principal component. Given the NP-hard nature of SPCA, most current approaches resort to approximate solutions, typically achieved through tractable semidefinite programs (SDPs) or heuristic methods. To solve SPCA to …
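The feature-selection view of SPCA can be illustrated on a tiny instance. The sketch below assumes the standard formulation in which one picks `k` features whose principal submatrix of the covariance matrix has the largest leading eigenvalue. It enumerates every support, so it is only usable for very small problems and is not one of the paper's algorithms; the function names are hypothetical.

```python
from itertools import combinations
import math

def largest_eigenvalue(M, iters=200):
    """Largest eigenvalue of a symmetric PSD matrix by power iteration.

    Restarting from every standard basis vector ensures that at least one
    start has a nonzero component along a leading eigenvector.
    """
    n = len(M)
    best = 0.0
    for start in range(n):
        v = [1.0 if i == start else 0.0 for i in range(n)]
        for _ in range(iters):
            w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # v lies in the null space; Rayleigh quotient is 0
                break
            v = [x / norm for x in w]
        Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        best = max(best, sum(v[i] * Mv[i] for i in range(n)))
    return best

def brute_force_spca(A, k):
    """Enumerate all size-k supports; return the best support and its value."""
    best_support, best_value = None, -math.inf
    for S in combinations(range(len(A)), k):
        sub = [[A[i][j] for j in S] for i in S]  # principal submatrix A[S, S]
        value = largest_eigenvalue(sub)
        if value > best_value:
            best_support, best_value = S, value
    return best_support, best_value
```

Calling `brute_force_spca(A, k)` on a small covariance matrix `A` (a list of lists) returns the selected feature indices as a tuple together with the corresponding leading eigenvalue. The number of supports grows combinatorially in `k`, which is why exact and approximation algorithms are needed at scale.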

Best Principal Submatrix Selection for the Maximum Entropy Sampling Problem: Scalable Algorithms and Performance Guarantees

Published: 2020/01/23, Updated: 2023/05/01

Yongchun Li

Weijun Xie

Data-Mining

This paper studies a classic maximum entropy sampling problem (MESP), which aims to select the most informative principal submatrix with a given size out of a covariance matrix from a system. MESP has been widely applied to many areas, including healthcare, power systems, manufacturing, and data science. Investigating its Lagrangian dual and primal characterization, we …