Data-Mining – Page 3 – Optimization Online

Solving a Class of Cut-Generating Linear Programs via Machine Learning

Published: 2021/08/15, Updated: 2023/10/30

Branch and Cut Algorithms, Data-Mining, Linear Programming

cut-generating linear programs, cutting planes, data classification, function approximation, machine learning

Cut-generating linear programs (CGLPs) play a key role as a separation oracle to produce valid inequalities for the feasible region of mixed-integer programs. When incorporated inside branch-and-bound, the cutting planes obtained from CGLPs help to tighten relaxations and improve dual bounds. However, running the CGLPs at the nodes of the branch-and-bound tree is computationally cumbersome …

A stochastic alternating balance k-means algorithm for fair clustering

Published: 2021/06/02

Suyun Liu

Luis Nunes Vicente

Convex and Nonsmooth Optimization, Data-Mining, Nonlinear Optimization

bi-objective optimization, data mining, fairness, k-means clustering, pareto front, unsupervised machine learning

In applications of data clustering to human-centric decision-making systems, such as loan applications and advertisement recommendations, the clustering outcome might discriminate against people across different demographic groups, leading to unfairness. A natural conflict arises between the cost of clustering (in terms of distance to cluster centers) and the balanced representation of all demographic groups …
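The two competing quantities named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' algorithm: it assumes exactly two demographic groups coded 0 and 1, and it assumes the ratio-based balance measure that is common in the fair clustering literature (the minimum over clusters of the smaller group-count ratio). The function names `clustering_cost` and `balance` are hypothetical.

```python
import numpy as np

def clustering_cost(X, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    diffs = X - centers[labels]
    return float(np.sum(diffs ** 2))

def balance(labels, groups, k):
    """Assumed balance measure for two groups coded 0 and 1.

    For each cluster, take min(n0/n1, n1/n0); return the minimum over
    clusters. A cluster that misses one group entirely gives balance 0.
    """
    worst = 1.0
    for c in range(k):
        in_c = labels == c
        n0 = int(np.sum(in_c & (groups == 0)))
        n1 = int(np.sum(in_c & (groups == 1)))
        if n0 == 0 or n1 == 0:
            return 0.0
        worst = min(worst, n0 / n1, n1 / n0)
    return worst

# Toy data: four points, two clusters, centers at the cluster means.
X = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
labels = np.array([0, 0, 1, 1])
centers = np.array([[0.0, 1.0], [4.0, 1.0]])
mixed_groups = np.array([0, 1, 0, 1])      # each cluster holds both groups
separated_groups = np.array([0, 0, 1, 1])  # each cluster holds one group

cost = clustering_cost(X, labels, centers)
bal_mixed = balance(labels, mixed_groups, k=2)
bal_separated = balance(labels, separated_groups, k=2)
```

The same assignment has the same cost in both cases, but its balance depends entirely on how the groups are spread across clusters, which is the conflict a bi-objective formulation has to trade off.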

Beyond Symmetry: Best Submatrix Selection for the Sparse Truncated SVD

Published: 2021/05/07, Updated: 2024/09/30

Yongchun Li

Weijun Xie

Combinatorial Optimization, Data-Mining, Integer Programming

approximation algorithms, misdp, sparse pca, sparse svd, truncated svd

Truncated singular value decomposition (SVD), also known as the best low-rank matrix approximation, has been successfully applied to many domains such as biology, healthcare, and others, where high-dimensional datasets are prevalent. To enhance the interpretability of the truncated SVD, sparse SVD (SSVD) is introduced to select a few rows and columns of the original matrix …
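As a point of reference for the dense (non-sparse) case only, the rank-k truncated SVD can be sketched with NumPy. This does not implement the sparse SVD or the submatrix selection studied in the paper; the helper name `truncated_svd` and the random test matrix are assumptions made for illustration.

```python
import numpy as np

def truncated_svd(A, k):
    """Return the leading k singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
k = 2

U_k, s_k, Vt_k = truncated_svd(A, k)
A_k = U_k @ np.diag(s_k) @ Vt_k  # rank-k approximation of A

# Frobenius error of the approximation, and the discarded singular values.
err = np.linalg.norm(A - A_k, "fro")
s_full = np.linalg.svd(A, compute_uv=False)
tail = np.sqrt(np.sum(s_full[k:] ** 2))
```

By the Eckart-Young theorem, `err` matches `tail` up to rounding: the approximation error is exactly the energy in the singular values that were dropped.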

Unbiased Subdata Selection for Fair Classification: A Unified Framework and Scalable Algorithms

Published: 2020/12/22, Updated: 2020/12/23

Ye Qing

Weijun Xie

(Mixed) Integer Nonlinear Programming, Data-Mining

As an important problem in modern data analytics, classification has seen a wide variety of applications across different domains. Unlike conventional classification approaches, fair classification addresses unintentional bias against sensitive features (e.g., gender, race). Due to the high nonconvexity of fairness measures, existing methods are often unable to model exact fairness, which can …

A dynamic programming approach to segmented isotonic regression

Published: 2020/12/08

Data-Mining, Dynamic Programming, Energy

cardinality-constrained shortest path problem, consumers' price-response, data clustering, inverse problems, isotonic regression, segmented regression

This paper proposes a polynomial-time algorithm to construct the monotone stepwise curve that minimizes the sum of squared errors with respect to a given cloud of data points. The fitted curve is also subject to limits on the maximum number of steps it may contain and on the minimum step length. Our algorithm relies on …
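For orientation, the classical unconstrained version of this problem (a nondecreasing step fit that minimizes squared error, with no limit on the number of steps or their length) can be solved with the pool-adjacent-violators algorithm. The sketch below is that textbook method, not the dynamic programming algorithm proposed in the paper; the function name `pava` is hypothetical, and the inputs are assumed to be the responses already sorted by their x-values.

```python
def pava(y):
    """Nondecreasing least-squares fit via pool-adjacent-violators.

    `y` holds the responses ordered by increasing x. Each block stores
    [sum, count]; adjacent blocks are merged while their means decrease.
    """
    blocks = []
    for v in y:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)  # every point in a block gets the block mean
    return fit

fit_a = pava([1, 3, 2, 4])   # the violating pair (3, 2) is pooled to 2.5
fit_b = pava([3, 2, 1])      # a fully decreasing input collapses to one step
```

Each merged block becomes one step of the fitted curve, so the unconstrained fit can use as many steps as the data call for; bounding the number of steps and their minimum length is what the abstract's constrained problem adds.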

Accuracy and fairness trade-offs in machine learning: A stochastic multi-objective approach

Published: 2020/08/03, Updated: 2020/09/03

Suyun Liu

Luis Nunes Vicente

Data-Mining, Multi-Criteria Optimization, Stochastic Programming

disparate impact, equal opportunity, fairness, machine learning, multi-objective optimization, nonconvex optimization, pareto fronts, sensitive/protected attributes, stochastic approximation, supervised learning

In applications of machine learning to real-life decision-making systems, e.g., credit scoring and criminal justice, the prediction outcomes might discriminate against people with sensitive attributes, leading to unfairness. A commonly used strategy in fair machine learning is to include fairness as a constraint or a penalization term in the minimization of the prediction …

The block mutual coherence property condition for signal recovery

Published: 2020/07/10

Data-Mining, Nonsmooth Optimization, Statistics

Compressed sensing shows that a sparse signal can be stably recovered from incomplete linear measurements. In practical applications, however, some signals have additional structure in which the nonzero elements occur in blocks. We call such signals block-sparse signals. In this paper, the $\ell_2/\ell_1-\alpha\ell_2$ minimization method for the stable recovery of block-sparse signals is investigated. …
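One common way to write an $\ell_2/\ell_1-\alpha\ell_2$ recovery problem is sketched below. The notation is an assumption, since the truncated abstract does not define it: $x$ is split into $m$ blocks $x[1],\dots,x[m]$, $A$ is the measurement matrix, $y$ the measurement vector, $\varepsilon \ge 0$ a noise level, and $\alpha > 0$ a weight.

```latex
\min_{x \in \mathbb{R}^{N}} \; \|x\|_{2,1} - \alpha \|x\|_{2}
\quad \text{subject to} \quad \|Ax - y\|_{2} \le \varepsilon,
\qquad \text{where} \qquad
\|x\|_{2,1} = \sum_{i=1}^{m} \bigl\| x[i] \bigr\|_{2} .
```

The subtracted term is the ordinary Euclidean norm because the $\ell_2$ norm of the vector of block norms equals $\|x\|_2$. With a single entry per block, $\|x\|_{2,1}$ reduces to $\|x\|_1$ and the objective becomes the familiar $\ell_1-\alpha\ell_2$ penalty for ordinary sparsity.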

The block mutual coherence property condition for signal recovery

Published: 2020/06/10

Data-Mining, Nonlinear Optimization

Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks

Published: 2020/05/22, Updated: 2022/05/06

(Mixed) Integer Nonlinear Programming, Data-Mining, Network Optimization

bayesian networks, consistency, directed acyclic graphs, early stopping criterion, mixed-integer conic programming

Bayesian Networks (BNs) represent conditional probability relations among a set of random variables (nodes) in the form of a directed acyclic graph (DAG), and have found diverse applications in knowledge discovery. We study the problem of learning the sparse DAG structure of a BN from continuous observational data. The central problem can be modeled as …

The high-order block RIP for non-convex block-sparse compressed sensing

Published: 2020/03/07

Data-Mining, Statistics

This paper concentrates on the recovery of block-sparse signals from linear measurements. Such signals are not only sparse; their nonzero elements are also arranged in blocks (clusters) rather than being arbitrarily distributed across the vector. We establish high-order sufficient conditions based on the block RIP to ensure the exact recovery of every block $s$-sparse signal …