Sparse PCA With Multiple Components

Sparse Principal Component Analysis (sPCA) is a cardinal technique for obtaining combinations of features, or principal components (PCs), that explain the variance of high-dimensional datasets in an interpretable manner. This involves solving a sparsity- and orthogonality-constrained convex maximization problem, which is extremely computationally challenging. Most existing works address sparse PCA via methods such as iteratively computing …
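
The truncated abstract does not show the authors' method, but a minimal sketch of a common sparse-PCA baseline helps fix ideas: the truncated power method computes one sparse component of a covariance matrix, and Hotelling deflation is repeated to obtain multiple components. All function names and parameters below are illustrative, and deflation only approximates the orthogonality constraint the abstract mentions.

```python
import numpy as np

def truncated_power_iteration(A, k, iters=200, seed=0):
    """One sparse PC of a PSD matrix A with at most k nonzeros
    (truncated power method, a common sparse-PCA baseline)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = A @ x
        # Keep the k largest-magnitude coordinates, zero out the rest.
        y[np.argsort(np.abs(y))[:-k]] = 0.0
        x = y / np.linalg.norm(y)
    return x

def sparse_pca(A, k, r):
    """r sparse components via greedy Hotelling deflation (illustrative;
    deflation does not enforce exact orthogonality under sparsity)."""
    A = A.copy()
    comps = []
    for _ in range(r):
        x = truncated_power_iteration(A, k)
        comps.append(x)
        A -= (x @ A @ x) * np.outer(x, x)   # deflate the captured variance
    return np.column_stack(comps)
```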

Fixed-Point Automatic Differentiation of Forward–Backward Splitting Algorithms for Partly Smooth Functions

A large class of practical non-smooth optimization problems can be written as the minimization of a sum of smooth and partly smooth functions. We consider such structured problems that also depend on a parameter vector and study the problem of differentiating their solution mapping with respect to the parameter, which has far-reaching applications in sensitivity …
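
For concreteness, here is a minimal sketch of the forward-backward splitting iteration on a representative smooth plus partly smooth problem, the lasso (the $\ell_1$ norm is partly smooth relative to the manifolds of fixed support). The fixed-point automatic differentiation the paper studies would differentiate the resulting solution mapping $\lambda \mapsto x^\star(\lambda)$; that machinery is not shown here, and all names below are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (the backward step for the l1 term)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward_lasso(A, b, lam, iters=500):
    """Forward-backward splitting on min_x 0.5*||Ax - b||^2 + lam*||x||_1.
    The map lam -> x*(lam) is the parametric solution mapping whose
    derivative fixed-point autodiff would compute."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                          # forward (gradient) step
        x = soft_threshold(x - step * grad, step * lam)   # backward (proximal) step
    return x
```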

On polynomial time solvability of combinatorial Markov random fields

The problem of inferring Markov random fields (MRFs) with a sparsity or robustness prior can be naturally modeled as a mixed-integer program. This motivates us to study a general class of convex submodular optimization problems with indicator variables, which we show in this paper to be polynomially solvable. The key insight is that, possibly after …
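
The mixed-integer view can be made concrete with a generic template (an assumption on my part; the paper's exact problem class is not visible in the excerpt). Indicator variables $z_i \in \{0,1\}$ switch the continuous variables $x_i$ on or off:

\[
\min_{x \in \mathbb{R}^n,\; z \in \{0,1\}^n} \; f(x) + \sum_{i=1}^{n} c_i z_i
\quad \text{s.t.} \quad x_i (1 - z_i) = 0, \qquad i = 1, \dots, n,
\]

where $f$ is convex, the penalty $\sum_i c_i z_i$ encodes a sparsity prior, and the submodularity referenced in the abstract restricts how $f$ couples the active variables.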

Optimized convergence of stochastic gradient descent by weighted averaging

Under mild assumptions, stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as the approximate optimal solution. However, in the absence of stochastic noise, the arithmetic mean of all iterates converges considerably more slowly to the optimal solution than the iterates themselves. Likewise, in the …
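
The excerpt does not specify the paper's weights, but one well-known weighted-averaging scheme that places more mass on later iterates is the polynomial-decay average of Shamir and Zhang; the sketch below pairs it with an assumed $1/\sqrt{k}$ stepsize, and all names are illustrative.

```python
import numpy as np

def sgd_weighted_average(grad, x0, steps, lr=0.1, eta=3.0, seed=0):
    """SGD returning a polynomial-decay weighted average of the iterates
    (Shamir-Zhang style); later iterates receive larger weights than
    under the uniform arithmetic mean."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    avg = x0.copy()
    for k in range(1, steps + 1):
        g = grad(x, rng)                    # stochastic gradient oracle
        x = x - lr / np.sqrt(k) * g         # SGD step with decaying stepsize
        w = (eta + 1.0) / (k + eta)         # weight on the newest iterate
        avg = (1.0 - w) * avg + w * x       # running weighted average
    return avg

# Usage sketch: noisy quadratic f(x) = 0.5 * ||x||^2.
grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
x_hat = sgd_weighted_average(grad, np.ones(5), steps=10_000)
```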

Submodularity, pairwise independence and correlation gap

In this paper, we provide a characterization of the expected value of monotone submodular set functions with $n$ pairwise independent random inputs. Inspired by the notion of “correlation gap”, we study the ratio of the maximum expected value of such a function under arbitrary dependence among the random inputs with given marginal probabilities to the maximum …
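
For reference, the standard correlation gap of Agrawal et al. compares the worst-case dependence structure against mutual independence: for a set function $f$ and marginals $p$,

\[
\kappa(f, p) \;=\; \frac{\sup_{\theta \in \Theta(p)} \mathbb{E}_{\theta}\bigl[f(S)\bigr]}{\mathbb{E}_{p^{\times}}\bigl[f(S)\bigr]},
\]

where $\Theta(p)$ is the set of joint distributions of the random inputs with marginals $p$, $p^{\times}$ is the independent product distribution, and $S$ is the random set of active inputs; for monotone submodular $f$ this ratio is at most $e/(e-1)$. The paper studies the analogous ratio with pairwise independence in place of mutual independence, which the truncated sentence begins to define.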

A Simplified Convergence Theory for Byzantine Resilient Stochastic Gradient Descent

In distributed learning, a central server trains a model according to updates provided by nodes holding local data samples. In the presence of one or more malicious nodes sending incorrect information (a Byzantine adversary), standard algorithms for model training such as stochastic gradient descent (SGD) fail to converge. In this paper, we present a simplified …
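
The excerpt does not name the aggregation rule the paper analyzes; as a point of reference, one standard Byzantine-resilient variant replaces the mean of worker gradients with a coordinate-wise median, so that a minority of adversarial updates cannot move the aggregate arbitrarily far. The sketch below is illustrative only.

```python
import numpy as np

def coordinate_median_sgd_step(x, worker_grads, lr):
    """One robust SGD step: aggregate worker gradients with the
    coordinate-wise median instead of the mean, limiting the influence
    of a minority of Byzantine workers."""
    G = np.stack(worker_grads)          # shape: (num_workers, dim)
    robust_grad = np.median(G, axis=0)  # per-coordinate median
    return x - lr * robust_grad
```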

Stochastic nested primal-dual method for nonconvex constrained composition optimization

In this paper we study nonconvex constrained composition optimization, in which the objective contains a composition of two expected-value functions whose exact values are typically expensive to compute. We propose a STochastic nEsted Primal-dual (STEP) method for such problems. In each iteration, with an auxiliary variable introduced to track the inner-layer function values …
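
Here is a minimal sketch of the tracking idea, in the style of stochastic compositional gradient methods (an assumption on my part; STEP's primal-dual treatment of the constraints is not shown in the excerpt): an auxiliary variable $u$ follows the inner expectation $g(x)$ via a moving average, and the chain rule is applied through the tracker. All oracle names below are illustrative.

```python
import numpy as np

def nested_sgd(g_oracle, Jg_oracle, grad_f, x0, steps, lr=0.01, beta=0.1, seed=0):
    """Sketch for min_x f(g(x)) with stochastic oracles: the auxiliary
    variable u tracks the inner expectation g(x) with a moving average,
    so grad_f is evaluated at a low-variance estimate of g(x)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    u = g_oracle(x, rng)                                 # initialize the tracker
    for _ in range(steps):
        u = (1 - beta) * u + beta * g_oracle(x, rng)     # track inner-layer values
        J = Jg_oracle(x, rng)                            # stochastic Jacobian of g
        x = x - lr * (J.T @ grad_f(u))                   # chain rule through the tracker
    return x
```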

Computing Tchebychev weight space decomposition for multiobjective discrete optimization problems

Multiobjective discrete optimization (MODO) techniques, including weight space decomposition, have received increasing attention in the last decade. The primary weight space decomposition technique in the literature is defined for the weighted sum utility function, through which sets of weights are assigned to a subset of the nondominated set. Recent work has begun to study the …
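
For context, the two scalarizations differ as follows (standard definitions, not specific to this paper): the weighted sum utility aggregates the objectives linearly, while the weighted Tchebychev utility measures the worst weighted deviation from a reference point $z^{*}$ (e.g., the ideal point),

\[
\min_{x \in X} \; \sum_{i=1}^{p} \lambda_i f_i(x)
\qquad \text{versus} \qquad
\min_{x \in X} \; \max_{1 \le i \le p} \lambda_i \bigl( f_i(x) - z_i^{*} \bigr).
\]

Weighted sums can only generate supported nondominated points, whereas the Tchebychev scalarization can also reach unsupported ones, which is what motivates a Tchebychev weight space decomposition.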