Global Convergence in Deep Learning with Variable Splitting via the Kurdyka-{\L}ojasiewicz Property

Deep learning has recently attracted a significant amount of attention due to its great empirical success. However, the effectiveness of training deep neural networks (DNNs) remains a mystery because of the associated nonconvex optimization problems. In this paper, we aim to provide some theoretical understanding of such optimization problems. In particular, the Kurdyka-{\L}ojasiewicz (KL) property is established … Read more
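
For reference, a common textbook statement of the KL property is the following (included only as background; the notation is generic and not taken from this paper's text): a proper lower semicontinuous function f satisfies the KL property at a point x̄ if there exist η > 0, a neighborhood U of x̄, and a concave desingularizing function φ with φ(0) = 0, φ continuously differentiable and φ' > 0 on (0, η), such that

```latex
% Standard Kurdyka-Lojasiewicz inequality (background definition, generic notation).
\varphi'\bigl(f(x) - f(\bar{x})\bigr)\,\operatorname{dist}\bigl(0, \partial f(x)\bigr) \;\ge\; 1
\quad \text{for all } x \in U \text{ with } f(\bar{x}) < f(x) < f(\bar{x}) + \eta .
```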

Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods

Logistic regression is one of the most popular methods in binary classification, wherein estimation of model parameters is carried out by solving the maximum likelihood (ML) optimization problem, and the ML estimator is defined to be the optimal solution of this problem. It is well known that the ML estimator exists when the data is … Read more
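
As a minimal illustration of the ML problem and of a standard first-order method on it, here is a plain gradient-descent sketch on the logistic loss (a generic example, not the analysis or solution methods studied in the paper; all names and data below are made up for illustration):

```python
import numpy as np

def logistic_loss_grad(theta, X, y):
    """Average negative log-likelihood of logistic regression and its gradient.
    X: (n, d) features, y: (n,) labels in {-1, +1}, theta: (d,) parameters.
    Not numerically hardened; for illustration only."""
    margins = y * (X @ theta)                      # y_i * x_i^T theta
    loss = np.mean(np.log1p(np.exp(-margins)))     # (1/n) sum log(1 + exp(-y_i x_i^T theta))
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
    return loss, grad

def gradient_descent(X, y, step=0.1, iters=500):
    """Fixed-step gradient descent on the logistic loss."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        _, grad = logistic_loss_grad(theta, X, y)
        theta -= step * grad
    return theta

# Tiny synthetic example.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200))
theta_hat = gradient_descent(X, y)
print(logistic_loss_grad(theta_hat, X, y)[0])
```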

Predicting the vibroacoustic quality of steering gears

In the daily operations of ThyssenKrupp Presta AG, ball nut assemblies (BNA) undergo a vibroacoustic quality test and are given a binary classification based on their order spectra. In this work, we formulate a multiple change point problem and derive optimal quality intervals and thresholds for the order spectra that minimize the number of incorrectly classified BNA. … Read more
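
As general background on multiple change point formulations, here is a generic optimal-partitioning dynamic program with a squared-error segment cost (the paper's objective, which minimizes the number of misclassified BNA over quality intervals and thresholds, is different and not reproduced here):

```python
import numpy as np

def optimal_partitioning(y, penalty):
    """Generic multiple change point detection: split y into segments that
    minimize within-segment squared error plus a per-segment penalty.
    Returns the sorted list of change point indices."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    csum = np.concatenate(([0.0], np.cumsum(y)))        # prefix sums
    csum2 = np.concatenate(([0.0], np.cumsum(y ** 2)))  # prefix sums of squares

    def seg_cost(s, t):
        total = csum[t] - csum[s]
        return (csum2[t] - csum2[s]) - total ** 2 / (t - s)

    F = np.full(n + 1, np.inf)           # F[t]: best cost for y[:t]
    F[0] = -penalty
    prev = np.zeros(n + 1, dtype=int)    # backpointers to segment starts
    for t in range(1, n + 1):
        for s in range(t):
            c = F[s] + seg_cost(s, t) + penalty
            if c < F[t]:
                F[t], prev[t] = c, s
    cps, t = [], n
    while t > 0:                         # recover change points
        t = prev[t]
        if t > 0:
            cps.append(t)
    return sorted(cps)

# Example: a piecewise-constant signal with two change points.
signal = np.concatenate([np.zeros(50), 3 * np.ones(50), np.ones(50)]) + 0.1 * np.random.randn(150)
print(optimal_partitioning(signal, penalty=5.0))
```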

Correlation analysis between the vibroacoustic behavior of steering gear and ball nut assemblies in the automotive industry

The increase in quality standards in the automotive industry requires specifications to be propagated across the supply chain, a challenge exacerbated in domains where quality is subjective. In the daily operations of ThyssenKrupp Presta AG, requirements imposed on the vibroacoustic quality of steering gears need to be passed down to their subcomponents. We quantify … Read more

Scalable Algorithms for the Sparse Ridge Regression

Sparse regression and variable selection for large-scale data have developed rapidly over the past decades. This work focuses on sparse ridge regression, which enforces sparsity via the L0 norm. We first prove that the continuous relaxation of the mixed integer second order conic (MISOC) reformulation using the perspective formulation is equivalent to … Read more
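
For concreteness, sparse ridge regression is usually written as a cardinality-constrained problem, and a common indicator-variable reformulation (shown below as standard background, not as the paper's exact MISOC model) is what perspective relaxations then strengthen:

```latex
% Sparse ridge regression with an explicit L0 (cardinality) constraint.
\min_{\beta \in \mathbb{R}^p} \ \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2
\quad \text{s.t.} \quad \|\beta\|_0 \le k .

% Standard perspective-style reformulation with indicators z_i \in \{0,1\}
% (z_i = 0 forces \beta_i = 0), with the convention \beta_i^2 / z_i := 0 when \beta_i = z_i = 0:
\min_{\beta,\, z} \ \|y - X\beta\|_2^2 + \lambda \sum_{i=1}^{p} \frac{\beta_i^2}{z_i}
\quad \text{s.t.} \quad \sum_{i=1}^{p} z_i \le k, \quad z \in \{0,1\}^p .
```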

Robust Principal Component Analysis using Facial Reduction

We study algorithms for robust principal component analysis (RPCA) for a partially observed data matrix. The aim is to recover the data matrix as a sum of a low-rank matrix and a sparse matrix so as to eliminate erratic noise (outliers). This problem is known to be NP-hard in general. A classical way to solve … Read more
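
The classical convex approach alluded to above decomposes the observed entries into a low-rank plus a sparse part; in standard notation (given here as background, not as the facial-reduction method developed in the paper):

```latex
% Classical convex relaxation of RPCA with partial observations:
% the nuclear norm promotes low rank of L, the l1 norm promotes sparsity of S,
% and P_\Omega keeps only the observed entries of the data matrix M.
\min_{L,\, S} \ \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad \mathcal{P}_{\Omega}(L + S) = \mathcal{P}_{\Omega}(M) .
```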

A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization

We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving empirical risk minimization (ERM) problems with a nonsmooth regularization term. Current second-order and quasi-Newton methods for this problem either do not work well in the distributed setting or work only for specific regularizers. Our algorithm uses successive quadratic approximations, and we describe how to … Read more
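
A generic form of the successive quadratic approximation step for a smooth loss f plus a nonsmooth regularizer g is the proximal quasi-Newton template below (standard background notation; the paper's distributed variant adds further structure on top of this):

```latex
% One successive-quadratic-approximation (proximal quasi-Newton) step:
% H_k is a quasi-Newton approximation of \nabla^2 f(w_k), g is the nonsmooth
% regularizer, and the next iterate is w_{k+1} = w_k + \alpha_k d_k.
d_k \;=\; \arg\min_{d} \ \nabla f(w_k)^{\top} d \;+\; \tfrac{1}{2}\, d^{\top} H_k\, d \;+\; g(w_k + d) .
```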

Cut-Pursuit Algorithm for Regularizing Nonsmooth Functionals with Graph Total Variation

We present an extension of the cut-pursuit algorithm, introduced by Landrieu and Obozinski (2017), to the graph total-variation regularization of functions with a separable nondifferentiable part. We propose a modified algorithmic scheme as well as adapted proofs of convergence. We also present a heuristic approach for handling the cases in which the values associated with … Read more
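
In generic notation, the problems targeted by cut-pursuit-style methods combine a separable (possibly nondifferentiable) data term with a weighted graph total variation (a background statement, not a verbatim reproduction of the paper's setting):

```latex
% Graph total-variation regularization on a weighted graph G = (V, E, w):
% f_v are the separable, possibly nondifferentiable data-fitting terms and the
% second sum is the graph TV penalty on the variable x \in \mathbb{R}^{V}.
\min_{x \in \mathbb{R}^{V}} \ \sum_{v \in V} f_v(x_v) \;+\; \sum_{(u,v) \in E} w_{uv}\, |x_u - x_v| .
```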

Smart “Predict, then Optimize”

Many real-world analytics problems involve two significant challenges: prediction and optimization. Due to the typically complex nature of each challenge, the standard paradigm is to predict, then optimize. By and large, machine learning tools are intended to minimize prediction error and do not account for how the predictions will be used in a downstream optimization … Read more
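
One natural way to formalize the decision error incurred by a prediction in this setting is the following (notation is generic; see the paper for the precise SPO loss and its tractable surrogate):

```latex
% Decision error of predicting the cost vector \hat{c} when the true costs are c:
% w^*(\hat{c}) is an optimal decision under the predicted costs and z^*(c) is the
% true optimal value over the feasible region S of the downstream problem.
\ell_{\mathrm{SPO}}(\hat{c}, c) \;=\; c^{\top} w^{*}(\hat{c}) - z^{*}(c),
\qquad
w^{*}(\hat{c}) \in \arg\min_{w \in S} \hat{c}^{\top} w,
\quad
z^{*}(c) = \min_{w \in S} c^{\top} w .
```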

Approximate Positively Correlated Distributions and Approximation Algorithms for D-optimal Design

Experimental design is a classical problem in statistics and has also found new applications in machine learning. In the experimental design problem, the aim is to estimate an unknown m-dimensional vector x from linear measurements, where Gaussian noise is introduced in each measurement. The goal is to pick k out of the given … Read more
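
In standard notation, the D-optimal design problem of picking k measurements can be stated as follows (a textbook formulation included for context, not copied from the paper's abstract):

```latex
% D-optimal design: choose k of the n measurement vectors a_1, ..., a_n so that
% the information matrix of the selected set has maximal (log-)determinant,
% i.e., the confidence ellipsoid for the estimate of x has minimal volume.
\max_{S \subseteq \{1,\dots,n\},\ |S| = k} \ \log\det\Bigl(\sum_{i \in S} a_i a_i^{\top}\Bigr) .
```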