Accelerated Stochastic Peaceman-Rachford Method for Empirical Risk Minimization

This work is devoted to studying an Accelerated Stochastic Peaceman-Rachford Splitting Method (AS-PRSM) for solving a family of structured empirical risk minimization problems. The objective function to be optimized is the sum of a possibly nonsmooth convex function and a finite sum of smooth convex component functions. The smooth subproblem in AS-PRSM is solved by a stochastic gradient method using variance reduction … Read more
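
The accelerated stochastic variant itself is not spelled out in this excerpt, but a minimal sketch of the classical (deterministic) Peaceman-Rachford splitting it builds on, for $\min f(x) + g(z)$ subject to $x = z$, may help fix ideas. Everything below is a hypothetical illustration (a quadratic $f$ standing in for the smooth finite sum, an $\ell_1$ term $g$, penalty $\beta$), not the paper's method.

```python
import numpy as np

def prsm(prox_f, prox_g, x0, beta=1.0, iters=200):
    """Classical Peaceman-Rachford splitting for min f(x) + g(z) s.t. x = z,
    in scaled dual form; unlike ADMM, the dual variable u is updated twice."""
    z = x0.copy()
    u = np.zeros_like(x0)
    for _ in range(iters):
        x = prox_f(z - u, 1.0 / beta)   # f-subproblem (the one AS-PRSM solves
                                        # inexactly by variance-reduced SGD)
        u = u + (x - z)                 # first (half) dual update
        z = prox_g(x + u, 1.0 / beta)   # g-subproblem (nonsmooth term)
        u = u + (x - z)                 # second dual update
    return z

# Hypothetical instance: f(x) = 0.5*||x - a||^2, g = lam*||.||_1.
a, lam = np.array([3.0, -1.0, 0.5]), 0.4
prox_f = lambda v, t: (v + t * a) / (1.0 + t)                     # prox of 0.5||x-a||^2
prox_g = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)  # soft-threshold
print(prsm(prox_f, prox_g, np.zeros(3)))   # -> approx. soft-threshold of a
```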

A Homogeneous Predictor-Corrector Algorithm for Stochastic Nonsymmetric Convex Conic Optimization With Discrete Support

We consider a stochastic convex optimization problem over nonsymmetric cones with discrete support. This class of optimization problems has not been studied yet. By using a logarithmically homogeneous self-concordant barrier function, we present a homogeneous predictor-corrector interior-point algorithm for solving stochastic nonsymmetric conic optimization problems. We also derive an iteration bound for the proposed algorithm. … Read more
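
A quick reminder of the key object invoked here: a $\nu$-logarithmically homogeneous self-concordant barrier $F$ for a cone $K$ satisfies
\[
F(tx) = F(x) - \nu \log t \qquad \text{for all } x \in \operatorname{int} K,\; t > 0,
\]
which, differentiating at $t = 1$, yields the identity $\langle \nabla F(x), x \rangle = -\nu$ that drives homogeneous interior-point analyses.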

Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient

We present PDLP, a practical first-order method for linear programming (LP) that can solve to the high levels of accuracy that are expected in traditional LP applications. In addition, it can scale to very large problems because its core operations are matrix-vector multiplications. PDLP is derived by applying the primal-dual hybrid gradient (PDHG) method, popularized … Read more
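
PDLP layers presolve, preconditioning, restarts, and adaptive step sizes on top of PDHG; the sketch below shows only the raw PDHG iteration applied to the LP saddle point $\min_{x \ge 0} \max_{y} \; c^\top x + y^\top (b - Ax)$, on a toy instance with hypothetical step sizes. It is not the PDLP implementation.

```python
import numpy as np

def pdhg_lp(c, A, b, iters=5000):
    """Raw PDHG for  min_{x >= 0} max_y  c @ x + y @ (b - A @ x).
    The per-iteration work is two matrix-vector products, which is what
    lets this family of methods scale to very large LPs."""
    m, n = A.shape
    eta = 0.9 / np.linalg.norm(A, 2)   # tau = sigma = eta gives tau*sigma*||A||^2 < 1
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        x_new = np.maximum(x - eta * (c - A.T @ y), 0.0)  # primal step + projection
        y = y + eta * (b - A @ (2 * x_new - x))           # dual step with extrapolation
        x = x_new
    return x, y

# Toy LP: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0  (optimum at x = (1, 0)).
c, A, b = np.array([1.0, 2.0]), np.array([[1.0, 1.0]]), np.array([1.0])
x, y = pdhg_lp(c, A, b)
print(x)  # approaches [1, 0]
```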

Optimal Convergence Rates for the Proximal Bundle Method

We study convergence rates of the classic proximal bundle method for a variety of nonsmooth convex optimization problems. We show that, without any modification, this algorithm adapts to converge faster in the presence of smoothness or a Hölder growth condition. Our analysis reveals that with a constant stepsize, the bundle method is adaptive, yet it … Read more
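
For reference, one step of the classic method minimizes a cutting-plane model of $f$ plus a proximal term around the current stability center $\hat{x}_k$:
\[
x_{k+1} \in \operatorname*{argmin}_{x} \; \max_{i \in B_k} \left\{ f(x_i) + \langle g_i, x - x_i \rangle \right\} + \frac{1}{2 t_k} \|x - \hat{x}_k\|^2,
\]
where $g_i \in \partial f(x_i)$ are previously collected subgradients; the "constant stepsize" regime referred to above is $t_k \equiv t$.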

Universal Conditional Gradient Sliding for Convex Optimization

In this paper, we present a first-order projection-free method, namely, the universal conditional gradient sliding (UCGS) method, for computing ε-approximate solutions to convex differentiable optimization problems. For objective functions with Hölder continuous gradients, we show that UCGS is able to terminate with an ε-solution using at most O((1/ε)^(2/(1+3v))) gradient evaluations and O((1/ε)^(4/(1+3v))) linear objective optimizations, where … Read more
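
Here $v \in (0, 1]$ is the Hölder exponent of the gradient, i.e. $\|\nabla f(x) - \nabla f(y)\| \le M \|x - y\|^{v}$ for some $M > 0$. As a sanity check, $v = 1$ (Lipschitz-continuous gradients) turns the two bounds into O((1/ε)^(1/2)) gradient evaluations and O(1/ε) linear objective optimizations, the known complexities of conditional gradient sliding in the smooth case.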

A Framework for Multi-stage Bonus Allocation in Meal-Delivery Platform

Online meal delivery is undergoing explosive growth as the service becomes increasingly popular. A meal-delivery platform aims to provide efficient services for customers and restaurants. However, in reality, several hundred thousand orders are canceled per day on the Meituan meal-delivery platform because they are not accepted by the crowdsourcing drivers, which is … Read more

Variants of the A-HPE and large-step A-HPE algorithms for strongly convex problems with applications to accelerated high-order tensor methods

For solving strongly convex optimization problems, we propose and study the global convergence of variants of the A-HPE and large-step A-HPE algorithms of Monteiro and Svaiter. We prove \emph{linear} and \emph{superlinear} $\mathcal{O}\left(k^{\,-k\left(\frac{p-1}{p+1}\right)}\right)$ global rates for the proposed variants of the A-HPE and large-step A-HPE methods, respectively. The parameter $p\geq 2$ appears in the (high-order) … Read more

Some Modified Fast Iteration Shrinkage Thresholding Algorithms with a New Adaptive Non-monotone Stepsize Strategy for Nonsmooth and Convex Minimization Problems

The "fast iterative shrinkage-thresholding algorithm" (FISTA) is one of the most famous first-order optimization schemes, and its stepsize, which plays an important role in both theoretical analysis and numerical experiments, is typically determined either by a constant related to the Lipschitz constant or by a backtracking strategy. In this paper, we design a new … Read more
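
The paper's adaptive stepsize rule is not given in this excerpt; for orientation, here is the baseline constant-stepsize FISTA it modifies, applied to a hypothetical lasso instance $\min_x 0.5\|Ax - b\|^2 + \lambda\|x\|_1$. The stepsize $1/L$ below is exactly the "constant related to the Lipschitz constant" mentioned above.

```python
import numpy as np

def fista(grad_f, prox_g, x0, L, iters=300):
    """Baseline FISTA with constant stepsize 1/L, where L is the Lipschitz
    constant of grad f; the paper's variants replace this with an adaptive
    non-monotone stepsize needing neither L nor backtracking."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_new = prox_g(y - grad_f(y) / L, 1.0 / L)      # proximal gradient step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # Nesterov momentum
        x, t = x_new, t_new
    return x

# Hypothetical lasso: f(x) = 0.5*||Ax - b||^2, g(x) = lam*||x||_1.
rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((20, 5)), rng.standard_normal(20), 0.5
grad_f = lambda x: A.T @ (A @ x - b)
prox_g = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)
print(fista(grad_f, prox_g, np.zeros(5), L=np.linalg.norm(A, 2) ** 2))
```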

Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis

Average-case analysis computes the complexity of an algorithm averaged over all possible inputs. Compared to worst-case analysis, it is more representative of the typical behavior of an algorithm, but remains largely unexplored in optimization. One difficulty is that the analysis can depend on the probability distribution of the inputs to the model. However, we show … Read more

Convergence of Proximal Gradient Algorithm in the Presence of Adjoint Mismatch

We consider the proximal gradient algorithm for solving penalized least-squares minimization problems arising in data science. This first-order algorithm is attractive due to its flexibility and minimal memory requirements, which allow it to tackle large-scale minimization problems involving non-smooth penalties. However, for problems such as X-ray computed tomography, the applicability of the algorithm is dominated by the … Read more
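
To make the adjoint mismatch concrete: for $\min_x 0.5\|Hx - y\|^2 + g(x)$, the exact proximal gradient step is $x \leftarrow \mathrm{prox}_{\gamma g}(x - \gamma H^\top (Hx - y))$, but in X-ray CT the backprojector is often a separately implemented operator $B \approx H^\top$ rather than the true adjoint. The sketch below, on hypothetical data, simply runs the iteration with such a surrogate; whether (and where) it still converges is the question the paper addresses.

```python
import numpy as np

def prox_grad_mismatch(H, B, y, prox_g, gamma, x0, iters=500):
    """Proximal gradient for 0.5*||H x - y||^2 + g(x) where the true adjoint
    H.T is replaced by a mismatched surrogate B (e.g. an unmatched
    backprojector), so the 'gradient' actually used is B @ (H @ x - y)."""
    x = x0.copy()
    for _ in range(iters):
        x = prox_g(x - gamma * (B @ (H @ x - y)), gamma)
    return x

# Hypothetical instance: B is a slight perturbation of the true adjoint H.T.
rng = np.random.default_rng(1)
H = rng.standard_normal((30, 10))
B = H.T + 0.01 * rng.standard_normal((10, 30))   # mismatched adjoint
y = rng.standard_normal(30)
prox_g = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * 0.1, 0.0)  # g = 0.1*||.||_1
gamma = 1.0 / np.linalg.norm(H, 2) ** 2
print(prox_grad_mismatch(H, B, y, prox_g, gamma, np.zeros(10)))
```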