On the Convergence and Complexity of Proximal Gradient and Accelerated Proximal Gradient Methods under Adaptive Gradient Estimation

In this paper, we propose a proximal gradient method and an accelerated proximal gradient method for solving composite optimization problems, where the objective function is the sum of a smooth and a convex, possibly nonsmooth, function. We consider settings where the smooth component is either a finite-sum function or an expectation of a stochastic function, … Read more
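For readers unfamiliar with the composite setting, the basic deterministic proximal gradient iteration can be sketched as below. This is a minimal illustration, not the method proposed in the paper: it uses exact gradients rather than adaptive gradient estimation. The test problem (a separable quadratic plus an l1 penalty) and the names `proximal_gradient`, `soft_threshold`, `a`, `b`, and `lam` are all assumptions made for the example.

```python
# Basic proximal gradient iteration for min_x f(x) + g(x), with f smooth
# and g convex, possibly nonsmooth. Illustrative only: exact gradients,
# fixed step size, hypothetical test problem.

def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def proximal_gradient(grad_f, prox_g, x0, step, iters=200):
    """Repeat x <- prox_{step*g}(x - step * grad_f(x))."""
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        x = prox_g([xi - step * gi for xi, gi in zip(x, g)], step)
    return x

# Assumed test problem: f(x) = 0.5 * sum(a_i * (x_i - b_i)^2), g(x) = lam * ||x||_1.
a = [1.0, 2.0, 4.0]
b = [3.0, -0.2, 1.0]
lam = 0.5

def grad_f(x):
    return [ai * (xi - bi) for ai, xi, bi in zip(a, x, b)]

def prox_g(v, step):
    return [soft_threshold(vi, step * lam) for vi in v]

# The gradient of f is Lipschitz with constant max(a), so step = 1 / max(a).
x_star = proximal_gradient(grad_f, prox_g, [0.0, 0.0, 0.0], step=1.0 / max(a))

# For this separable problem the minimizer is known in closed form.
x_closed = [soft_threshold(bi, lam / ai) for ai, bi in zip(a, b)]
```

Because the assumed problem is separable, `x_star` can be checked coordinate by coordinate against `x_closed`, which makes the sketch easy to verify.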

Faster stochastic cubic regularized Newton methods with momentum

Cubic regularized Newton (CRN) methods have attracted significant research interest because they offer stronger solution guarantees and lower iteration complexity. With the rise of the big-data era, there is growing interest in developing stochastic cubic regularized Newton (SCRN) methods that do not require exact gradient and Hessian evaluations. In this paper, we propose faster SCRN … Read more

Stochastic Approximation with Block Coordinate Optimal Stepsizes

We consider stochastic approximation with block-coordinate stepsizes and propose adaptive stepsize rules that aim to minimize the expected distance from the next iterate to an optimal point. These stepsize rules employ online estimates of the second moment of the search direction along each block coordinate. The popular Adam algorithm can be interpreted as a particular … Read more

Recursive Bound-Constrained AdaGrad with Applications to Multilevel and Domain Decomposition Minimization

Two OFFO (Objective-Function Free Optimization) noise-tolerant algorithms are presented that handle bound constraints, inexact gradients and use second-order information when available. The first is a multi-level method exploiting a hierarchical description of the problem and the second is a domain-decomposition method covering the standard additive Schwarz decompositions. Both are generalizations of the first-order AdaGrad … Read more

A Randomized Algorithm for Sparse PCA based on the Basic SDP Relaxation

Sparse Principal Component Analysis (SPCA) is a fundamental technique for dimensionality reduction, and is NP-hard. In this paper, we introduce a randomized approximation algorithm for SPCA, which is based on the basic SDP relaxation. Our algorithm has an approximation ratio of at most the sparsity constant with high probability, if called enough times. Under a … Read more

Constrained Enumeration of Lucky Tickets: Prime Digits, Uniqueness, and Greedy Heuristics

We revisit the classical Lucky Ticket (LT) enumeration problem, in which an even-digit number is called lucky if the sum of the digits of its first half equals that of its second half. We introduce two new subclasses: SuperLucky Tickets (SLTs), where all digits are distinct, and LuckyPrime Tickets (LPTs), where all digits … Read more
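The two definitions that are fully stated above (lucky, and lucky with all digits distinct) can be counted with a short sketch. It assumes the classical convention that leading zeros are allowed, so a 6-digit ticket ranges over 000000 to 999999. The function names are hypothetical, and the LuckyPrime subclass is left out because its definition is truncated in the excerpt.

```python
from itertools import permutations

def count_lucky(num_digits):
    """Count lucky tickets with num_digits digits (leading zeros allowed)."""
    if num_digits % 2 != 0:
        raise ValueError("num_digits must be even")
    half = num_digits // 2
    # ways[s] = number of digit strings of the current length with digit sum s
    ways = [1]
    for _ in range(half):
        new = [0] * (len(ways) + 9)
        for s, w in enumerate(ways):
            for d in range(10):
                new[s + d] += w
        ways = new
    # A ticket is lucky when both halves share the same digit sum.
    return sum(w * w for w in ways)

def count_superlucky(num_digits):
    """Count lucky tickets whose digits are all distinct (brute force)."""
    if num_digits % 2 != 0:
        raise ValueError("num_digits must be even")
    half = num_digits // 2
    # permutations of the ten digits yields exactly the distinct-digit tickets
    return sum(
        1
        for p in permutations(range(10), num_digits)
        if sum(p[:half]) == sum(p[half:])
    )

lucky_6 = count_lucky(6)        # classical 6-digit count
super_4 = count_superlucky(4)   # 4-digit tickets with distinct digits
```

`count_lucky` uses a digit-sum convolution rather than enumeration, so it scales to longer tickets. `count_superlucky` is plain brute force and is only practical for short tickets; no ticket longer than 10 digits can have all digits distinct.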

Complexity of normalized stochastic first-order methods with momentum under heavy-tailed noise

In this paper, we propose practical normalized stochastic first-order methods with Polyak momentum, multi-extrapolated momentum, and recursive momentum for solving unconstrained optimization problems. These methods employ dynamically updated algorithmic parameters and do not require explicit knowledge of problem-dependent quantities such as the Lipschitz constant or noise bound. We establish first-order oracle complexity results for finding … Read more

A Variational Analysis Approach for Bilevel Hyperparameter Optimization with Sparse Regularization

We study a bilevel optimization framework for hyperparameter learning in variational models, with a focus on sparse regression and classification tasks. In particular, we consider a weighted elastic-net regularizer, where feature-wise regularization parameters are learned through a bilevel formulation. A key novelty of our approach is the use of a Forward-Backward (FB) reformulation of the … Read more

Toward Decision-Oriented Prognostics: An Integrated Estimate-Optimize Framework for Predictive Maintenance

Recent research increasingly integrates machine learning (ML) into predictive maintenance (PdM) to reduce operational and maintenance costs in data-rich operational settings. However, uncertainty due to model misspecification continues to limit widespread industrial adoption. This paper investigates a PdM framework in which sensor-driven prognostics inform decision-making under economic trade-offs within a finite decision space. We investigate … Read more

Two-way Cutting-plane Algorithm for Best Subset Selection Considering Multicollinearity

When linear dependence exists between some explanatory variables in a regression model, the estimates of regression coefficients become unstable, thereby making the interpretation of the estimation results unreliable. To eliminate such multicollinearity, we propose a high-performance method for selecting the best subset of explanatory variables for linear and logistic regression models. Specifically, we first derive … Read more