A Majorization-Minimization approach for multiclass classification in a big data scenario

This work presents a novel optimization approach for training linear classifiers in multiclass classification tasks, focusing on a regularized and smooth Weston-Watkins support vector machine (SVM) model. We propose a Majorization-Minimization (MM) algorithm to solve the resulting Lipschitz-differentiable optimization problem. To enhance the scalability of the algorithm when tackling large datasets, we introduce an incremental …
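As a minimal sketch of the MM principle underlying such methods (not the paper's specific majorizer), one can upper-bound a Lipschitz-smooth \(f\) by the quadratic \(g(x \mid y) = f(y) + \nabla f(y)^\top (x - y) + \tfrac{L}{2}\|x - y\|^2\) and minimize the surrogate at each iterate; for this particular majorizer the surrogate minimizer reduces to a gradient step:

```python
import numpy as np

def mm_minimize(grad, L, x0, iters=100):
    # Generic MM with the quadratic majorizer
    #   g(x | y) = f(y) + grad(y)^T (x - y) + (L/2) ||x - y||^2,
    # valid when grad is L-Lipschitz; minimizing g over x gives a gradient
    # step, and majorization guarantees monotone descent of f.
    x = x0.copy()
    for _ in range(iters):
        x = x - grad(x) / L  # argmin_x g(x | y) at y = x
    return x

# toy least-squares instance: f(x) = 0.5 * ||A x - b||^2
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 3.0])
L = np.linalg.eigvalsh(A.T @ A).max()       # Lipschitz constant of the gradient
x_hat = mm_minimize(lambda x: A.T @ (A @ x - b), L, np.zeros(2))
```

The majorizer and step size are illustrative assumptions; the paper's algorithm for the Weston-Watkins SVM will use a problem-specific surrogate.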

A Geometric Perspective on Polynomially Solvable Convex Maximization

Convex maximization encompasses a broad class of optimization problems and is generally NP-hard, even for low-rank objectives. This paper investigates structural conditions under which convex maximization becomes polynomially solvable. From a geometric perspective, we introduce comonotonicity, a structural property of the feasible region crucial for problem tractability, and establish mathematical characterizations of this property. Under comonotonicity and …

An Inexact Modified Quasi-Newton Method for Nonsmooth Regularized Optimization

We introduce iR2N, a modified proximal quasi-Newton method for minimizing the sum of a \(C^1\) function \(f\) and a lower semi-continuous, prox-bounded function \(h\), which permits inexact evaluations of \(f\), \(\nabla f\), and of the relevant proximal operators. Both \(f\) and \(h\) may be nonconvex. In applications where the proximal operator of \(h\) is not …

The Convexity Zoo: A Taxonomy of Function Classes in Optimization

The tractability of optimization problems depends critically on structural properties of the objective function. Convexity guarantees global optimality of local solutions and enables polynomial-time algorithms under mild assumptions, but many problems arising in modern applications—particularly in machine learning—are inherently nonconvex. Remarkably, a large class of such problems remains amenable to efficient optimization due to additional …

Primal-dual resampling for solution validation in convex stochastic programming

Suppose we wish to determine the quality of a candidate solution to a convex stochastic program in which the objective function is a statistical functional parameterized by the decision variable and known deterministic constraints may be present. Inspired by stopping criteria in primal-dual and interior-point methods, we develop cancellation theorems that characterize the convergence of …

Fast and Simple Multiclass Data Segmentation: An Eigendecomposition and Projection-Free Approach

Graph-based machine learning has seen increased interest over the last decade, with many connections to other fields of applied mathematics. Learning based on partial differential equations, such as the phase-field Allen-Cahn equation, allows efficient handling of semi-supervised learning approaches on graphs. The numerical solution of the graph Allen-Cahn equation via a convexity splitting or …

Iterative Sampling Methods for Sinkhorn Distributionally Robust Optimization

Distributionally robust optimization (DRO) has emerged as a powerful paradigm for reliable decision-making under uncertainty. This paper focuses on DRO with ambiguity sets defined via the Sinkhorn discrepancy, an entropy-regularized Wasserstein distance; this setting is referred to as Sinkhorn DRO. Existing work primarily addresses Sinkhorn DRO from a dual perspective, leveraging its formulation as a conditional stochastic optimization …
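The entropy-regularized Wasserstein distance underlying the Sinkhorn discrepancy can be computed by the classical Sinkhorn-Knopp iterations; the sketch below is an illustration of that regularized transport computation only, not of the paper's DRO algorithm, and the regularization parameter `eps` and iteration count are illustrative choices:

```python
import numpy as np

def sinkhorn_plan(mu, nu, C, eps=0.1, iters=500):
    # Sinkhorn-Knopp iterations for the entropy-regularized OT plan
    # between discrete marginals mu, nu with cost matrix C.
    K = np.exp(-C / eps)          # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu / (K.T @ u)        # scale to match column marginals
        u = mu / (K @ v)          # scale to match row marginals
    return u[:, None] * K * v[None, :]

# toy example: three source and three target points on a line
x = np.array([0.0, 0.5, 1.0])
y = np.array([0.1, 0.6, 0.9])
C = (x[:, None] - y[None, :]) ** 2          # squared-distance cost
mu = np.full(3, 1.0 / 3.0)
nu = np.full(3, 1.0 / 3.0)
P = sinkhorn_plan(mu, nu, C)
reg_cost = np.sum(P * C)                    # transport cost under the plan
```

At convergence the plan's marginals match `mu` and `nu`, which is the fixed point the alternating scalings enforce.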

An Elementary Proof of the Near Optimality of LogSumExp Smoothing

We consider the design of smoothings of the (coordinate-wise) max function in $\mathbb{R}^d$ in the infinity norm. The LogSumExp function $f(x)=\ln\left(\sum_{i=1}^d\exp(x_i)\right)$ provides a classical smoothing, differing from the max function in value by at most $\ln(d)$. We provide an elementary construction of a lower bound, establishing that every overestimating smoothing of the max function must …
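The $\ln(d)$ bound from the abstract is easy to check numerically; a minimal sketch with a numerically stable LogSumExp (the shift by the max avoids overflow):

```python
import numpy as np

def logsumexp(x):
    # stable evaluation of ln(sum_i exp(x_i)): subtract the max
    # before exponentiating so no exp overflows
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

x = np.array([1.0, 2.0, 3.0, 4.0])
gap = logsumexp(x) - np.max(x)   # overestimation gap, between 0 and ln(d)
```

Since $\exp(x_i - \max_j x_j) \le 1$ with equality for the maximizing coordinate, the gap always lies in $[0, \ln(d)]$, matching the abstract's claim.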

A Taxonomy of Multi-Objective Alignment Techniques for Large Language Models

Aligning large language models (LLMs) with human preferences has evolved from single-objective reward maximization to sophisticated multi-objective optimization. Real-world deployment requires balancing competing objectives (helpfulness, harmlessness, honesty, instruction-following, and task-specific capabilities) that often conflict. This survey provides a systematic taxonomy of multi-objective alignment techniques, organizing the rapidly growing literature into four categories: (1) Reward Decomposition approaches that …

Data-Dependent Complexity of First-Order Methods for Binary Classification

Large-scale problems in data science are often modeled with optimization, and the optimization model is usually solved with first-order methods that may converge at a sublinear rate. Therefore, it is of interest to terminate the optimization algorithm as soon as the underlying data science task is accomplished. We consider FISTA for solving two binary classification …
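For reference, the FISTA scheme the abstract mentions is the accelerated proximal gradient method of Beck and Teboulle; a minimal sketch with a fixed step $1/L$ (no backtracking), tested on an $\ell_1$-regularized problem with a closed-form solution — the specific objective below is an illustrative choice, not the paper's classification model:

```python
import numpy as np

def soft_threshold(z, tau):
    # proximal operator of tau * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def fista(grad, prox, x0, L, iters=200):
    # FISTA: accelerated proximal gradient with step 1/L and
    # the standard momentum sequence t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2
    x = x0.copy()
    y = x0.copy()
    t = 1.0
    for _ in range(iters):
        x_next = prox(y - grad(y) / L, 1.0 / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x

# sanity check: minimize 0.5 * ||x - b||^2 + ||x||_1,
# whose exact solution is soft_threshold(b, 1)
b = np.array([3.0, -0.5, 1.5])
x_star = fista(grad=lambda x: x - b, prox=soft_threshold,
               x0=np.zeros(3), L=1.0)
```

The paper's point is that for classification one may stop such an iteration once the predicted labels stabilize, well before the objective converges.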