A Derivation of Nesterov’s Accelerated Gradient Algorithm from Optimal Control Theory

Nesterov’s accelerated gradient algorithm is derived from first principles. The first principles are founded on the recently developed optimal control theory for optimization. The necessary conditions for optimal control generate a controllable dynamical system for accelerated optimization. Stabilizing this system via a control Lyapunov function generates an ordinary differential equation. An Euler discretization of the differential … Read more
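For reference, the algorithm the abstract derives is, in its familiar textbook form (Beck–Teboulle-style momentum, not the control-theoretic derivation of the paper), a minimal sketch:

```python
import numpy as np

def nesterov_agd(grad, x0, L, steps=100):
    """Standard Nesterov accelerated gradient iteration for an
    L-smooth convex objective (textbook form, for illustration only)."""
    x, y = x0.copy(), x0.copy()
    t = 1.0
    for _ in range(steps):
        x_next = y - grad(y) / L                        # gradient step at the extrapolated point
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_next + ((t - 1) / t_next) * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x

# Toy instance: minimize f(x) = 0.5||x||^2 (so grad(x) = x and L = 1);
# the iterates converge to the minimizer 0.
x_star = nesterov_agd(lambda x: x, np.array([5.0, -3.0]), L=1.0)
```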

A New Insight on Augmented Lagrangian Method with Applications in Machine Learning

By exploiting double-penalty terms for the primal subproblem, we develop a novel relaxed augmented Lagrangian method for solving a family of convex optimization problems subject to equality or inequality constraints. This new method is then extended to solve a general multi-block separable convex optimization problem, and two related primal-dual hybrid gradient algorithms are also discussed. … Read more
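As background, the classical single-penalty augmented Lagrangian iteration that the paper's double-penalty variant relaxes can be sketched on a toy equality-constrained problem (the instance and closed-form inner solve below are chosen here for illustration; this is not the paper's method):

```python
import numpy as np

def alm_equality(a, rho=1.0, iters=50):
    """Classical augmented Lagrangian method for
    min 0.5||x||^2  s.t.  a^T x = 1.
    The primal subproblem is quadratic and solved in closed form."""
    lam = 0.0
    for _ in range(iters):
        # argmin_x 0.5||x||^2 + lam*(a^T x - 1) + (rho/2)(a^T x - 1)^2
        x = (rho - lam) / (1.0 + rho * (a @ a)) * a
        lam += rho * (a @ x - 1.0)        # multiplier (dual) update
    return x

a = np.array([3.0, 4.0])
x = alm_equality(a)   # converges to the optimum a / ||a||^2 = [0.12, 0.16]
```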

Accelerated Stochastic Peaceman-Rachford Method for Empirical Risk Minimization

This work is devoted to studying an Accelerated Stochastic Peaceman-Rachford Splitting Method (AS-PRSM) for solving a family of structured empirical risk minimization problems. The objective function to be optimized is the sum of a possibly nonsmooth convex function and a finite sum of smooth convex component functions. The smooth subproblem in AS-PRSM is solved by a stochastic gradient method using variance reduction … Read more
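The deterministic Peaceman-Rachford splitting that AS-PRSM accelerates can be sketched on a lasso instance (a toy illustration of the splitting scheme itself; the paper's AS-PRSM additionally solves the smooth subproblem with a variance-reduced stochastic gradient method):

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prsm_lasso(A, b, lam, beta=1.0, iters=200):
    """Peaceman-Rachford splitting for
    min 0.5||Ax - b||^2 + lam*||z||_1  s.t.  x = z,
    using a scaled dual variable u and two dual updates per sweep."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    Atb = A.T @ b
    M = A.T @ A + beta * np.eye(n)
    for _ in range(iters):
        x = np.linalg.solve(M, Atb + beta * (z - u))  # smooth subproblem
        u = u + (x - z)                               # first dual update
        z = soft_threshold(x + u, lam / beta)         # proximal (l1) subproblem
        u = u + (x - z)                               # second dual update
    return z

# With A = I the lasso solution is soft_threshold(b, lam) = [1, 0].
z = prsm_lasso(np.eye(2), np.array([2.0, -0.5]), lam=1.0)
```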

Exact Convergence Rates of Alternating Projections for Nontransversal Intersections

We study the exact convergence rate of the alternating projection method for the nontransversal intersection of a semialgebraic set and a linear subspace. If the linear subspace is a line, the exact rates are expressed by multiplicities of the defining polynomials of the semialgebraic set, or related power series. Our methods are also applied to … Read more
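The phenomenon studied, slow convergence at a nontransversal intersection, is easy to see on a toy instance chosen here for illustration: alternating projections between the unit circle (a semialgebraic set) and the tangent line y = 1, which meet nontransversally at the single point (0, 1).

```python
import numpy as np

def alternating_projections(p, iters):
    """Alternate between projecting onto the unit circle and onto the
    tangent line y = 1.  Because the intersection {(0, 1)} is
    nontransversal, convergence is sublinear rather than linear."""
    for _ in range(iters):
        p = p / np.linalg.norm(p)   # project onto the circle
        p = np.array([p[0], 1.0])   # project onto the line y = 1
    return p

# Starting from (1, 1), the x-coordinate after k iterations is exactly
# 1/sqrt(k+1): the iterates approach (0, 1) only at a rate ~ 1/sqrt(k).
p = alternating_projections(np.array([1.0, 1.0]), 1000)
```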

An optimization problem for dynamic OD trip matrix estimation on transit networks with different types of data collection units

Dynamic O-D trip matrices for public transportation systems provide a valuable source of information on the usage of a public transportation system, which may be used either by planners for a better design of the transportation facilities or by administrations to characterize the efficiency of the transport system both in peak hours and … Read more

Rank computation in Euclidean Jordan algebras

Euclidean Jordan algebras are the abstract foundation for symmetric cone optimization. Every element in a Euclidean Jordan algebra has a complete spectral decomposition analogous to the spectral decomposition of a real symmetric matrix into rank-one projections. The spectral decomposition in a Euclidean Jordan algebra stems from the likewise-analogous characteristic polynomial of its elements, whose degree is … Read more
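The concrete special case the abstract appeals to, real symmetric matrices, can be sketched directly: the spectral decomposition expresses an element as a sum of eigenvalue-weighted rank-one projections, and the rank of that algebra is n, the degree of the characteristic polynomial (a sketch of the analogy only, not the paper's rank computation):

```python
import numpy as np

# In the Euclidean Jordan algebra of n x n real symmetric matrices,
# A = sum_i lambda_i q_i q_i^T is the spectral decomposition into
# rank-one projections q_i q_i^T.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, Q = np.linalg.eigh(A)   # eigenvectors are the columns of Q

# Rebuild A from its rank-one projections.
A_rebuilt = sum(lam * np.outer(q, q) for lam, q in zip(eigvals, Q.T))

# The rank of this algebra is n: every element has exactly n spectral
# values (counted with multiplicity).
rank_of_algebra = A.shape[0]
```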

FISTA and Extensions – Review and New Insights

The purpose of this technical report is to review the main properties of an accelerated composite gradient (ACG) method commonly referred to as the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA). In addition, we state a version of FISTA for solving both convex and strongly convex composite minimization problems and derive its iteration complexities to generate iterates … Read more
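For orientation, the basic convex variant of FISTA reviewed in the report can be sketched on a lasso instance (a textbook Beck-Teboulle sketch chosen here for illustration, without the strongly convex extensions the report also treats):

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(A, b, lam, iters=500):
    """Textbook FISTA for the lasso  min 0.5||Ax - b||^2 + lam*||x||_1:
    a proximal-gradient step at an extrapolated point, plus momentum."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth gradient
    x = y = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(iters):
        x_next = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)  # prox-grad step
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_next + ((t - 1) / t_next) * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x

# With A = I the lasso solution is soft_threshold(b, lam) = [1, 0].
x = fista(np.eye(2), np.array([2.0, -0.5]), lam=1.0)
```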

A new stopping criterion for Krylov solvers applied in Interior Point Methods

This paper presents a surprising result with possibly far-reaching consequences for any optimization technique that relies on Krylov subspace methods to solve the underlying systems of linear equations. The advantages of the new technique are illustrated in the context of Interior Point Methods (IPMs). When an iterative method is … Read more
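As context, the conventional setup the paper improves on is a Krylov solver with the usual relative-residual stopping rule; a minimal conjugate-gradient sketch with that standard criterion (the paper's new criterion itself is not reproduced here):

```python
import numpy as np

def cg(A, b, tol=1e-8, max_iter=1000):
    """Conjugate gradients for SPD A with the standard stopping rule
    ||r_k|| <= tol * ||b|| -- the criterion IPM implementations
    conventionally use for their inner linear systems."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * np.linalg.norm(b):   # stopping criterion
            break
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cg(A, b)   # solves the 2x2 SPD system to the requested tolerance
```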

Frank-Wolfe and friends: a journey into projection-free first-order optimization methods

Invented some 65 years ago in a seminal paper by Marguerite Straus-Frank and Philip Wolfe, the Frank-Wolfe method has recently enjoyed a remarkable revival, fuelled by the need for fast and reliable first-order optimization methods in Data Science and other relevant application areas. This review tries to explain the success of this approach by illustrating the versatility … Read more
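The defining feature of the method, replacing projections with a linear minimization oracle (LMO), fits in a few lines; a plain sketch with the standard 2/(k+2) step size on a toy l1-ball instance chosen here for illustration:

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, iters=500):
    """Plain Frank-Wolfe: each iteration calls a linear minimization
    oracle over the feasible set instead of a projection, then moves to
    a convex combination of the current point and the oracle's vertex."""
    x = x0.copy()
    for k in range(iters):
        s = lmo(grad(x))                 # argmin_{s in C} <grad(x), s>
        x += 2.0 / (k + 2) * (s - x)     # convex-combination update
    return x

# Minimize f(x) = 0.5||x - c||^2 over the l1 ball of radius 1.
# The LMO over the l1 ball returns the vertex -sign(g_i) e_i for the
# coordinate i with the largest |g_i|.
c = np.array([2.0, 0.5])
lmo = lambda g: -np.sign(g) * (np.arange(g.size) == np.argmax(np.abs(g)))
x = frank_wolfe(lambda x: x - c, lmo, np.zeros(2))   # converges to [1, 0]
```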

Analysis of the Frank-Wolfe Method for Convex Composite Optimization involving a Logarithmically-Homogeneous Barrier

We present and analyze a new generalized Frank-Wolfe method for the composite optimization problem (P): F^* := \min_x f(Ax) + h(x), where f is a \theta-logarithmically-homogeneous self-concordant barrier and the function h has bounded domain but is possibly non-smooth. We show that our generalized Frank-Wolfe method requires O((Gap_0 + \theta + Var_h)\ln(\delta_0) + (\theta + Var_h)^2/\epsilon) … Read more