A mathematical introduction to SVMs with self-concordant kernel

A derivation of so-called “soft-margin support vector machines with kernel” is presented along with elementary proofs that do not rely on concepts from functional analysis, such as Mercer’s theorem or reproducing kernel Hilbert spaces, which are frequently cited in this context. The analysis leads to new continuity properties of the kernel functions, in particular a …

Iteration Complexity of Fixed-Step Methods by Nesterov and Polyak for Convex Quadratic Functions

This note considers the momentum method by Polyak and the accelerated gradient method by Nesterov, both without line search but with fixed step length, applied to strictly convex quadratic functions, assuming that exact gradients are used and that appropriate upper and lower bounds for the extreme eigenvalues of the Hessian matrix are known. Simple 2-d examples show …
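
As a rough illustration of the setting described in this abstract, the following sketch runs both methods with fixed step lengths on a 2-d strictly convex quadratic. The parameter formulas are the standard textbook choices based on eigenvalue bounds m and M and are an assumption here; the note itself may analyze different values. The function names and the diagonal test problem are hypothetical.

```python
import math


def gradient(x, eigs):
    # Exact gradient of f(x) = 0.5 * sum(eigs[i] * x[i]**2),
    # a quadratic with diagonal Hessian diag(eigs).
    return [lam * xi for lam, xi in zip(eigs, x)]


def heavy_ball(x0, eigs, m, M, iters):
    # Polyak's momentum method with fixed step (textbook parameters, assumed).
    root_m, root_M = math.sqrt(m), math.sqrt(M)
    alpha = 4.0 / (root_M + root_m) ** 2
    beta = ((root_M - root_m) / (root_M + root_m)) ** 2
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        g = gradient(x, eigs)
        x_next = [xi - alpha * gi + beta * (xi - xp)
                  for xi, gi, xp in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return x


def nesterov(x0, eigs, m, M, iters):
    # Nesterov's accelerated gradient method with fixed step 1/M
    # and constant momentum (textbook parameters, assumed).
    root_m, root_M = math.sqrt(m), math.sqrt(M)
    alpha = 1.0 / M
    beta = (root_M - root_m) / (root_M + root_m)
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        # Extrapolate, then take a gradient step from the extrapolated point.
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]
        g = gradient(y, eigs)
        x_next = [yi - alpha * gi for yi, gi in zip(y, g)]
        x_prev, x = x, x_next
    return x


if __name__ == "__main__":
    eigs = [1.0, 100.0]  # extreme eigenvalues m = 1, M = 100
    print(heavy_ball([1.0, 1.0], eigs, 1.0, 100.0, 300))
    print(nesterov([1.0, 1.0], eigs, 1.0, 100.0, 300))
```

Both routines use only the bounds m and M and exact gradients, with no line search, which matches the assumptions stated in the abstract.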

Optimized convergence of stochastic gradient descent by weighted averaging

Under mild assumptions, stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as an approximate optimal solution. However, in the absence of stochastic noise, the arithmetic mean of all iterates converges considerably more slowly to the optimal solution than the iterates themselves. And also in the …
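
The noise-free claim can be checked on a toy problem. The sketch below is an illustrative assumption and is not taken from the paper: it applies plain gradient descent to f(x) = 0.5 x^2 and compares the last iterate with the arithmetic mean of all iterates. The function name and the step length are hypothetical.

```python
def gradient_descent_with_mean(x0, step, iters):
    # Noise-free gradient descent on f(x) = 0.5 * x**2, whose gradient is x.
    # Returns the last iterate and the arithmetic mean of iterates 1..iters.
    x = x0
    running_sum = 0.0
    for _ in range(iters):
        x = x - step * x
        running_sum += x
    return x, running_sum / iters


if __name__ == "__main__":
    last, mean = gradient_descent_with_mean(1.0, 0.1, 1000)
    print("last iterate:", last)
    print("arithmetic mean:", mean)
```

Here the iterates shrink geometrically, while the mean only decays like 1/k because the early iterates keep contributing to the sum.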

A simple Introduction to higher order liftings for binary problems

A short, simple, and self-contained proof is presented showing that the $n$-th lifting for the max-cut-polytope is exact. The proof re-derives the known observations that the max-cut-polytope is the projection of a higher-dimensional regular simplex and that this simplex coincides with the $n$-th semidefinite lifting. An extension to reduce the dimension of higher order liftings and …

Variance Reduction of Stochastic Gradients Without Full Gradient Evaluation

A standard concept for reducing the variance of stochastic gradient approximations is based on full gradient evaluations every now and then. In this paper an approach is considered that, while approximating a local minimizer of a sum of functions, also generates approximations of the gradient and the function values without relying on full …

Best case exponential running time of a branch-and-bound algorithm using an optimal semidefinite relaxation

Chvátal (1980) has given a simple example of a knapsack problem for which a branch-and-bound algorithm using domination and linear relaxations to eliminate subproblems will use an exponential number of steps in the best case. In this short note it is shown that Chvátal's result remains true when the LP relaxation is replaced with a …

Set-Completely-Positive Representations and Cuts for the Max-Cut Polytope and the Unit Modulus Lifting

This paper considers a generalization of the “max-cut-polytope” $\conv\{\ xx^T\mid x\in\real^n, \ \ |x_k| = 1 \ \hbox{for} \ 1\le k\le n\}$ in the space of real symmetric $n\times n$-matrices with all-ones-diagonal to a complex “unit modulus lifting” $\conv\{xx\HH\mid x\in\complex^n, \ \ |x_k| = 1 \ \hbox{for} \ 1\le k\le n\}$ in the space of …

On Affine Invariant Descent Directions

This paper explores the existence of affine invariant descent directions for unconstrained minimization. While there may exist several affine invariant descent directions for smooth functions $f$ at a given point, it is shown that for quadratic functions there exists exactly one invariant descent direction in the strictly convex case and generally none in the nondegenerate …

A Derivative-Free and Ready-to-Use NLP Solver for Matlab or Octave

This paper introduces a derivative-free and ready-to-use solver for nonlinear programs with nonlinear equality and inequality constraints (NLPs). Using finite differences and a sequential quadratic programming (SQP) approach, the algorithm aims at finding a local minimizer; no extra attempt is made to generate a globally optimal solution. Due to the use of finite differences, …
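
To illustrate the finite-difference idea mentioned in this abstract, here is a minimal sketch of a central-difference gradient approximation. It is written in Python rather than Matlab or Octave, and it is not the solver's actual code: the excerpt does not say which difference scheme or step length the solver uses, so the function name, the central scheme, and the default step h are assumptions.

```python
def central_difference_gradient(f, x, h=1e-6):
    # Approximate the gradient of f at x using central differences:
    # df/dx_i ~ (f(x + h*e_i) - f(x - h*e_i)) / (2*h).
    # Costs two function evaluations per coordinate.
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def example_objective(x):
    # Hypothetical test function with exact gradient (2*x0 + 3*x1, 3*x0).
    return x[0] ** 2 + 3.0 * x[0] * x[1]


if __name__ == "__main__":
    print(central_difference_gradient(example_objective, [1.0, 2.0]))
```

An SQP method can use such approximations in place of exact derivatives of the objective and the constraints, at the price of extra function evaluations and some loss of accuracy.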

The solution of Euclidean norm trust region SQP subproblems via second order cone programs, an overview and elementary introduction

It is well known that convex SQP subproblems with a Euclidean norm trust region constraint can be reduced to second order cone programs for which the theory of Euclidean Jordan-algebras leads to efficient interior-point algorithms. Here, a brief and self-contained outline of the principles of such an implementation is given. All identities relevant for the …