Gradient methods for convex minimization: better rates under weaker conditions

The convergence behavior of gradient methods for minimizing convex differentiable functions is one of the core questions in convex optimization. This paper shows that their well-known complexities can be achieved under conditions weaker than the commonly assumed ones. We relax the common gradient Lipschitz-continuity condition and strong convexity condition to ones that hold only over certain line segments. Specifically, we establish complexities O(R/eps) and O(sqrt{R/eps}) for the ordinary and accelerate gradient methods, respectively, assuming that grad_f is Lipschitz continuous with constant $R$ over the line segment joining x and x-grad_f/R for each x in the domain of f. Then we improve them to O((R/nu) log(1/eps)) and O(sqrt{R/nu} log(1/eps)) for function f that also satisfies the secant inequality >= nu ||x-x*||^2 for each x in the domain of f and its projection x* to the minimizer set of f. The secant condition is also shown to be necessary for the geometric decay of solution error. Not only are the relaxed conditions met by more functions, the restrictions give smaller R and larger nu than they are without the restrictions and thus lead to better complexity bounds. We apply these results to sparse optimization and demonstrate a faster algorithm.

Citation

Rice CAAM Tech Report TR13-04

Article

Download

View PDF