This paper investigates the problem of minimizing a smooth function over a compact set with a probabilistic relative-error gradient oracle.
With some probability the oracle succeeds and returns a relative-error approximation of the true gradient; otherwise it fails and returns an arbitrary vector. Crucially, the optimizer cannot distinguish successful queries from failed ones at any point during the optimization process.
This oracle framework encompasses a wide range of gradient approximation scenarios, including stochastic gradients without moment bounds, gradient quantization, and zeroth-order (derivative-free) approximations.
We analyze the theoretical performance of the Projected and Conditional Gradient methods under this setting for both convex and nonconvex objectives.
Notably, we show that under separability of the oracle and the feasible set, the presence of relative error does not hinder the convergence of these methods.
Additionally, we introduce and analyze two specialized conditional gradient variants: a probabilistic sign-based method and a scaled conditional gradient method for optimization over a class of sets containing norm balls.
Our results have direct implications for the convergence behavior of stochastic, possibly biased, gradients with heavy-tailed error distributions.