A Single Time-Scale Stochastic Approximation Method for Nested Stochastic Optimization

We study constrained nested stochastic optimization problems in which the objective function is a composition of two smooth functions whose exact values and derivatives are not available. We propose a single time-scale stochastic approximation algorithm, which we call the Nested Averaged Stochastic Approximation (NASA), to find an approximate stationary point of the problem. The algorithm maintains two auxiliary averaged sequences (filters), which estimate the gradient of the composite objective function and the value of the inner function. Using a special Lyapunov function, we show that NASA achieves a sample complexity of ${\cal O}(1/\epsilon^2)$ for finding an $\epsilon$-approximate stationary point, thus outperforming all extant methods for nested stochastic approximation. Our method and its analysis are the same for both unconstrained and constrained problems, and require no batch samples for constrained nonconvex stochastic optimization. We also present a simplified variant of the NASA method for solving constrained single-level stochastic optimization problems, and we prove the same complexity result for both unconstrained and constrained problems.
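To make the role of the two filters concrete, the following is a minimal sketch of a single time-scale scheme of this kind, not the authors' exact algorithm. It assumes the problem has the form $\min_{x \in X} f(g(x))$ with noisy samples of $g$ and its Jacobian. The function name `nasa_sketch`, the oracle names, the constants `a`, `b`, `beta`, the constant step size, and the specific update rules are all illustrative assumptions that the abstract does not state. The toy problem (a noisy linear inner map, a quadratic outer function, and a box constraint) is likewise hypothetical.

```python
import numpy as np


def nasa_sketch(x0, sample_inner, sample_jacobian, grad_outer, project,
                n_iters=2000, beta=2.0, a=2.0, b=2.0, seed=0):
    """Illustrative single time-scale scheme with two averaged filters.

    z tracks the gradient of the composition f(g(x)); u tracks the inner value g(x).
    All update rules and constants here are assumptions, not taken from the abstract.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    u = sample_inner(x, rng)                        # initial inner-value estimate
    z = sample_jacobian(x, rng).T @ grad_outer(u)   # initial gradient estimate
    tau = 1.0 / np.sqrt(n_iters)                    # one step size drives every sequence
    for _ in range(n_iters):
        y = project(x - z / beta)                   # projected step along the filtered gradient
        x = x + tau * (y - x)                       # convex combination keeps x feasible
        J = sample_jacobian(x, rng)                 # one noisy Jacobian sample, no batching
        G = sample_inner(x, rng)                    # one noisy inner-function sample
        z = (1.0 - a * tau) * z + a * tau * (J.T @ grad_outer(u))  # gradient filter
        u = (1.0 - b * tau) * u + b * tau * G                      # inner-value filter
    return x, z, u


# Hypothetical toy problem: g(x) = A x observed with noise, f(u) = 0.5 * ||u - c||^2,
# feasible set X = [-1, 1]^2.
A = np.array([[1.0, 0.0], [0.0, 0.5]])
c = np.array([0.5, 2.0])

x_final, z_final, u_final = nasa_sketch(
    x0=np.zeros(2),
    sample_inner=lambda x, rng: A @ x + 0.1 * rng.standard_normal(2),
    sample_jacobian=lambda x, rng: A + 0.1 * rng.standard_normal((2, 2)),
    grad_outer=lambda u: u - c,
    project=lambda v: np.clip(v, -1.0, 1.0),
)
print(x_final)
```

On this toy problem the constrained minimizer is $(0.5, 1)$, so the second coordinate sits on the boundary of the box. The sketch uses one sample per iteration and the same step size `tau` for the iterate and both filters, which is what "single time-scale" is meant to convey here.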
