Fully First-Order Methods for Decentralized Bilevel Optimization

This paper focuses on decentralized stochastic bilevel optimization (DSBO), in which agents communicate only with their neighbors. We propose Decentralized Stochastic Gradient Descent and Ascent with Gradient Tracking (DSGDA-GT), a novel algorithm that requires only first-order oracles, which are much cheaper than the second-order oracles widely adopted in existing works. We further provide a finite-time convergence analysis showing that for \(n\) agents collaboratively solving the DSBO problem, the sample complexity of finding an \(\epsilon\)-stationary point with our algorithm is \(\mathcal{O}(n^{-1}\epsilon^{-7})\), which matches the currently best-known result for the single-agent counterpart while achieving linear speedup in \(n\). Numerical experiments demonstrate both the communication and training efficiency of our algorithm.
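For concreteness, a standard formulation of the DSBO problem referred to above is sketched here; the symbols \(f_i\), \(g_i\), \(y^*(x)\), and \(\Phi\) are illustrative notation not fixed by the abstract, and the stationarity convention shown is one common choice that may differ from the one adopted in the paper.
\[
\min_{x \in \mathbb{R}^{d}} \; \Phi(x) := \frac{1}{n}\sum_{i=1}^{n} f_i\bigl(x, y^*(x)\bigr)
\quad \text{s.t.} \quad
y^*(x) \in \arg\min_{y \in \mathbb{R}^{p}} \; \frac{1}{n}\sum_{i=1}^{n} g_i(x, y),
\]
where agent \(i\) holds the local stochastic objectives \(f_i\) and \(g_i\) and exchanges information only with its neighbors, and each \(g_i(x,\cdot)\) is typically assumed strongly convex so that \(y^*(x)\) is unique. A point \(\bar{x}\) is then called \(\epsilon\)-stationary when, for example, \(\mathbb{E}\bigl[\|\nabla \Phi(\bar{x})\|^{2}\bigr] \le \epsilon^{2}\).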