distributed stochastic gradient descent

Statistical Inference for Distributed Contextual Multi-armed Bandit

Published: 2025/06/06

Wuwenqing Yan
Yongchao Liu

Stochastic Programming

$\varepsilon$-greedy, confidence intervals, distributed contextual multi-armed bandit, distributed stochastic gradient descent, random scaling

In this paper, we study online statistical inference for distributed contextual multi-armed bandit problems, where the agents collaboratively learn an optimal policy by exchanging their local estimates of the global parameters with neighbors over a communication network. We propose a distributed online decision-making algorithm, which balances exploration and exploitation via the …
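
As a rough illustration of the setting the abstract describes, the sketch below combines $\varepsilon$-greedy arm selection with a local stochastic gradient step followed by neighbor averaging. It is not the authors' algorithm: the linear reward model, the ring network, the decay schedules for the exploration rate and step size, and the names `ring_mixing_matrix` and `run` are all assumptions made for this example, and the inference step (confidence intervals via random scaling) is not covered.

```python
import numpy as np


def ring_mixing_matrix(n_agents):
    """Doubly stochastic weights for a ring (assumed topology):
    each agent averages itself and its two neighbors."""
    W = np.zeros((n_agents, n_agents))
    for i in range(n_agents):
        for j in (i - 1, i, i + 1):
            W[i, j % n_agents] += 1.0 / 3.0
    return W


def run(n_agents=5, n_arms=2, dim=3, n_rounds=3000, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed model: reward of arm k under context x is theta_true[k] @ x + noise.
    theta_true = rng.normal(size=(n_arms, dim))
    # theta[i, k] is agent i's local estimate of arm k's parameter.
    theta = np.zeros((n_agents, n_arms, dim))
    W = ring_mixing_matrix(n_agents)

    for t in range(1, n_rounds + 1):
        eps = min(1.0, t ** -0.3)   # hypothetical exploration schedule
        step = 0.5 / t ** 0.6       # hypothetical step-size schedule
        for i in range(n_agents):
            x = rng.normal(size=dim)  # context observed by agent i
            if rng.random() < eps:
                arm = int(rng.integers(n_arms))        # explore
            else:
                arm = int(np.argmax(theta[i] @ x))     # exploit local estimate
            reward = theta_true[arm] @ x + 0.1 * rng.normal()
            # Stochastic gradient of the squared prediction error for the pulled arm.
            grad = (theta[i, arm] @ x - reward) * x
            theta[i, arm] -= step * grad
        # Communication step: each agent averages estimates with its neighbors.
        theta = np.einsum("ij,jkd->ikd", W, theta)
    return theta, theta_true


if __name__ == "__main__":
    theta, theta_true = run()
    print("max abs error across agents:", np.abs(theta - theta_true[None]).max())
```

Averaging with a doubly stochastic matrix keeps the network-wide mean of the estimates unchanged by the communication step, which is one common reason to choose such weights in consensus-style schemes.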