structured bandits – Optimization Online

Optimal Learning for Structured Bandits

Published: 2020/02/09, Updated: 2020/07/14

Negin Golrezaei
Bart Paul Gerard van Parys

Convex Optimization, Semi-infinite Programming, Stochastic Programming, duality, online learning, structured bandits

We study structured multi-armed bandits, the problem of online decision-making under uncertainty in the presence of structural information. In this problem, the decision-maker must discover the best course of action despite observing only uncertain rewards over time. The decision-maker is aware of certain structural information regarding the reward distributions and would …