Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits

\(\) Follow-The-Regularized-Leader (FTRL) algorithms often enjoy optimal regret for adversarial as well as stochastic bandit problems and allow for a streamlined analysis. However, FTRL algorithms require the solution of an optimization problem in every iteration and are thus computationally challenging. In contrast, Follow-The-Perturbed-Leader (FTPL) algorithms achieve computational efficiency by perturbing the estimates of the rewards … Read more

Distributionally Robust Optimization with Markovian Data

We study a stochastic program where the probability distribution of the uncertain problem parameters is unknown and only indirectly observed via finitely many correlated samples generated by an unknown Markov chain with d states. We propose a data-driven distributionally robust optimization model to estimate the problem’s objective function and optimal solution. By leveraging results from … Read more