reinforcement learning – Optimization Online

From Optimization to Control: Quasi Policy Iteration

Published: 2023/11/27

Recent control algorithms for Markov decision processes (MDPs) have been designed using an implicit analogy with well-established optimization algorithms. In this paper, we make this analogy explicit across four problem classes with a unified solution characterization. This novel framework, in turn, allows for a systematic transformation of algorithms from one domain to the other. In …

Dynamic courier capacity acquisition in rapid delivery systems: a deep Q-learning approach

Published: 2022/01/25

Ramon Auad

Alan L. Erera

Martin Savelsbergh

Categories: Applications - OR and Management Sciences, Production and Logistics, Transportation
Keywords: capacity management, deep Q-learning, last-mile delivery, logistics, rapid delivery, reinforcement learning

With the recent boom of the gig economy, urban delivery systems have experienced substantial demand growth. In such systems, orders are delivered to customers from local distribution points respecting a delivery time promise. An important example is a restaurant meal delivery system, where delivery times are expected to be minutes after an order is placed. …

Batch Learning in Stochastic Dual Dynamic Programming

Published: 2021/05/17

Daniel Ávila

Nils Löhndorf

Anthony Papavasiliou

Categories: Dynamic Programming, Stochastic Programming
Keywords: dynamic programming, parallel computing, reinforcement learning, SDDP, stochastic programming

We consider the stochastic dual dynamic programming (SDDP) algorithm, which is a widely employed algorithm applied to multistage stochastic programming, and propose a variant using batch learning, a technique used with success in the reinforcement learning framework. We cast SDDP as a type of Q-learning algorithm and describe its application in both risk neutral and …

An Adaptive and Near Parameter-free BRKGA Using Q-Learning Method

Published: 2021/02/22, Updated: 2021/04/21

Antonio Chaves

Luiz Henrique Lorena

Categories: Meta Heuristics
Keywords: genetic algorithm, parameter control, Q-learning, reinforcement learning

The Biased Random-Key Genetic Algorithm (BRKGA) is an efficient metaheuristic to solve combinatorial optimization problems but requires parameter tuning so the intensification and diversification of the algorithm work in a balanced way. There is, however, not only one optimal parameter configuration, and the best configuration may differ according to the stages of the evolutionary process. …

SDP-based bounds for the Quadratic Cycle Cover Problem via cutting plane augmented Lagrangian methods and reinforcement learning

Published: 2020/09/08, Updated: 2021/02/18

Frank de Meijer

Renata Sotirov

Categories: Combinatorial Optimization, Semi-definite Programming
Keywords: cutting planes, Dykstra's projection algorithm, facial reduction, quadratic cycle cover problem, reinforcement learning, semidefinite programming

We study the Quadratic Cycle Cover Problem (QCCP), which aims to find a node-disjoint cycle cover in a directed graph with minimum interaction cost between successive arcs. We derive several semidefinite programming (SDP) relaxations and use facial reduction to make these strictly feasible. We investigate a nontrivial relationship between the transformation matrix used in the …

Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning

Published: 2016/12/07

Yichen Chen

Mengdi Wang

Categories: Dynamic Programming
Keywords: reinforcement learning, stochastic primal-dual methods

We study the online estimation of the optimal policy of a Markov decision process (MDP). We propose a class of Stochastic Primal-Dual (SPD) methods which exploit the inherent minimax duality of Bellman equations. The SPD methods update a few coordinates of the value and policy estimates as a new state transition is observed. These methods …