Exact converging bounds for Stochastic Dual Dynamic Programming via Fenchel duality

The Stochastic Dual Dynamic Programming (SDDP) algorithm has become one of the main tools to address convex multistage stochastic optimal control problems. Recently a large amount of work has been devoted to improving the convergence speed of the algorithm through cut-selection and regularization, or to extending the field of applications to non-linear, integer or risk-averse …

An algorithm for solving infinite horizon Markov dynamic programmes

We consider a general class of infinite horizon dynamic programmes where state and control sets are convex and compact subsets of Euclidean spaces and (convex) costs are discounted geometrically. The aim of this work is to provide a convergence result for these problems under as few restrictions as possible. Under certain assumptions on the cost …

Outer Approximation for Integer Nonlinear Programs via Decision Diagrams

As an alternative to traditional integer programming (IP), decision diagrams (DDs) provide a new solution technology for discrete problems based on their combinatorial structure and dynamic programming representation. While the literature mainly focuses on the competitive aspects of DDs as a stand-alone solver, we investigate their complementary role by studying IP techniques that can be …

Network Models for Multiobjective Discrete Optimization

This paper provides a novel framework for solving multiobjective discrete optimization problems with an arbitrary number of objectives. Our framework formulates these problems as network models, in that enumerating the Pareto frontier amounts to solving a multicriteria shortest path problem in an auxiliary network. We design tools and techniques for exploiting the network model in …

A deterministic algorithm for solving stochastic minimax dynamic programmes

In this paper, we present an algorithm for solving stochastic minimax dynamic programmes where state and action sets are convex and compact. A feature of the formulations studied is the simultaneous non-rectangularity of both 'min' and 'max' feasibility sets. We begin by presenting convex programming upper and lower bound representations of saddle functions, extending …

Network-based Approximate Linear Programming for Discrete Optimization

We develop a new class of approximate linear programs (ALPs) that project the high-dimensional value function of dynamic programs onto a class of basis functions, each defined as a network that represents aggregations over the state space. The resulting ALP is a minimum-cost flow problem over an extended variable space that synchronizes flows across multiple …

Approximations to Stochastic Dynamic Programs via Information Relaxation Duality

In the analysis of complex stochastic dynamic programs, we often seek strong theoretical guarantees on the suboptimality of heuristic policies. One technique for obtaining performance bounds is perfect information analysis: this approach provides bounds on the performance of an optimal policy by considering a decision maker who has access to the outcomes of all future …
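
The perfect information idea mentioned in this abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a hypothetical asset-selling problem (one asset, a fixed number of periods, independent and equally likely prices, reward maximization) and compares the value of the best non-anticipative policy with the expected reward of a decision maker who sees every future price. All names and numbers are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy data (not from the paper): equally likely prices per period.
PRICES = (1.0, 2.0, 3.0)
HORIZON = 3


def optimal_value(horizon=HORIZON, prices=PRICES):
    """Expected reward of the best non-anticipative policy (backward induction)."""
    cont = 0.0  # value of still holding the asset after the final period
    for _ in range(horizon):
        # In each period, sell at the observed price or continue, whichever is better.
        cont = sum(max(p, cont) for p in prices) / len(prices)
    return cont


def perfect_information_bound(horizon=HORIZON, prices=PRICES):
    """Expected reward when the whole price path is known in advance."""
    paths = list(product(prices, repeat=horizon))
    # With full foresight the decision maker sells at the highest price on the path.
    return sum(max(path) for path in paths) / len(paths)


if __name__ == "__main__":
    print("optimal non-anticipative value:", optimal_value())
    print("perfect information bound:     ", perfect_information_bound())
```

In this maximization setting the perfect information value can never fall below the optimal value, since foresight only enlarges the set of available policies; the gap between the two numbers indicates how loose the bound is on this toy instance.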

Generalized Dual Dynamic Programming for Infinite Horizon Problems in Continuous State and Action Spaces

We describe a nonlinear generalization of dual dynamic programming theory and its application to value function estimation for deterministic control problems over continuous state and action (or input) spaces, in a discrete-time infinite horizon setting. We prove that the result of a one-stage policy evaluation can be used to produce nonlinear lower bounds on the …

A Bucket Graph Based Labeling Algorithm with Application to Vehicle Routing

We consider the Resource Constrained Shortest Path problem arising as a subproblem in state-of-the-art Branch-Cut-and-Price algorithms for vehicle routing problems. We propose a variant of the bi-directional label correcting algorithm in which the labels are stored and extended according to a so-called bucket graph. Such organization of labels helps to significantly decrease the number of dominance …

Primal-Dual π Learning: Sample Complexity and Sublinear Run Time for Ergodic Markov Decision Problems

Consider the problem of approximating the optimal policy of a Markov decision process (MDP) by sampling state transitions. In contrast to existing reinforcement learning methods that are based on successive approximations to the nonlinear Bellman equation, we propose a Primal-Dual π Learning method in light of the linear duality between the value and policy. The …