Information Relaxations and Duality in Stochastic Dynamic Programs

We describe a dual approach to stochastic dynamic programming: we relax the constraint that the chosen policy must be temporally feasible and impose a penalty that punishes violations of temporal feasibility. We describe the theory underlying this dual approach and demonstrate its use in dynamic programming models related to inventory control, option pricing, and oil … Read more

Simultaneous Solution of Lagrangean Dual Problems Interleaved with Preprocessing for the Weight Constrained Shortest Path Problem

Conventional Lagrangean preprocessing for the network Weight Constrained Shortest Path Problem (WCSPP calculates lower bounds on the cost of using each node and edge in a feasible path using a single optimal Lagrange multiplier for the relaxation of the WCSPP. These lower bounds are used in conjunction with an upper bound to eliminate nodes and … Read more

A Q-Learning Algorithm with Continuous State Space

We study in this paper a Markov Decision Problem (MDP) with continuous state space and discrete decision variables. We propose an extension of the Q-learning algorithm introduced to solve this problem by Watkins in 1989 for completely discrete MDPs. Our algorithm relies on stochastic approximation and functional estimation, and uses kernels to locally update the … Read more

Nonserial dynamic programming and local decomposition algorithms in discrete programming

One of perspective ways to exploit sparsity in the dependency graph of an optimization problem as J.N. Hooker stressed is nonserial dynamic programming (NSDP) which allows to compute solution in stages, each of them uses results from previous stages. The class of discrete optimization problems with the block-tree-structure matrix of constraints is considered. Nonserial dynamic … Read more

New solution approaches to the general single machine earliness-tardiness problem

This paper addresses the general single-machine earliness-tardiness problem with distinct release dates, due dates, and unit costs. The aim of this research is to obtain an exact nonpreemptive solution in which machine idle time is allowed. In a hybrid approach, we formulate and then solve the problem using dynamic programming (DP) while incorporating techniques from … Read more

A pricing problem under Monge property

We study a pricing problem where buyers with non-uniform demand purchase one of many items. Each buyer has a known benefit for each item and purchases the item that gives the largest utility, which is defined to be the difference between the benefit and the price of the item. The optimization problem is to decide … Read more

Temporal difference learning with kernels for pricing american-style options

We propose in this paper to study the problem of estimating the cost-to-go function for an infinite-horizon discounted Markov chain with possibly continuous state space. For implementation purposes, the state space is typically discretized. As soon as the dimension of the state space becomes large, the computation is no more practicable, a phenomenon referred to … Read more

Optimal Information Monitoring Under a Politeness Constraint

We describe scheduling algorithms for monitoring an information source whose contents change at times modeled by a nonhomogeneous Poisson process. In a given time period of length T, we enforce a politeness constraint that we may only probe the source at most n times. This constraint, along with an optional constraint that no two probes … Read more

New hybrid optimization algorithms for machine scheduling problems

Dynamic programming, branch-and-bound, and constraint programming are the standard solution principles for finding optimal solutions to machine scheduling problems. We propose a new hybrid optimization framework that integrates all three methodologies. The hybrid framework leads to powerful solution procedures. We demonstrate our approach through the optimal solution of the single-machine total weighted completion time scheduling … Read more

Aggregation in Stochastic Dynamic Programming

We present a general aggregation method applicable to all finite-horizon Markov decision problems. States of the MDP are aggregated into macro-states based on a pre-selected collection of “distinguished” states which serve as entry points into macro-states. The resulting macro-problem is also an MDP, whose solution approximates an optimal solution to the original problem. The aggregation … Read more