Randomized Linear Programming Solves the Discounted Markov Decision Problem In Nearly-Linear (Sometimes Sublinear) Running Time

We propose a randomized linear programming algorithm for approximating the optimal policy of the discounted Markov decision problem. By leveraging the value-policy duality, the algorithm adaptively samples state transitions and makes exponentiated primal-dual updates. We show that it finds an ε-optimal policy using nearly-linear running time in the worst case. For Markov decision processes that … Read more

From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming

We consider linear programming (LP) problems in infinite dimensional spaces that are in general computationally intractable. Under suitable assumptions, we develop an approximation bridge from the infinite-dimensional LP to tractable finite convex programs in which the performance of the approximation is quantified explicitly. To this end, we adopt the recent developments in two areas of … Read more

Dynamic programming algorithms, efficient solution of the LP-relaxation and approximation schemes for the Penalized Knapsack Problem

We consider the 0-1 Penalized Knapsack Problem (PKP). Each item has a profit, a weight and a penalty and the goal is to maximize the sum of the profits minus the greatest penalty value of the items included in a solution. We propose an exact approach relying on a procedure which narrows the relevant range … Read more

Regularized Stochastic Dual Dynamic Programming for convex nonlinear optimization problems

We define a regularized variant of the Dual Dynamic Programming algorithm called REDDP (REgularized Dual Dynamic Programming) to solve nonlinear dynamic programming equations. We extend the algorithm to solve nonlinear stochastic dynamic programming equations. The corresponding algorithm, called SDDP-REG, can be seen as an extension of a regularization of the Stochastic Dual Dynamic Programming (SDDP) … Read more

Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning

We study the online estimation of the optimal policy of a Markov decision process (MDP). We propose a class of Stochastic Primal-Dual (SPD) methods which exploit the inherent minimax duality of Bellman equations. The SPD methods update a few coordinates of the value and policy estimates as a new state transition is observed. These methods … Read more

Fully Polynomial Time (Sigma,Pi)-Approximation Schemes for Continuous Nonlinear Newsvendor and Continuous Stochastic Dynamic Programs

We study the continuous newsvendor problem (i.e. a newsvendor problem concerning goods of a non-discrete nature, such as fresh fruit juice) and a class of stochastic dynamic programs with several application areas, such as inventory control of a continuous good, economics, and supply chain management. The class is characterized by continuous state and action spaces, … Read more

Relaxation Analysis for the Dynamic Knapsack Problem with Stochastic Item Sizes

We consider a version of the knapsack problem in which an item size is random and revealed only when the decision maker attempts to insert it. After every successful insertion the decision maker can dynamically choose the next item based on the remaining capacity and available items, while an unsuccessful insertion terminates the process. We … Read more

A Copositive Approach for Two-Stage Adjustable Robust Optimization with Uncertain Right-Hand Sides

We study two-stage adjustable robust linear programming in which the right-hand sides are uncertain and belong to a convex, compact uncertainty set. This problem is NP-hard, and the affine policy is a popular, tractable approximation. We prove that under standard and simple conditions, the two-stage problem can be reformulated as a copositive optimization problem, which … Read more

An Effective Dynamic Programming Algorithm for the Minimum-Cost Maximal Knapsack Packing

Given a set of n items with profits and weights and a knapsack capacity C, we study the problem of finding a maximal knapsack packing that minimizes the profit of selected items. We propose for the first time an effective dynamic programming (DP) algorithm which has O(nC) time complexity and O(n+C) space complexity. We demonstrate … Read more

Stochastic Dual Dynamic Integer Programming

Multistage stochastic integer programming (MSIP) combines the difficulty of uncertainty, dynamics, and non-convexity, and constitutes a class of extremely challenging problems. A common formulation for these problems is a dynamic programming formulation involving nested cost-to-go functions. In the linear setting, the cost-to-go functions are convex polyhedral, and decomposition algorithms, such as nested Benders’ decomposition and … Read more