Finite-Sample Optimality and Constraint Satisfaction: Learning-Based Optimal Control in Dynamic Dispatch Networks
Dynamic dispatch networks in logistics and transportation require real-time, constraint-aware decision-making under stochastic demand. This paper bridges mathematical optimization, optimal control theory, and reinforcement learning by establishing non-asymptotic theoretical guarantees for learning-based optimal control in constrained stochastic dispatch systems. We formulate the problem as a constrained Markov decision process, enforce feasibility via a projection-based policy … Read more