With the recent boom of the gig economy, urban delivery systems have experienced substantial demand growth. In such systems, orders are delivered to customers from local distribution points while respecting a delivery time promise. An important example is a restaurant meal delivery system, where orders are expected to arrive within minutes of being placed. The system serves orders using couriers that continuously perform pickups and deliveries. Operating such a rapid delivery system is very challenging, primarily due to the high service expectations and the considerable uncertainty in both demand and delivery capacity. Delivery providers typically plan courier shifts for an operating period based on a demand forecast. However, because of high demand volatility, it may become necessary during the operating period to adjust staffing by dynamically adding couriers. We study the problem of dynamically adding courier capacity in a rapid delivery system and propose a deep reinforcement learning approach to obtain a policy that balances the cost of adding couriers against the cost of service quality degradation caused by insufficient delivery capacity. Specifically, we seek to ensure that a high fraction of orders is delivered on time with a small number of courier hours. A computational study in the meal delivery space shows that a learned policy outperforms policies representing current practice and demonstrates the potential of deep learning for solving operational problems in highly stochastic logistics settings.
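To make the decision problem concrete, the courier-acquisition setting can be framed as a Markov decision process: at each decision epoch the controller observes system state (e.g., current demand and active couriers), chooses how many on-demand couriers to add, and incurs a cost combining courier pay and service degradation. The following is a minimal tabular Q-learning sketch of that trade-off on a toy simulator; all state variables, cost parameters, and demand dynamics here are illustrative assumptions, not the paper's model (which uses deep Q-learning on a richer state).

```python
import random

# Illustrative cost parameters (assumptions, not from the paper).
COURIER_COST = 1.0   # assumed cost per added courier
LATE_PENALTY = 5.0   # assumed penalty per order lacking delivery capacity
MAX_COURIERS = 5
ACTIONS = range(3)   # add 0, 1, or 2 couriers at a decision epoch

def step(state, action):
    """Toy transition: state = (demand_level, active_couriers)."""
    demand, couriers = state
    couriers = min(couriers + action, MAX_COURIERS)
    late = max(0, demand - couriers)  # orders exceeding capacity
    reward = -(COURIER_COST * action + LATE_PENALTY * late)
    next_demand = random.choice([1, 2, 3])  # stylized demand volatility
    return (next_demand, couriers), reward

def train(episodes=2000, epochs=8, alpha=0.2, gamma=0.9, eps=0.1, seed=0):
    """Tabular epsilon-greedy Q-learning over the toy MDP."""
    random.seed(seed)
    Q = {}
    for _ in range(episodes):
        state = (random.choice([1, 2, 3]), 0)
        for _ in range(epochs):
            if random.random() < eps:
                a = random.choice(list(ACTIONS))
            else:
                a = max(ACTIONS, key=lambda x: Q.get((state, x), 0.0))
            nxt, r = step(state, a)
            best_next = max(Q.get((nxt, x), 0.0) for x in ACTIONS)
            Q[(state, a)] = (1 - alpha) * Q.get((state, a), 0.0) \
                + alpha * (r + gamma * best_next)
            state = nxt
    return Q
```

Under these assumed costs, the learned greedy policy adds couriers when demand outstrips capacity (the lateness penalty dominates the courier cost) and adds none when capacity is sufficient, which is exactly the balance the abstract describes.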
Dynamic courier capacity acquisition in rapid delivery systems: a deep Q-learning approach