This paper focuses on algorithms for multi-stage stochastic linear programming (MSLP). We propose an ensemble method, the ``compromise policy'', which reduces both the variance of the function approximation and the bias of the estimated optimal value, and provides a tight lower-bound estimate with a confidence interval. By exploiting parallel computing, the compromise policy delivers demonstrable gains in performance and stability at marginal additional computational cost. We further propose a meta-algorithm for solving MSLP problems based on in-sample and out-of-sample optimality tests. Incorporated within an SDDP-type algorithm, this meta-algorithm significantly improves the reliability of the decisions suggested by SDDP. Extensive computational experiments demonstrate the effectiveness of our approach.
Compromise Policy for Multi-stage Stochastic Linear Programming: Variance and Bias Reduction
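The statistical idea behind the ensemble can be illustrated with a minimal sketch. This is not the paper's actual algorithm: it assumes hypothetical lower-bound estimates from independent parallel replications (stand-ins for real SDDP-type runs) and shows how averaging them reduces variance and yields a confidence interval on the lower bound.

```python
import random
import statistics
import math

# Hypothetical setup: each of K parallel replications yields a noisy
# lower-bound estimate of the optimal value. The ensemble ("compromise")
# estimate averages them, shrinking the variance roughly by a factor of K,
# and a t-based interval quantifies the remaining uncertainty.
random.seed(0)
K = 30
true_lower_bound = 100.0  # assumed ground truth for this illustration only
replications = [true_lower_bound + random.gauss(0, 5) for _ in range(K)]

compromise = statistics.mean(replications)            # ensemble estimate
stderr = statistics.stdev(replications) / math.sqrt(K)
t95 = 2.045                                           # t-quantile, df = 29, 95% two-sided
ci = (compromise - t95 * stderr, compromise + t95 * stderr)

print(f"compromise estimate: {compromise:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Any single replication has standard deviation 5 here, while the averaged estimate's standard error is about 5/sqrt(30), which is the variance-reduction effect the abstract refers to; the paper's actual construction of the compromise policy and its bias-reduction guarantees are, of course, more involved.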