Stochastic programming models are large-scale optimization problems that are used to facilitate decision-making under uncertainty. Optimization algorithms for such problems need to evaluate the expected future costs of current decisions, often referred to as the recourse function. In practice, this calculation is computationally difficult as it requires the evaluation of a multidimensional integral whose integrand is an optimization problem. In turn, the recourse function has to be estimated using scenario trees or Monte Carlo methods. Unfortunately, scenario trees do not scale well for problems with many dimensions of uncertainty and many stages, and Monte Carlo methods require very large numbers of samples. We introduce a novel importance sampling framework for multistage stochastic programming that can produce accurate estimates of the recourse function using a fixed number of samples. Previous approaches for importance sampling in stochastic programming were limited to problems where the uncertainty was modeled using discrete random variables, and the recourse function was additively separable in the uncertain dimensions. Our framework avoids these restrictions by using Markov Chain Monte Carlo and Kernel Density Estimation algorithms to create a non-parametric importance sampling distribution that can form lower variance estimates of the recourse function. We demonstrate the increased accuracy and efficiency of our approach in the context of multistage stochastic programming using variants of the Newsvendor problem. Our numerical results show that our framework produces more accurate estimates of the optimal value and solution of stochastic programming models, especially for problems with moderate to high variance, multimodal or rare-event distributions.