We integrate machine learning with distributionally robust optimization to solve a two-period joint pricing and production problem for multiple items. First, we generalize the additive demand model to capture cross-product and cross-period effects as well as demand dependence across periods. Next, we apply K-means clustering to the demand residual mapping based on historical data and then construct a K-means ambiguity set for that residual that specifies only its mean, support, and mean absolute deviation. Finally, we investigate the joint pricing and production problem by proposing a K-means adaptive markdown policy and an affine recourse approximation; the latter allows us to reformulate the problem as an approximate but more tractable mixed-integer linear program. Both our case study and our simulation show that, with only a few clusters and in most out-of-sample tests, the K-means adaptive markdown policy and ambiguity set can increase expected profit over the empirical model by 1.12% on average and by as much as 2.22%.
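To make the ambiguity-set construction concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes scalar demand residuals (observed demand minus the fitted additive demand model) are already available, clusters them with a plain Lloyd-style K-means, and records for each cluster the three quantities named above: mean, support, and mean absolute deviation. The function names `kmeans_1d` and `cluster_statistics`, the deterministic initialisation, the added cluster weight, and the sample data are all illustrative assumptions.

```python
from statistics import fmean


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalar residuals; returns one cluster label per value."""
    xs = sorted(values)
    # Assumed deterministic initialisation: spread starting centres over the sorted sample.
    centres = [xs[(2 * j + 1) * len(xs) // (2 * k)] for j in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centre (ties go to the lower cluster index).
        new_labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = fmean(members)
    return labels


def cluster_statistics(residuals, labels, k):
    """Per-cluster weight, mean, support, and mean absolute deviation (MAD)."""
    n = len(residuals)
    stats = []
    for j in range(k):
        members = [r for r, lab in zip(residuals, labels) if lab == j]
        if not members:
            continue
        mu = fmean(members)
        stats.append({
            "weight": len(members) / n,             # empirical cluster probability
            "mean": mu,
            "support": (min(members), max(members)),
            "mad": fmean([abs(r - mu) for r in members]),
        })
    return stats


# Hypothetical residuals, for illustration only.
residuals = [-2.0, -1.0, 1.0, 2.0, -1.5, 1.5]
labels = kmeans_1d(residuals, k=2)
stats = cluster_statistics(residuals, labels, k=2)
```

In a distributionally robust model, these per-cluster statistics would serve as the only distributional information imposed on the residual; the scalar setting is chosen here purely to keep the sketch short.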