This paper studies a linear control policy for dynamic portfolio selection. We develop the policy within a coherent risk minimization framework that incorporates the time-series behavior of asset returns. By analyzing the dual form of our optimization model, we show that the investment performance of linear control policies is directly connected to the intertemporal covariance of asset returns. To mitigate overfitting to the training data (i.e., historical asset returns), we apply robust optimization and prove that the resulting worst-case coherent risk measure decomposes into the empirical risk measure plus penalty terms. Numerical results show that linear control policies deliver good out-of-sample investment performance when the number of assets is small, and that the penalty terms improve out-of-sample investment performance when the number of assets is large.