In this paper, we consider sparse optimization problems with an L0-norm penalty or constraint. We prove that it is strongly NP-hard to find an approximate optimal solution within a certain error bound; consequently, unless P = NP, this yields a lower bound on the approximation error achievable by any deterministic polynomial-time algorithm. Applying the complexity result to sparse linear regression reveals a gap between computational accuracy and statistical accuracy: it is intractable to approximate the estimator to within constant factors of its statistical error. We also show that distinguishing between the best k-sparse solution and the best (k + 1)-sparse solution is computationally hard, which suggests that tuning the sparsity level is itself a hard problem.
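For concreteness, the two canonical formulations referred to above can be sketched as follows, assuming a least-squares loss with design matrix A, response b, penalty parameter lambda, and sparsity level k (the paper's setting may allow a more general loss or penalty):
\[
\min_{x \in \mathbb{R}^n} \ \|Ax - b\|_2^2 + \lambda \|x\|_0
\quad \text{(penalized form)},
\qquad
\min_{x \in \mathbb{R}^n,\ \|x\|_0 \le k} \ \|Ax - b\|_2^2
\quad \text{(constrained form)},
\]
where \(\|x\|_0\) denotes the number of nonzero entries of \(x\).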