Lower Bound On the Computational Complexity of Discounted Markov Decision Problems

We study the computational complexity of the infinite-horizon discounted-reward Markov Decision Problem (MDP) with a finite state space $\cS$ and a finite action space $\cA$. We show that any randomized algorithm needs a running time at least $\Omega(\carS^2\carA)$ to compute an $\epsilon$-optimal policy with high probability. We consider two variants of the MDP where the … Read more

Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning

We study the online estimation of the optimal policy of a Markov decision process (MDP). We propose a class of Stochastic Primal-Dual (SPD) methods which exploit the inherent minimax duality of Bellman equations. The SPD methods update a few coordinates of the value and policy estimates as a new state transition is observed. These methods … Read more

Worst-Case Hardness of Approximation for Sparse Optimization with L0 Norm

In this paper, we consider sparse optimization problems with L0 norm penalty or constraint. We prove that it is strongly NP-hard to find an approximate optimal solution within certain error bound, unless P = NP. This provides a lower bound for the approximation error of any deterministic polynomial-time algorithm. Applying the complexity result to sparse … Read more

Random Multi-Constraint Projection: Stochastic Gradient Methods for Convex Optimization with Many Constraints

Consider convex optimization problems subject to a large number of constraints. We focus on stochastic problems in which the objective takes the form of expected values and the feasible set is the intersection of a large number of convex sets. We propose a class of algorithms that perform both stochastic gradient descent and random feasibility … Read more