Choice models are prevalent in several application areas. We study their non-parametric estimation, which was introduced to relax unreasonable assumptions in traditional parametric models. The existing literature focuses only on the static observational setting, where all observations are given upfront, and lacks algorithms that provide explicit convergence rate guarantees or an a priori analysis of the trade-off between model accuracy and sparsity for the estimated model actually returned. In contrast, we focus on estimating a non-parametric choice model from observational data in a dynamic setting, where observations are obtained over time. We show that choice model estimation can be cast as a convex-concave saddle-point joint estimation and optimization problem, and we provide a primal-dual framework, based on online convex optimization, for deriving algorithms to solve this problem. By carefully tailoring our framework to the choice model estimation problem, we obtain tractable algorithms with provable convergence guarantees and explicit bounds on the sparsity of the estimated model. Our numerical experiments confirm the effectiveness of the algorithms derived from our framework.