We consider the task of classification in the high-dimensional setting, where the number of features of the given data significantly exceeds the number of observations. To accomplish this task, we propose sparse zero-variance discriminant analysis (SZVD) as a method for simultaneously performing linear discriminant analysis and feature selection on high-dimensional data. This method combines classical zero-variance discriminant analysis, in which discriminant vectors are identified in the null space of the sample within-class covariance matrix, with penalization of the discriminant vectors to induce sparse solutions. We propose a simple algorithm based on the alternating direction method of multipliers (ADMM) for approximately solving the resulting nonconvex optimization problem. Further, we show that this algorithm applies to a larger class of penalized generalized eigenvalue problems, including a particular relaxation of the sparse principal component analysis problem. We also provide theoretical guarantees for convergence of our algorithm to stationary points of the original nonconvex problem, along with numerical experiments evaluating the performance of our classification heuristic on simulated data and on data drawn from time-series classification applications.
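For concreteness, the following schematic formulation (with notation introduced here for illustration, and an $\ell_1$ penalty as a representative choice of sparsity-inducing penalization) captures the kind of problem SZVD addresses. Letting $B$ and $W$ denote the sample between-class and within-class covariance matrices and $\gamma > 0$ a penalty parameter, one seeks a discriminant vector solving
\[
\max_{w \in \mathbb{R}^{p}} \; w^{\top} B\, w \;-\; \gamma \, \|w\|_{1}
\quad \text{subject to} \quad W w = 0, \;\; \|w\|_{2} = 1,
\]
where the null-space constraint $W w = 0$ encodes the zero-variance condition and the penalty term promotes feature selection. The spherical constraint $\|w\|_{2} = 1$ renders the problem nonconvex, which motivates the ADMM-based heuristic described above.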