When binary linear error-correcting codes are used over symmetric channels, a relaxed version of the maximum likelihood decoding problem can be stated as a linear program (LP). This LP decoder can be used to decode at bit-error-rates comparable to state-of-the-art belief propagation (BP) decoders, but with significantly stronger theoretical guarantees. However, LP decoding when implemented with standard LP solvers does not easily scale to the block lengths of modern error correcting codes. In this paper we draw on decomposition methods from optimization theory, specifically the Alternating Directions Method of Multipliers (ADMM), to develop efficient distributed algorithms for LP decoding. The key enabling technical result is a nearly linear time algorithm for two-norm projection onto the parity polytope. This allows us to use LP decoding, with all its theoretical guarantees, to decode large-scale error correcting codes efficiently. We present numerical results for two LDPC codes. The first is the rate-0.5 [2640,1320] ``Margulis'' code, the second a rate-0.77 [1057.244] code. The ``waterfall'' region of LP decoding is seen to initiate at a slightly higher signal-to-noise ratio than for sum-product BP, however an error-floor is not observed for either code, which is not the case for BP. Our implementation of LP decoding using ADMM executes as quickly as our baseline sum-product BP decoder, is fully parallelizable, and can be seen to implement a type of message-passing with a particularly simple schedule.