The cooperative energy management of aggregated buildings has recently received a great deal of interest due to its substantial potential for energy savings. These gains are mainly obtained in two ways: (i) exploiting the load-shifting capabilities of the cooperative buildings; (ii) utilizing the expensive but energy-efficient equipment that is commonly shared by the building community (e.g., heat pumps, batteries and photovoltaics). Several deterministic and stochastic control schemes that strive to realize these savings have been proposed in the literature. A common difficulty with all these methods is integrating knowledge about the disturbances affecting the system. In this context, the underlying disturbance distributions are often poorly characterized when estimated from historical data. In this paper, we address this issue by exploiting the historical data to construct families of distributions that contain the underlying distributions with high confidence. We then employ tools from data-driven robust optimization to formulate a multistage stochastic optimization problem that can be approximated by a finite-dimensional linear program. The proposed method is suitable for tackling large-scale systems, since its complexity grows polynomially with respect to the system variables. We demonstrate its efficacy in a numerical study, in which it outperforms established solution techniques from the literature in terms of energy cost savings and constraint violations. We conclude the study by showing the significant energy gains obtained by cooperatively managing a collection of buildings with heterogeneous characteristics.