We consider the linearly constrained convex minimization model with a separable objective function which is the sum of m functions without coupled variables, and discuss how to design an efficient algorithm based on the fundamental technique of splitting the augmented Lagrangian method (ALM). Our focus is the specific big-data scenario where m is huge. A pretreatment on the original data is to regroup the m functions in the objective and the corresponding m variables as t subgroups, where $t$ is a handleable number (usually t>=3 but much smaller than m). To tackle the regrouped model with t blocks of functions and variables, some existing splitting methods in the literature are applicable. We concentrate on the application of the alternating direction method of multiplier with Gaussian back substitution (ADMM-GBS) whose efficiency and scalability have been well verified in the literature. The block-wise ADMM-GBS is thus resulted and named when the ADMM-GBS is applied to solve the t-block regrouped model. To alleviate the difficulty of the resulting ADMM-GBS subproblems, each of which may still require minimizing more than one function with coupled variables, we suggest further decomposing these subproblems but proximally regularizing these further decomposed subproblems to ensure the convergence. With this further decomposition, each of the resulting subproblems only requires handling one function in the original objective plus a simple quadratic term; it thus may be very easy for many concrete applications where the functions in the objective have some specific properties. Moreover, these further decomposed subproblems can be solved in parallel, making it possible to handle big-data by highly capable computing infrastructures. Consequently, a splitting version of the block-wise ADMM-GBS, is proposed for the particular big-data scenario. The implementation of this new algorithm is suitable for a centralized-distributed computing system, where the decomposed subproblems of each block can be computed in parallel by a distributed-computing infrastructure and the blocks are updated by a centralized-computing station. For the new algorithm, we prove its convergence and establish its worst-case convergence rate measured by the iteration complexity. Two refined versions of this new algorithm with iteratively calculated step sizes and linearized subproblems are also proposed, respectively.