We consider the proximal gradient algorithm (PGA) for solving penalized least-squares minimization problems arising in data science. This first-order algorithm is attractive for its flexibility and minimal memory requirements, which allow it to tackle large-scale minimization problems involving non-smooth penalties. However, for problems such as X-ray computed tomography, the applicability of the algorithm is limited by the cost of applying the forward linear operator and its adjoint at each iteration. In practice, the adjoint operator is therefore often replaced by an alternative operator, with the aim of reducing the overall computational burden and potentially alleviating conditioning issues. In this paper, we analyze the effect of such an adjoint mismatch on the convergence of the proximal gradient algorithm in an infinite-dimensional setting, thus generalizing the existing results on PGA. We derive conditions on the step size and on the gradient of the smooth part of the objective function under which convergence of the algorithm to a fixed point is guaranteed. We also derive bounds on the error between this fixed point and the solution to the original minimization problem. We illustrate our theoretical findings with two image reconstruction tasks in computed tomography.
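To make the notion of adjoint mismatch concrete, the following is a minimal finite-dimensional sketch, not the setting or the experiments of the report. It assumes an l1 penalty (so the proximity operator is soft-thresholding), a random matrix `H` as forward operator, and a surrogate `K` standing in for the adjoint `H.T`; the names `H`, `K`, `lam`, `gamma`, and `mismatched_pga` are illustrative choices introduced here.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximity operator of tau * ||.||_1 (assumed penalty for this sketch).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def mismatched_pga(H, K, y, lam, gamma, n_iter=500):
    # Iterates x <- prox_{gamma*lam*||.||_1}(x - gamma * K (H x - y)).
    # With K = H.T this is the standard proximal gradient algorithm;
    # any other K models an adjoint mismatch.
    x = np.zeros(H.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - gamma * (K @ (H @ x - y)), gamma * lam)
    return x

rng = np.random.default_rng(0)
H = rng.standard_normal((40, 20))            # hypothetical forward operator
y = H @ rng.standard_normal(20)              # noiseless synthetic data
K_exact = H.T                                # true adjoint
K_mismatch = H.T + 0.01 * rng.standard_normal((20, 40))  # perturbed surrogate

gamma = 1.0 / np.linalg.norm(H, 2) ** 2      # step size 1 / ||H||^2
x_exact = mismatched_pga(H, K_exact, y, lam=0.1, gamma=gamma)
x_mismatch = mismatched_pga(H, K_mismatch, y, lam=0.1, gamma=gamma)

# Distance between the two limit points, i.e. the error caused by the mismatch.
gap = np.linalg.norm(x_exact - x_mismatch)
print(gap)
```

In this toy example the perturbation of the adjoint is small, so both runs are expected to settle near each other; the quantity `gap` plays the role of the error between the fixed point of the mismatched iteration and the solution of the original problem.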
Citation
Technical report - October 2020