Convergence analysis on a modified generalized alternating direction method of multipliers

The alternating direction method of multipliers (ADMM) is one of the most powerful and successful methods for solving convex composite minimization problems. The generalized ADMM relaxes both the variables and the multipliers with a common relaxation factor in (0, 2), which has the potential of enhancing the performance of the classic ADMM. Very recently, two different variants of semi-proximal generalized ADMM have been proposed. They allow the weighting matrix in the proximal terms to be positive semidefinite, which makes the subproblems relatively easy to solve. One of these variants has been analyzed theoretically, but the convergence of the other has not been established so far. This paper aims to remedy this deficiency and establish its convergence under some mild conditions, with the relaxation factor likewise restricted to (0, 2).


Introduction
Let X, Y, and Z be real finite dimensional Euclidean spaces with the inner product ⟨·, ·⟩ and its induced norm ‖·‖. In this paper, we consider the following convex composite problem with a coupled linear equality constraint: min { f(x) + g(y) : Ax + By = c, x ∈ X, y ∈ Y }, (1) where f : X → (-∞, +∞] and g : Y → (-∞, +∞] are closed proper convex functions, A : X → Z and B : Y → Z are linear operators, and c ∈ Z is given. Many applications, such as image processing, compressed sensing, and statistical learning, have mathematical models of the form (1). Denote by A* and B* the adjoints of A and B, respectively. The dual of problem (1), referred to as (2), is expressed through f*(·) and g*(·), the Fenchel conjugates of f and g, respectively. Under Slater's constraint qualification, it is known that (x̄, ȳ) is a solution to problem (1) if and only if there exists a Lagrangian multiplier λ̄ such that the triple (x̄, ȳ; λ̄) solves the Karush-Kuhn-Tucker (KKT) system (3). The augmented Lagrangian function L_σ(x, y; λ) associated with (1) involves a multiplier λ ∈ Z and a penalty parameter σ > 0. Given (x^k, y^k), the classic augmented Lagrangian algorithm derives the next pair (x^{k+1}, y^{k+1}) via (x^{k+1}, y^{k+1}) := arg min_{x∈X, y∈Y} L_σ(x, y; λ^k), (4a) followed by an update of the multiplier (4b). Solving subproblem (4a) requires minimizing a function with a strongly coupled quadratic term, which is hard, especially for large-scale problems. By exploiting the separable structure of f and g in problem (1), one effective approach is the alternating direction method of multipliers (abbreviated as ADMM), which for k = 0, 1, ... minimizes L_σ with respect to x and y in turn and then updates the multiplier with a step length τ that can be chosen in (0, (1 + √5)/2). The advantage of the alternating technique lies in decomposing a large problem into several smaller pieces via its favorable structure and then solving them accordingly.
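For orientation, the augmented Lagrangian and the classic ADMM steps described above can be written in their standard textbook form. This is a sketch under an assumption: the multiplier is taken to enter with a plus sign, as +⟨λ, Ax + By − c⟩, which agrees with quantities such as λ_e^k + σ(1 − ρ)Ax_e^k used later in the analysis. The exact displays and numbering of the original may differ.

```latex
\begin{align*}
\mathcal{L}_\sigma(x,y;\lambda) &= f(x) + g(y) + \langle \lambda,\, Ax + By - c\rangle
   + \frac{\sigma}{2}\,\|Ax + By - c\|^2,\\
x^{k+1} &\in \arg\min_{x\in\mathcal{X}} \ \mathcal{L}_\sigma(x, y^k; \lambda^k),\\
y^{k+1} &\in \arg\min_{y\in\mathcal{Y}} \ \mathcal{L}_\sigma(x^{k+1}, y; \lambda^k),\\
\lambda^{k+1} &= \lambda^k + \tau\sigma\,(Ax^{k+1} + By^{k+1} - c),
   \qquad \tau \in \bigl(0, (1+\sqrt{5})/2\bigr).
\end{align*}
```

Under this convention, each of the first two steps involves only one of f and g, which is the decomposition that the joint minimization (4a) lacks.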
The classic ADMM algorithm originated with Glowinski and Marroco [1] and Gabay and Mercier [2] in the mid-1970s. In the early 1980s, Gabay [3] showed that the classic ADMM with τ = 1 is a special case of the Douglas-Rachford splitting method for monotone operators. Later, in [4], Eckstein and Bertsekas showed that the Douglas-Rachford splitting method is actually a special case of the proximal point algorithm. A proximal variant of ADMM was proposed by Eckstein [5], which ensures that each subproblem has a unique solution by introducing an additional proximal term. This technique improves the behavior of the objective functions in the iteration subproblems and thus improves the convergence properties of the whole algorithm. He et al. [6] in turn showed that the proximal term can be chosen differently at each iteration. Furthermore, Fazel et al. [7] carried out a deeper investigation and proved that the proximal term can be chosen to be positive semidefinite, which allows more flexible applications. One may refer to [8] for a note on the historical development of the ADMM, and to [9,10] for some further research on ADMM.
Another contribution of Eckstein and Bertsekas [4] is the design of a generalized ADMM based on a generalized proximal point algorithm. Very recently, combining the idea of semi-proximal terms, Xiao et al. [11] proposed a semi-proximal generalized ADMM for convex composite conic programming and illustrated numerically that the method is very promising for solving doubly nonnegative semidefinite programming problems. The method of Xiao et al. [11] relaxes all the variables with a factor in (0, 2), which has the potential of enhancing the performance of the classic ADMM. Additionally, in [11], Xiao et al. developed another variant of semi-proximal generalized ADMM with different semi-proximal terms, but its convergence has not been investigated so far. This paper aims to prove the global convergence of this semi-proximal generalized ADMM under some mild conditions, which may provide a theoretical foundation for potential practical applications.
The rest of this paper is organized as follows. In Sect. 2, we present some preliminary results and review some variants of ADMM. In Sect. 3, we establish the global convergence of the generalized ADMM with semi-proximal terms. In Sect. 4, we conclude this paper with some remarks.

Preliminaries
In this section, we provide some basic concepts and give a quick review of some variants of generalized ADMMs which will be used in the subsequent developments.

Basic concepts
Let E be a finite dimensional real Euclidean space with the inner product and the associated norm denoted by ⟨·, ·⟩ and ‖·‖, respectively. Let f : E → (-∞, +∞] be a closed proper convex function. The effective domain of f is defined as dom f = {x ∈ E | f(x) < +∞}. The subdifferential of f at x is the set ∂f(x) = {x* | f(z) ≥ f(x) + ⟨x*, z − x⟩, ∀z ∈ E}. Obviously, ∂f(x) is a closed convex set whenever it is nonempty. The point-to-set operator ∂f : x → ∂f(x) is monotone; more precisely, for any x, y ∈ E such that ∂f(x) and ∂f(y) are nonempty, it holds that ⟨x − y, u − v⟩ ≥ ‖x − y‖²_Σ for all u ∈ ∂f(x) and v ∈ ∂f(y), where Σ : E → E is a self-adjoint positive semidefinite linear operator. The Fenchel conjugate of a function f at y ∈ E is defined as f*(y) = sup_{x∈E} {⟨x, y⟩ − f(x)}. It is well known [12] that the conjugate function f* is always convex and closed, and that it is proper if and only if f is proper. Furthermore, (cl f)* = f* and f** = cl f, where cl f denotes the closure of f, i.e., the epigraph of cl f is the closure of the epigraph of the convex function f.
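The strengthened monotonicity of the subdifferential can be checked numerically on a small example. The sketch below is only an illustration and not part of the analysis: it assumes the hypothetical test function f(x) = (1/2)xᵀQx + μ‖x‖₁ with a singular positive semidefinite Q, for which one may take Σ = Q, and the names `subgrad` and `monotonicity_gap` are introduced here for illustration.

```python
import numpy as np

def subgrad(Q, mu, x):
    """One subgradient of f(x) = 0.5 * x^T Q x + mu * ||x||_1.

    np.sign returns 0 at x_i = 0, which is a valid element of the
    subdifferential of |.| at zero.
    """
    return Q @ x + mu * np.sign(x)

def monotonicity_gap(Q, mu, x, y):
    """Return <x - y, u - v> - ||x - y||_Q^2 for u in df(x), v in df(y)."""
    d = x - y
    return d @ (subgrad(Q, mu, x) - subgrad(Q, mu, y)) - d @ (Q @ d)

rng = np.random.default_rng(0)
R = rng.standard_normal((3, 5))
Q = R.T @ R  # 5 x 5 with rank at most 3: positive semidefinite but singular

gaps = [
    monotonicity_gap(Q, 0.7, rng.standard_normal(5), rng.standard_normal(5))
    for _ in range(1000)
]
min_gap = min(gaps)  # nonnegative up to floating-point rounding
```

In this example the gap equals μ⟨x − y, sign(x) − sign(y)⟩, so it is nonnegative even though Q is only semidefinite, which is the situation the operator Σ is meant to capture.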
Assuming that the solution set of the KKT system (3) is nonempty, the dual problem (2) can be solved by applying a splitting method to the inclusion problem (6), which is defined through two operators T_1 and T_2. It is easy to see that both T_1 and T_2 are maximal monotone operators. To solve (6), an equivalent form of the generalized proximal point algorithm of Eckstein and Bertsekas [4] can be started from any initial point v^0; it involves a relaxation factor ρ ∈ (0, 2) and the so-called resolvent operator J_{σT} = (I + σT)^{-1}.
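The role of the relaxation factor ρ and of the resolvent can be illustrated on a toy operator. The sketch below is not the splitting scheme for T_1 and T_2 used in the paper; it assumes the standard relaxed proximal point iteration v^{k+1} = (1 − ρ)v^k + ρ J_{σT}(v^k) applied to the hypothetical single operator T(v) = Qv − q with Q symmetric positive definite, and `relaxed_ppa` is an illustrative name.

```python
import numpy as np

def relaxed_ppa(Q, q, sigma=1.0, rho=1.5, iters=200):
    """Relaxed proximal point iteration for T(v) = Q v - q.

    Each step evaluates the resolvent J = (I + sigma*T)^{-1} at v and then
    relaxes with rho in (0, 2): v <- (1 - rho) * v + rho * J(v).
    """
    n = len(q)
    v = np.zeros(n)
    M = np.eye(n) + sigma * Q
    for _ in range(iters):
        # u = J(v) solves u + sigma * (Q u - q) = v
        u = np.linalg.solve(M, v + sigma * q)
        v = (1.0 - rho) * v + rho * u
    return v

Q = np.array([[2.0, 0.5], [0.5, 1.0]])
q = np.array([1.0, -1.0])
v_star = relaxed_ppa(Q, q)  # approximates the zero of T, i.e. Q v = q
```

For this linear operator, each eigen-direction of the error is multiplied by (1 − ρ) + ρ/(1 + σμ) with μ > 0 an eigenvalue of Q, which lies strictly between −1 and 1 exactly when ρ ∈ (0, 2).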

Proximal ADMM
In order to broaden the capability of the classic ADMM, Eckstein [5] added a proximal term to each subproblem, which leads to a scheme in which the weighting matrices S and T are positive definite. Moreover, Fazel et al. [7] showed that both weighting matrices can be chosen to be positive semidefinite, so that the method can be applied in more practical situations. For more details on its convergence results, one can refer to [7] and the references therein.
It can be observed that the subproblems in the generalized ADMM scheme (10a)-(10d) may not admit solutions because A or B is not assumed to have full row rank. One natural way to fix this problem is to add proximal terms to these subproblems. Very recently, Xiao et al. [11] suggested two approaches to achieve this purpose. One of them is to add the semi-proximal terms (1/2)‖x − x^{k−1}‖²_S and (1/2)‖y − y^{k−1}‖²_T to the subproblems for computing x^k and y^k, which gives the scheme (12a)-(12d). The other is to add proximal terms of the same form, weighted by S and T, but centered at the most recently updated values of the variables, which gives the scheme (13a)-(13d). Compared with the traditional proximal approach (12a)-(12d), the semi-proximal terms in (13a)-(13d) are more natural in the sense that the most recently updated values of the variables are involved. The global convergence of the iterative framework (13a)-(13d) has been analyzed in [11], and the corresponding numerical results illustrated that the proposed method can solve these problems both effectively and efficiently. In this paper, we concentrate on the convergence analysis of the algorithm based on the former iterative framework, (12a)-(12d), for solving the separable convex minimization problem (1).
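To make the combination of relaxation and proximal terms concrete, here is a minimal sketch on a lasso-type instance of (1). It is not a transcription of (12a)-(12d) or (13a)-(13d): it assumes the classical Eckstein-Bertsekas style relaxation (replace Ax^{k+1} by ρAx^{k+1} − (1 − ρ)(By^k − c)), the sign convention +⟨λ, Ax + By − c⟩, and the hypothetical choices S = s·I and T = t·I with proximal centers at the previous iterates. The function names and the test problem are illustrative.

```python
import numpy as np

def soft_threshold(z, kappa):
    """Proximal map of kappa * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - kappa, 0.0)

def relaxed_prox_admm(M, b, mu, sigma=1.0, rho=1.5, s=0.0, t=0.0, iters=500):
    """Sketch for min 0.5*||M x - b||^2 + mu*||y||_1  s.t.  x - y = 0.

    Here A = I, B = -I, c = 0; s and t weight the proximal terms
    s/2*||x - x_prev||^2 and t/2*||y - y_prev||^2 (zero disables them).
    """
    n = M.shape[1]
    x, y, lam = np.zeros(n), np.zeros(n), np.zeros(n)
    H = M.T @ M + (sigma + s) * np.eye(n)
    for _ in range(iters):
        # x-step: f(x) + <lam, x> + sigma/2*||x - y||^2 + s/2*||x - x_prev||^2
        x = np.linalg.solve(H, M.T @ b - lam + sigma * y + s * x)
        # relaxation with rho in (0, 2): blend the new x with the previous y
        h = rho * x + (1.0 - rho) * y
        # y-step: g(y) - <lam, y> + sigma/2*||h - y||^2 + t/2*||y - y_prev||^2
        y = soft_threshold((lam + sigma * h + t * y) / (sigma + t),
                           mu / (sigma + t))
        # multiplier update with the relaxed residual
        lam = lam + sigma * (h - y)
    return x, y, lam

b = np.array([3.0, 0.5, -2.0])
x_hat, y_hat, lam_hat = relaxed_prox_admm(np.eye(3), b, mu=1.0)
```

With M = I the minimizer is the soft-thresholded vector, so the output can be compared with `soft_threshold(b, 1.0)`; setting `s` and `t` to positive values adds the proximal terms without changing the limit on this instance.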

Global convergence
This section is devoted to analyzing the global convergence of the generalized ADMM based on the iterative framework (12a)-(12d). Since f and g are both closed proper convex functions, it is known that ∂f and ∂g are maximal monotone mappings [14], and hence there exist self-adjoint positive semidefinite linear operators Σ_f : X → X and Σ_g : Y → Y such that ⟨u − ũ, x − x̃⟩ ≥ ‖x − x̃‖²_{Σ_f} for all u ∈ ∂f(x), ũ ∈ ∂f(x̃), and ⟨v − ṽ, y − ỹ⟩ ≥ ‖y − ỹ‖²_{Σ_g} for all v ∈ ∂g(y), ṽ ∈ ∂g(ỹ); we refer to this property as (14). First, we state the detailed steps of the generalized ADMM with semi-proximal terms (abbreviated as sPGADM) as follows.
Assumption A There exists at least one vector (x̄, ȳ; λ̄) ∈ X × Y × Z such that the KKT system (3) is satisfied.
We now let {(x^k, y^k; λ^k)} be the sequence generated by sPGADM and (x̄, ȳ; λ̄) be a solution of the KKT system (3). For convenience, we denote x_e^k = x^k − x̄, y_e^k = y^k − ȳ, and λ_e^k = λ^k − λ̄. The first-order optimality condition of (15a), combined with (15b), yields the inclusion (17). Since x̄ and λ̄ satisfy the KKT system (3), we then obtain the inequality (19) from (14). Similarly, the first-order optimality condition of (15c) can be written as the inclusion (20). Thus, from the monotone property (14), we get (21), or, equivalently, −⟨λ_e^k, By_e^k⟩ − σ⟨Ax^k + By^k − c, By_e^k⟩ − ⟨T(y^k − y^{k−1}), y_e^k⟩ ≥ ‖y_e^k‖²_{Σ_g}.
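The inequality for the y-iterates can be traced back to the optimality condition in a few lines. The derivation below is a sketch under two assumptions that are not confirmed by the text available here: that the y-subproblem minimizes L_σ(x^k, y; λ^k) + (1/2)‖y − y^{k−1}‖²_T, and that the multiplier enters L_σ as +⟨λ, Ax + By − c⟩.

```latex
\begin{align*}
0 &\in \partial g(y^k) + B^*\lambda^k + \sigma B^*(Ax^k + By^k - c) + T(y^k - y^{k-1}),
\qquad
0 \in \partial g(\bar y) + B^*\bar\lambda,\\
u &:= -B^*\lambda^k - \sigma B^*(Ax^k + By^k - c) - T(y^k - y^{k-1}) \in \partial g(y^k),
\qquad
v := -B^*\bar\lambda \in \partial g(\bar y),\\
\|y_e^k\|_{\Sigma_g}^2 &\le \langle u - v,\; y^k - \bar y\rangle
 = -\langle \lambda_e^k,\, B y_e^k\rangle
   - \sigma\langle Ax^k + By^k - c,\; B y_e^k\rangle
   - \langle T(y^k - y^{k-1}),\; y_e^k\rangle .
\end{align*}
```

The last line uses ⟨B*w, y⟩ = ⟨w, By⟩ to move the adjoint onto y_e^k; the corresponding inequality for the x-iterates follows the same pattern with ∂f, A, S, and Σ_f.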
Adding both sides of (19) and (21) gives (22). Note that the first term on the left-hand side of (22) can be reorganized. The following two lemmas play a fundamental role in our convergence analysis. By using (26), together with a basic relation between inner products and norms, we obtain an expansion of 2⟨λ_e^{k+1} + σ(1 − ρ)Ax_e^{k+1}, σρ(Ax^{k+1} + By^k − c)⟩, which can be rewritten in an equivalent form. Since (x̄, ȳ; λ̄) satisfies the KKT system (3), we have Ax̄ + Bȳ = c and hence Ax^{k+1} + By^k − c = Ax_e^{k+1} + By_e^k; then from (28) we get (25).
Because ρ ∈ (0, 2), the sequence {φ_k} defined in (35) is nonincreasing and bounded. The boundedness of the sequence {(x^k, y^k; λ^k)} implies that there exists at least one convergent subsequence; for simplicity we denote it as {(x^{k_i}, y^{k_i}; λ^{k_i})} and its limit as (x^∞, y^∞; λ^∞). By using (17) and (20), we obtain the inclusions (41) and (42). Because f and g are closed proper convex functions, the subdifferential mappings ∂f and ∂g are closed. By noticing that ‖Ax^k + By^k − c‖² ≤ 2‖Ax_e^{k+1} + By_e^k‖² + 2‖Ax_e^k − Ax_e^{k+1}‖², and, as mentioned before, ‖Ax_e^{k+1} + By_e^k‖ → 0, ‖Ax_e^k − Ax_e^{k+1}‖ → 0, ‖x^{k+1} − x^k‖_S → 0, and ‖y^k − y^{k−1}‖_T → 0, we take limits along k_i on both sides of (41) and (42). It follows that (x^∞, y^∞; λ^∞) satisfies the KKT conditions (3). To complete the proof, we now show that (x^∞, y^∞; λ^∞) is the unique limit of the sequence {(x^k, y^k; λ^k)}. In fact, since (x^∞, y^∞; λ^∞) satisfies the KKT conditions, without loss of generality we can let (x̄, ȳ; λ̄) = (x^∞, y^∞; λ^∞). Thus, from the definition of φ_k in (35), there exists a subsequence {(x^{k_i}, y^{k_i}; λ^{k_i})} such that φ_{k_i} → 0. Together with the monotonicity and boundedness of {φ_k}, we know that the whole sequence {φ_k} converges to zero. By (35), it turns out that, as k → ∞, ‖λ_e^k + σ(1 − ρ)Ax_e^k‖ → 0, ‖x_e^k‖_S → 0, and ‖y_e^{k−1}‖_T → 0. Thus, lim_{k→∞} λ^k = λ̄ since 0 ≤ ‖λ_e^k‖ ≤ ‖λ_e^k + σ(1 − ρ)Ax_e^k‖ + σ|1 − ρ|‖Ax_e^k‖. Finally, from Σ_f + S + A*A ≻ 0 and x_e^k = x^k − x̄, it follows that lim_{k→∞} x^k = x̄.
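The two norm estimates used in this argument are elementary, and it may help to see them spelled out. The block below is a worked sketch using only the identity Ax̄ + Bȳ = c, the bound ‖a + b‖² ≤ 2‖a‖² + 2‖b‖², and the triangle inequality; it makes no assumption about the omitted displays (35), (41), and (42).

```latex
\begin{align*}
Ax^k + By^k - c &= Ax_e^k + By_e^k
  = \underbrace{(Ax_e^{k+1} + By_e^k)}_{a} + \underbrace{(Ax_e^k - Ax_e^{k+1})}_{b},\\
\|Ax^k + By^k - c\|^2 &= \|a + b\|^2 \le 2\|a\|^2 + 2\|b\|^2 ,\\
\|\lambda_e^k\| &= \bigl\|\bigl(\lambda_e^k + \sigma(1-\rho)Ax_e^k\bigr) - \sigma(1-\rho)Ax_e^k\bigr\|
  \le \|\lambda_e^k + \sigma(1-\rho)Ax_e^k\| + \sigma|1-\rho|\,\|Ax_e^k\| .
\end{align*}
```

The first estimate shows that the feasibility residual vanishes once both ‖a‖ and ‖b‖ tend to zero; the second shows that λ^k → λ̄ follows as soon as the combined quantity and ‖Ax_e^k‖ both tend to zero.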