Inertial proximal alternating minimization for nonconvex and nonsmooth problems
Journal of Inequalities and Applications volume 2017, Article number: 232 (2017)
Abstract
In this paper, we study the minimization problem of the type \(L(x,y)=f(x)+R(x,y)+g(y)\), where f and g are both nonconvex nonsmooth functions, and R is a smooth function we can choose. We present a proximal alternating minimization algorithm with inertial effect. We obtain the convergence by constructing a key function H that guarantees a sufficient decrease property of the iterates. In fact, we prove that if H satisfies the Kurdyka-Lojasiewicz inequality, then every bounded sequence generated by the algorithm converges strongly to a critical point of L.
1 Introduction
Nonconvex and nonsmooth optimization problems are extremely useful in many applied sciences, including statistics, machine learning, regression, and classification. One of the most practical and classical optimization problems is of the form
$$ \min_{x\in\mathbb {R}^{n}} f(x)+g(x). $$(1)
In this paper, we study the problem in the nonconvex and nonsmooth setting, where \(f, g: \mathbb {R}^{n}\to(-\infty,\infty]\) are proper lower semicontinuous functions. We aim at finding the critical points of
$$ L(x,y)=f(x)+R(x,y)+g(y) $$(2)
(with R being smooth) and possibly solving the corresponding minimization problem (1). This can be seen by setting
$$ R(x,y)=\frac{\rho}{2} \Vert x-y \Vert ^{2}, $$
where \(\rho>0\) is a relaxation parameter: for large ρ the quadratic coupling forces x and y close together, so that minimizing \(L(x,y)\) relaxes problem (1).
For problem (2), we introduce a proximal alternating minimization algorithm with inertial effect and investigate the convergence of the generated iterates. Inertial proximal methods go back to [1, 2], where it has been noticed that the discretization of a differential system of second order in time gives rise to a generalization of the classical proximal-point algorithm. The main feature of the inertial proximal algorithm is that the next iterate is defined by using the last two iterates. Recently, there has been an increasing interest in algorithms with inertial effect; see [3–12].
Generally, we consider the problem
$$ \min_{(x,y)} L(x,y)=f(x)+R(x,y)+g(y) $$
with \(x\in\mathbb {R}^{n}\) and \(y\in\mathbb {R}^{m}\).
In [13], the authors proposed the alternating minimization algorithm
$$ \textstyle\begin{cases} x_{k+1}\in\operatorname{argmin}_{x\in\mathbb {R}^{n}} \bigl\{ L(x,y_{k})+\frac{1}{2\lambda_{k}} \Vert x-x_{k} \Vert ^{2} \bigr\} , \\ y_{k+1}\in\operatorname{argmin}_{y\in\mathbb {R}^{m}} \bigl\{ L(x_{k+1},y)+\frac{1}{2\mu_{k}} \Vert y-y_{k} \Vert ^{2} \bigr\} , \end{cases} $$
which can be viewed as a proximal regularization of the two-block Gauss-Seidel method for minimizing L,
$$ \textstyle\begin{cases} x_{k+1}\in\operatorname{argmin}_{x\in\mathbb {R}^{n}} L(x,y_{k}), \\ y_{k+1}\in\operatorname{argmin}_{y\in\mathbb {R}^{m}} L(x_{k+1},y). \end{cases} $$
Inspired by [13], we propose the algorithm (3)-(4), which adds inertial terms built from the last two iterates to each proximal subproblem; a schematic numerical sketch is given after assumptions (H1)-(H6) below.
We need the following assumptions on the functions and parameters.
- (H1): \(f: \mathbb {R}^{n}\to(-\infty,\infty]\) and \(g: \mathbb {R}^{m}\to(-\infty,\infty]\) are proper lower semicontinuous functions;
- (H2): \(R: \mathbb {R}^{n}\times\mathbb {R}^{m}\to\mathbb {R}\) is a continuously differentiable function;
- (H3): ∇R is Lipschitz continuous on bounded subsets of \(\mathbb {R}^{n}\times\mathbb {R}^{m}\);
- (H4): \(\inf L>-\infty\);
- (H5): \(0<\mu_{-}\leq\mu_{k}\leq\mu_{+}\), \(0<\lambda_{-}\leq\lambda_{k}\leq\lambda_{+}\), \(0\leq\alpha_{k}\leq\alpha\), \(0\leq\beta_{k}\leq\beta\);
- (H6): \(\sigma>\max\{\alpha,\beta\}\cdot\max\{\lambda_{+},\mu_{+}\}\cdot(\sigma^{2}+1)\) for some \(\sigma>0\).
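To fix ideas, the following minimal numerical sketch illustrates one step of an inertial proximal alternating scheme of this type. Every concrete choice in it is an illustrative assumption rather than the exact updates (3)-(4): we take \(f=g= \Vert \cdot \Vert _{1}\) and the quadratic coupling \(R(x,y)=\frac{\rho}{2} \Vert x-y \Vert ^{2}\), and we let the inertial effect enter through the extrapolation points \(x_{k}+\alpha_{k}(x_{k}-x_{k-1})\) and \(y_{k}+\beta_{k}(y_{k}-y_{k-1})\); both subproblems then reduce to soft-thresholding.

```python
# A minimal sketch of one inertial proximal alternating step under the
# assumed choices f = g = ||.||_1 and R(x, y) = (rho/2) ||x - y||^2;
# an illustration, not the paper's exact updates (3)-(4).
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1 evaluated at v."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def inertial_pam_step(x, x_prev, y, y_prev, rho, lam, mu, alpha, beta):
    # Inertial extrapolation points built from the last two iterates.
    u = x + alpha * (x - x_prev)
    v = y + beta * (y - y_prev)
    # x-update: argmin_x ||x||_1 + (rho/2)||x - y||^2 + (1/(2*lam))||x - u||^2;
    # the two quadratics merge into (w_x/2)||x - c_x||^2 up to a constant.
    w_x = rho + 1.0 / lam
    c_x = (rho * y + u / lam) / w_x
    x_new = soft_threshold(c_x, 1.0 / w_x)
    # y-update uses the fresh x_new (Gauss-Seidel sweep).
    w_y = rho + 1.0 / mu
    c_y = (rho * x_new + v / mu) / w_y
    y_new = soft_threshold(c_y, 1.0 / w_y)
    return x_new, y_new

# Parameters chosen so that (H5)-(H6) hold with sigma = 1:
# max{alpha, beta} * max{lambda_+, mu_+} * (sigma^2 + 1) = 0.2 < 1 = sigma.
rng = np.random.default_rng(0)
x = x_prev = rng.normal(size=5)
y = y_prev = rng.normal(size=5)
for _ in range(50):
    x_new, y_new = inertial_pam_step(x, x_prev, y, y_prev,
                                     rho=1.0, lam=0.5, mu=0.5,
                                     alpha=0.2, beta=0.2)
    x_prev, y_prev, x, y = x, y, x_new, y_new
print(x, y)  # both tend to 0, the global minimizer of L in this toy model
```

With these choices, each partial objective is strongly convex, so the argmin in every step is unique and available in closed form; in the general nonconvex setting, the argmin may be a set, and any selection is taken.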
To prove the convergence of the algorithm under these assumptions, we construct a key function H, which is defined as in (11). Based on H, we can obtain a sufficient decrease property of the iterates, the existence of a subgradient lower bound for the iterate gap, and some important analytic features of the objective function. Finally, we can prove that every bounded sequence generated by the algorithm converges to a critical point of L if H satisfies the Kurdyka-Lojasiewicz inequality.
The rest of the paper is arranged as follows. In Section 2, we recall some elementary notions and facts of nonsmooth nonconvex analysis. In Section 3, we present a detailed proof of the convergence of the algorithm. In Section 4, we give a brief conclusion.
2 Preliminaries
In this section, we recall some definitions and results. Let \(\mathbb {N}\) be the set of nonnegative integers. For \(m\geq1\), the Euclidean scalar product and induced norm on \(\mathbb {R}^{m}\) are denoted by \(\langle\cdot ,\cdot\rangle\) and \(\Vert \cdot \Vert \), respectively.
The domain of a function \(f:\mathbb {R}^{m}\to(-\infty,\infty]\) is defined by \(\operatorname{dom} f=\{x\in\mathbb {R}^{m}: f(x)<\infty\}\). We say that f is proper if \(\operatorname{dom} f\neq\emptyset\). For the following generalized subdifferential notions and their basic properties, we refer to [14] and [15]. Let \(f:\mathbb {R}^{m}\to(-\infty,\infty]\) be a proper lower semicontinuous function. For \(x\in\operatorname{dom} f\), we consider the Fréchet (viscosity) subdifferential of f at x defined by the set
$$ \hat{\partial}f(x)= \biggl\{ v\in\mathbb {R}^{m}: \liminf_{y\to x, y\neq x}\frac{f(y)-f(x)-\langle v,y-x\rangle}{ \Vert y-x \Vert }\geq0 \biggr\} . $$
For \(x\notin\operatorname{dom} f\), we set \(\hat{\partial}f(x):=\emptyset\). The limiting (Mordukhovich) subdifferential of f at \(x\in\operatorname{dom} f\) is defined by
$$ \partial f(x)= \bigl\{ v\in\mathbb {R}^{m}: \exists x_{k}\to x, f(x_{k})\to f(x), v_{k}\in\hat{\partial}f(x_{k}), v_{k}\to v \bigr\} , $$
whereas for \(x\notin\operatorname{dom} f\), we take \(\partial f(x):=\emptyset\).
It is known that both notions of subdifferentials coincide with the convex subdifferential if f is convex, that is, \(\hat{\partial}f(x)=\partial f(x)=\{v\in\mathbb {R}^{m}:f(y)\geq f(x)+\langle v,y-x\rangle, \forall y\in\mathbb {R}^{m}\}\). Notice that if f is continuously differentiable around \(x\in\mathbb {R}^{m}\), then we have \(\partial f(x)=\{\nabla f(x)\}\). In general, the inclusion \(\hat{\partial}f(x)\subset\partial f(x)\) holds for each \(x\in\mathbb {R}^{m}\). The limiting subdifferential has a closed graph in the following sense:
$$ x_{k}\to x,\qquad f(x_{k})\to f(x),\qquad v_{k}\in\partial f(x_{k}),\qquad v_{k}\to v \quad\Longrightarrow\quad v\in\partial f(x). $$(6)
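As a simple illustration of these notions (a worked example of our own): for \(f(x)= \vert x \vert \) on \(\mathbb {R}\), we have \(\hat{\partial}f(0)=\partial f(0)=[-1,1]\), whereas for \(f(x)=- \vert x \vert \), we have \(\hat{\partial}f(0)=\emptyset\) but \(\partial f(0)=\{-1,1\}\); so the inclusion \(\hat{\partial}f(x)\subset\partial f(x)\) can be strict.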
The Fermat rule reads in this nonsmooth setting as follows: if \(x\in\mathbb {R}^{m}\) is a local minimizer of f, then
$$ 0\in\partial f(x). $$
Denote by
$$ \operatorname{crit} f= \bigl\{ x\in\mathbb {R}^{m}: 0\in\partial f(x) \bigr\} $$
the set of (limiting) critical points of f. Let us mention also the following subdifferential sum rule: if \(f: \mathbb {R}^{m}\to(-\infty,\infty]\) is proper lower semicontinuous and \(g: \mathbb {R}^{m}\to\mathbb {R}\) is a continuously differentiable function, then
$$ \partial(f+g) (x)=\partial f(x)+\nabla g(x) $$
for all \(x\in\mathbb {R}^{m}\).
We also denote \(\operatorname{dist}(x,\Omega)=\inf_{y\in\Omega} \Vert x-y \Vert \) for \(x\in\mathbb {R}^{m}\) and \(\Omega\subset\mathbb {R}^{m}\).
Now let us recall the Kurdyka-Lojasiewicz property, which plays an important role in the proof of the convergence of our algorithm.
Definition 2.1
Kurdyka-Lojasiewicz property; see [13, 16]
Let \(f: \mathbb {R}^{m}\to(-\infty,\infty]\) be a proper lower semicontinuous function. We say that f satisfies the Kurdyka-Lojasiewicz (KL) property at \(\bar{x}\in\operatorname{dom} \partial f=\{ x\in\mathbb {R}^{m}: \partial f(x)\neq\emptyset\}\) if there exist \(\eta >0\), a neighborhood U of x̄, and a continuous concave function \(\varphi: [0,\eta)\to[0,\infty)\) such that
- (i) \(\varphi(0)=0\);
- (ii) φ is continuously differentiable on \((0,\eta)\) and continuous at 0;
- (iii) \(\varphi'(s)>0\) for all \(s\in(0,\eta)\);
- (iv) for all \(x\in U\cap\{x\in\mathbb {R}^{m}: f(\bar{x})< f(x)< f(\bar{x})+\eta\}\), we have the KL inequality:
$$ \varphi' \bigl(f(x)-f(\bar{x}) \bigr) \operatorname{dist} \bigl(0, \partial f(x) \bigr)\geq1. $$(7)
If f satisfies the KL property at each point in \(\operatorname{dom} \partial f\), then we call f a KL function.
It is worth mentioning that many functions arising in applied science are KL functions (see [16]). In fact, semialgebraic functions, real subanalytic functions, semiconvex functions, and uniformly convex functions are all KL functions.
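As a simple worked example (our own illustration), take \(f(x)=x^{2}\) on \(\mathbb {R}\), \(\bar{x}=0\), and \(\varphi(s)=\sqrt{s}\). For every \(x\neq0\),
$$ \varphi' \bigl(f(x)-f(0) \bigr) \operatorname{dist} \bigl(0,\partial f(x) \bigr)=\frac{1}{2 \vert x \vert }\cdot \vert 2x \vert =1\geq1, $$
so f satisfies the KL inequality (7) at \(\bar{x}=0\) with \(U=\mathbb {R}\) and any \(\eta>0\).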
The following result (see [16], Lemma 6) is crucial to our convergence analysis.
Lemma 2.1
Let \(\Omega\subset\mathbb {R}^{m}\) be a compact set, and let \(f: \mathbb {R}^{m}\to(-\infty,\infty]\) be a proper lower semicontinuous function. Assume that f is constant on Ω and f satisfies the KL property at each point of Ω. Then there exist \(\epsilon>0, \eta>0\), and a continuous concave function φ such that
- (i) \(\varphi(0)=0\);
- (ii) φ is continuously differentiable on \((0,\eta)\) and continuous at 0;
- (iii) \(\varphi'(s)>0\) for all \(s\in(0,\eta)\);
- (iv) for all \(\bar{x}\in\Omega\) and all \(x\in\{x\in\mathbb {R}^{m}: \operatorname{dist}(x,\Omega)<\epsilon\}\cap\{x\in\mathbb {R}^{m}: f(\bar{x})<f(x)<f(\bar{x})+\eta\}\), we have the KL inequality:
$$ \varphi' \bigl(f(x)-f(\bar{x}) \bigr) \operatorname{dist} \bigl(0, \partial f(x) \bigr)\geq1. $$(8)
We also need the following two lemmas. The first one has often been used in the context of Fejér monotonicity techniques for proving convergence of classical algorithms for convex optimization problems or, more generally, for monotone inclusion problems (see [17]). The second one is easy to verify (see [12]).
Lemma 2.2
Let \(\{a_{n}\}_{n\in\mathbb {N}}\) and \(\{b_{n}\}_{n\in\mathbb {N}}\) be real sequences such that \(b_{n}\geq0\) for all \(n\in\mathbb {N}\), \(\{a_{n}\} _{n\in\mathbb {N}}\) is bounded below, and \(a_{n+1}+b_{n}\leq a_{n}\) for all \(n\in\mathbb {N}\). Then \(\{a_{n}\}_{n\in\mathbb {N}}\) is a monotonically decreasing and convergent sequence, and \(\sum_{n\in\mathbb {N}}b_{n}<+\infty\).
Lemma 2.3
Let \(\{a_{n}\}_{n\in\mathbb {N}}\) and \(\{b_{n}\}_{n\in\mathbb {N}}\) be nonnegative real sequences such that \(\sum_{n\in\mathbb {N}} b_{n}<\infty\) and \(a_{n+1}\leq a\cdot a_{n}+b\cdot a_{n-1}+b_{n}\) for all \(n\geq1\), where \(a\in\mathbb {R}, b\geq0\), and \(a+b<1\). Then \(\sum_{n\in\mathbb {N}} a_{n}<\infty\).
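As a quick numerical illustration of Lemma 2.3 (a sanity check of our own, not part of the proof in [12]), take \(a=0.5\), \(b=0.3\) (so \(a+b<1\)) and the summable perturbations \(b_{n}=2^{-n}\); running the recursion with equality in place of the inequality keeps \(\sum a_{n}\) finite:

```python
# Numerical illustration of Lemma 2.3 with a = 0.5, b = 0.3, b_n = 2^{-n}:
# since a + b = 0.8 < 1 and sum b_n < infinity, sum a_n stays finite.
a_coef, b_coef = 0.5, 0.3
a = [1.0, 1.0]                       # a_0, a_1 >= 0
for n in range(1, 200):
    # Worst case of a_{n+1} <= a * a_n + b * a_{n-1} + b_n (equality).
    a.append(a_coef * a[n] + b_coef * a[n - 1] + 2.0 ** (-n))
print(f"partial sum of a_n: {sum(a):.6f}")   # bounded as n grows
```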
3 The convergence of the algorithm
In this section, we prove the convergence of our algorithm. Motivated by [11] and [13], we divide the proof into three main steps, which are listed in the following three subsections, respectively.
Throughout this section, \(\{(x_{k},y_{k})\}\) denotes the sequence generated by (3)-(4).
3.1 A sufficient decrease property of the iterates
In this subsection, we construct the key function H and prove that the iterates have a sufficient decrease property.
Lemma 3.1
Under assumptions (H1)-(H6), the sequence \(\{(x_{k},y_{k})\}\) is well defined, and the regularized values \(L(x_{k},y_{k})+m_{1}( \Vert x_{k}-x_{k-1} \Vert ^{2}+ \Vert y_{k}-y_{k-1} \Vert ^{2})\) are decreasing. More precisely, there exist two positive constants \(m_{2}>m_{1}>0\) such that
$$ L(x_{k+1},y_{k+1})+m_{2} \bigl( \Vert x_{k+1}-x_{k} \Vert ^{2}+ \Vert y_{k+1}-y_{k} \Vert ^{2} \bigr)\leq L(x_{k},y_{k})+m_{1} \bigl( \Vert x_{k}-x_{k-1} \Vert ^{2}+ \Vert y_{k}-y_{k-1} \Vert ^{2} \bigr). $$(9)
Proof
Assumption (H4) indicates that, for any \(r_{1}, r_{2}>0\) and \((\bar{x},\bar{y}), (\hat{x},\hat{y})\in\mathbb {R}^{n}\times \mathbb {R}^{m}\), the functions
$$ x\mapsto f(x)+R(x,\bar{y})+\frac{r_{1}}{2} \Vert x-\hat{x} \Vert ^{2} $$
and
$$ y\mapsto g(y)+R(\bar{x},y)+\frac{r_{2}}{2} \Vert y-\hat{y} \Vert ^{2} $$
are coercive. Thus the sequence \(\{(x_{k},y_{k})\}\) is well defined.
Now we prove (9). Using the definition of \(x_{k+1}\) and \(y_{k+1}\) in (3) and (4), we have
This leads to
for any \(\sigma>0\). Thus it yields
Clearly assumption (H5) implies that
and
Thus
Set \(m_{1}=\max\{M_{1}, M_{2}\}, m_{2}=\min\{M_{3},M_{4}\}\). Then
An elementary verification shows that \(m_{2}>m_{1}>0\) under assumption (H6). □
Remark 3.1
Based on Lemma 3.1, we can define the new function
$$ H(z,w)=L(z)+m_{1} \Vert z-w \Vert ^{2}, $$(11)
where \(z=(x,y), w=(u,v)\), and \(\Vert z-w \Vert ^{2}= \Vert x-u \Vert ^{2}+ \Vert y-v \Vert ^{2}\). Set \(z_{k}=(x_{k},y_{k})\). Then Lemma 3.1 implies that the sequence \(\{H(z_{k+1},z_{k})\}\) is decreasing. This decrease property of the iterates, shown in Lemma 3.2 below, is of vital importance for the convergence proof. Thus, we call \(H(z,w)\) the key function.
More precisely, we have the following lemma.
Lemma 3.2
Let \(H(z,w)\) be defined as in (11). Then under assumptions (H1)-(H6), we have
$$ H(z_{k+1},z_{k})+(m_{2}-m_{1}) \Vert z_{k+1}-z_{k} \Vert ^{2}\leq H(z_{k},z_{k-1}), $$(12)
where \(z_{k}=(x_{k},y_{k})\); in particular, the sequence \(\{H(z_{k+1},z_{k})\}\) is decreasing.
Proof
Set \(m:=m_{2}-m_{1}>0\). Then the result follows directly from (9) or (10). □
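Continuing the numerical sketch from Section 1 (the same assumed model \(f=g= \Vert \cdot \Vert _{1}\), \(R(x,y)=\frac{\rho}{2} \Vert x-y \Vert ^{2}\); the constant m1 below is illustrative, not the one produced by Lemma 3.1), the decrease of \(H(z_{k+1},z_{k})\) can be monitored directly along the iterates:

```python
import numpy as np

def L_value(x, y, rho=1.0):
    # L(x, y) = ||x||_1 + (rho/2) ||x - y||^2 + ||y||_1 (assumed model).
    return np.abs(x).sum() + 0.5 * rho * np.sum((x - y) ** 2) + np.abs(y).sum()

def H_value(z, w, m1=0.1, rho=1.0):
    # Key function (11): H(z, w) = L(z) + m1 * ||z - w||^2.
    (x, y), (u, v) = z, w
    return L_value(x, y, rho) + m1 * (np.sum((x - u) ** 2) + np.sum((y - v) ** 2))

# Along iterates z_k = (x_k, y_k), Lemma 3.2 predicts the monotone decrease
# H(z_{k+1}, z_k) <= H(z_k, z_{k-1}) for a suitable m1.
```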
3.2 Norm estimate of the subdifferential of H
In this subsection, we prove that there exists a subgradient lower bound for the iterate gap. First, we estimate the norm of the subdifferential of L.
Lemma 3.3
Define
Then, under assumptions (H1)-(H6), \((p_{k+1},q_{k+1})\in\partial L(x_{k+1},y_{k+1})\). Moreover, if \(\{(x_{k},y_{k})\}\) is bounded, then there exists a positive constant \(C_{1}>0\) such that
Proof
According to the definition of \(x_{k+1}\) and \(y_{k+1}\) and the Fermat rule, we get
Thus
and
Using assumption (H3), we obtain that
and
where \(\ell\) is a Lipschitz constant of \(\nabla R\) on a bounded set containing the sequence \(\{(x_{k},y_{k})\}\).
Hence the norm estimate can be immediately derived. □
The norm estimate of the subdifferential of H is a direct consequence of Lemma 3.3.
Lemma 3.4
For all \(k\in\mathbb {N}\), \(H(z,w)\) has a subdifferential at \((z_{k+1},z_{k})\) of the form
$$ \omega_{k+1}:= \bigl(p_{k+1}+2m_{1}(x_{k+1}-x_{k}), q_{k+1}+2m_{1}(y_{k+1}-y_{k}), 2m_{1}(x_{k}-x_{k+1}), 2m_{1}(y_{k}-y_{k+1}) \bigr)\in\partial H(z_{k+1},z_{k}). $$(15)
Moreover, there exists a positive constant \(C_{2}>0\) such that
Proof
According to the definition of \(H(z,w)\) and the subdifferential sum rule, we get
$$ \partial H(z,w)= \bigl(\partial L(z)+2m_{1}(z-w) \bigr)\times \bigl\{ 2m_{1}(w-z) \bigr\} . $$
The rest is immediately obtained. □
This norm estimate, together with the closedness of the limiting subdifferential, is used to obtain the following convergence properties of the sequence \(\{(x_{k},y_{k})\}\).
Lemma 3.5
Preconvergence result
Under assumptions (H1)-(H6), we have the following statements:
- (i) \(\sum_{k=1}^{\infty} \Vert z_{k+1}-z_{k} \Vert ^{2}<\infty\); in particular, \(\Vert x_{k+1}-x_{k} \Vert \to0\) and \(\Vert y_{k+1}-y_{k} \Vert \to0\) as \(k\to\infty\);
- (ii) the sequence \(\{L(x_{k},y_{k})\}\) is convergent;
- (iii) the sequence \(\{H(z_{k+1},z_{k})\}\) is convergent;
- (iv) if \(\{(x_{k},y_{k})\}\) has a cluster point \((x^{*},y^{*})\), then \((x^{*},y^{*})\in\operatorname{crit} L\).
Proof
Set \(a_{k}:=L(x_{k},y_{k})+m_{1}( \Vert x_{k}-x_{k-1} \Vert ^{2}+ \Vert y_{k}-y_{k-1} \Vert ^{2})\) and \(b_{k}:=(m_{2}-m_{1})( \Vert x_{k+1}-x_{k} \Vert ^{2}+ \Vert y_{k+1}-y_{k} \Vert ^{2})\). Then Lemma 3.2 gives \(a_{k+1}+b_{k}\leq a_{k}\), and assumption (H4) ensures that \(\{a_{k}\}\) is bounded below. Thus Lemma 2.2 implies (i) and (ii). Moreover, the definition of \(H(z,w)\) yields that
$$ H(z_{k+1},z_{k})=L(x_{k+1},y_{k+1})+m_{1} \bigl( \Vert x_{k+1}-x_{k} \Vert ^{2}+ \Vert y_{k+1}-y_{k} \Vert ^{2} \bigr)=a_{k+1}. $$
Thus (iii) is derived from (i) and (ii).
Now let \(\{(x_{k_{j}},y_{k_{j}})\}\) be a subsequence of \(\{(x_{k},y_{k})\}\) such that \((x_{k_{j}},y_{k_{j}})\to(x^{*},y^{*})\) as \(j\to\infty\). Since f is lower semicontinuous, we have
$$ \liminf_{j\to\infty}f(x_{k_{j}})\geq f \bigl(x^{*} \bigr). $$
On the other hand, the definition of \(x_{k+1}\) shows that
from which we get
Hence
where we have used assumption (H5) and replaced \(x_{k}, y_{k}\) by \(x_{k_{j}}, y_{k_{j}}\).
Due to the fact that \(\Vert x_{k+1}-x_{k} \Vert \to0\) from (i), we have \(\Vert x_{k_{j}+1}-x_{k_{j}} \Vert \to0\). This, together with \(\Vert x_{k_{j}}-x^{*} \Vert \to 0\), yields \(\Vert x_{k_{j}+1}-x^{*} \Vert \to0\). Using the continuity of \(R(x,y)\) by assumption (H2), the last inequality yields
$$ \limsup_{j\to\infty}f(x_{k_{j}})\leq f \bigl(x^{*} \bigr). $$
Therefore
$$ \lim_{j\to\infty}f(x_{k_{j}})=f \bigl(x^{*} \bigr). $$
In a similar way, we can prove that \(\lim_{j\to\infty}g(y_{k_{j}})=g(y^{*})\). Combining these with the continuity of \(R(x,y)\), we immediately obtain that
$$ \lim_{j\to\infty}L(x_{k_{j}},y_{k_{j}})=L \bigl(x^{*},y^{*} \bigr). $$
On the other hand, Lemma 3.5(i) and Lemma 3.3 give \((p_{k_{j}+1},q_{k_{j}+1})\in\partial L(x_{k_{j}+1},y_{k_{j}+1})\) and \((p_{k_{j}+1},q_{k_{j}+1})\to0\) as \(j\to\infty\). Thus the closedness of the limiting subdifferential (see (6)) indicates that \(0\in \partial L(x^{*},y^{*})\). □
3.3 Analytic property of the key function H
Denote by Ω the set of cluster points of the sequence \(\{(z_{k+1},z_{k})\}\).
Lemma 3.6
Suppose that the sequence \(\{(x_{k},y_{k})\}\) is bounded. Under assumptions (H1)-(H6), we have the following:
- (i) Ω is nonempty, compact, and connected; moreover, \(\operatorname{dist}((z_{k+1},z_{k}),\Omega)\to0\) as \(k\to\infty\);
- (ii) \(\Omega\subset\operatorname{crit} H=\{(z^{*},z^{*}):z^{*}=(x^{*},y^{*})\in\operatorname{crit} L\}\);
- (iii) H is finite and constant on Ω.
Proof
(i) This can be checked by elementary arguments (see, e.g., [16]).
(ii) According to Lemma 3.5(i), \(\sum_{k=1}^{\infty} \Vert z_{k+1}-z_{k} \Vert ^{2}<\infty\), and hence \(\Vert z_{k+1}-z_{k} \Vert \to0\) as \(k\to\infty\). Note that \(z_{k}=(x_{k},y_{k})\), so Lemma 3.5(iv) yields \(\Omega\subset\{(z^{*},z^{*}):z^{*}=(x^{*},y^{*})\in\operatorname{crit} L\}\). On the other hand, from (15) we see that
$$ \partial H(z,w)= \bigl(\partial L(z)+2m_{1}(z-w) \bigr)\times \bigl\{ 2m_{1}(w-z) \bigr\} , $$
so \(0\in\partial H(z,w)\) if and only if \(z=w\) and \(0\in\partial L(z)\). Thus \(\operatorname{crit} H=\{(z^{*},z^{*}):z^{*}=(x^{*},y^{*})\in\operatorname{crit} L\}\), and hence \(\Omega\subset\operatorname{crit} H\).
(iii) Notice that \(\{L(x_{k},y_{k})\}\) is convergent by Lemma 3.5(ii); denote \(L^{*}=\lim_{k\to\infty}L(x_{k},y_{k})\). For any \((z^{*},z^{*})\in\Omega\), we have \(z^{*}=(x^{*},y^{*})\in\operatorname{crit} L\), and there exists a subsequence \(\{(x_{k_{j}},y_{k_{j}})\}\) of \(\{(x_{k},y_{k})\}\) such that \((x_{k_{j}},y_{k_{j}})\to(x^{*},y^{*})\). So
$$ L \bigl(x^{*},y^{*} \bigr)=\lim_{j\to\infty}L(x_{k_{j}},y_{k_{j}})=L^{*}. $$
Thus
$$ H \bigl(z^{*},z^{*} \bigr)=L \bigl(x^{*},y^{*} \bigr)=L^{*}, $$
that is, H is finite and constant on Ω. □
Theorem 3.1
Convergence
Assume that \(H(z,w)\) is a KL function and that the sequence \(\{ (x_{k},y_{k})\}\) is bounded. Then, under assumptions (H1)-(H6), we have
- (i) \(\sum_{k=1}^{\infty} \Vert z_{k}-z_{k-1} \Vert <\infty\), that is, \(\sum_{k=1}^{\infty}( \Vert x_{k}-x_{k-1} \Vert + \Vert y_{k}-y_{k-1} \Vert )<\infty\);
- (ii) \(\{(x_{k},y_{k})\}\) converges to a critical point \((x^{*},y^{*})\) of \(L(x,y)\).
Proof
According to Lemma 3.6, we consider an element \((x^{*},y^{*})\in\operatorname{crit} L(x,y)\) such that \((z^{*},z^{*})\in \Omega\), where \(z^{*}=(x^{*},y^{*})\). From the previous proof we can easily obtain that \(\lim_{k\to\infty }H(z_{k+1},z_{k})=H(z^{*},z^{*})\). Next, we prove the theorem in two cases.
Case 1. There exists a positive integer \(k_{0}\) such that \(H(z_{k_{0}+1},z_{k_{0}})=H(z^{*},z^{*})\).
Since \(\{H(z_{k+1},z_{k})\}\) is decreasing, we know that \(H(z_{k+1},z_{k})=H(z^{*},z^{*})\) for all \(k\geq k_{0}\). This, together with (12), shows that \(z_{k}=z_{k_{0}}\) for all \(k\geq k_{0}\), and the desired results follow.
Case 2. \(H(z_{k+1},z_{k})>H(z^{*},z^{*})\) for all \(k\in\mathbb {N}\).
Since H satisfies the KL property, Lemma 2.1 says that there exist \(\epsilon, \eta>0\) and a concave function φ such that
- (i) \(\varphi(0)=0\);
- (ii) φ is continuously differentiable on \((0,\eta)\) and continuous at 0;
- (iii) \(\varphi'(s)>0\) for all \(s\in(0,\eta)\);
- (iv) for all
$$\begin{aligned} (z,w)\in{}& \bigl\{ (z,w)\in \bigl(\mathbb {R}^{n}\times\mathbb {R}^{m} \bigr)^{2}: \operatorname{dist} \bigl((z,w),\Omega \bigr)< \epsilon \bigr\} \\ &{}\cap \bigl\{ (z,w)\in \bigl(\mathbb {R}^{n}\times\mathbb {R}^{m} \bigr)^{2}: H \bigl(z^{*},z^{*} \bigr)< H(z,w)< H \bigl(z^{*},z^{*} \bigr)+\eta \bigr\} , \end{aligned}$$(18)
we have
$$\varphi' \bigl(H(z,w)-H \bigl(z^{*},z^{*} \bigr) \bigr) \operatorname{dist} \bigl(0, \partial H(z,w) \bigr)\geq1. $$
Notice that \(H(z_{k+1},z_{k})\to H(z^{*},z^{*})\) as \(k\to\infty\) and \(H(z_{k+1},z_{k})>H(z^{*},z^{*})\). Let \(k_{1}\) be such that \(H(z^{*},z^{*})< H(z_{k+1},z_{k})< H(z^{*},z^{*})+\eta\) for all \(k\geq k_{1}\). By Lemma 3.6(i) there exists \(k_{2}\) such that \(\operatorname{dist}((z_{k+1},z_{k}),\Omega)<\epsilon\) for all \(k\geq k_{2}\). Take \(k_{3}=\max\{k_{1},k_{2}\}\). Then, for \(k\geq k_{3}\), \((z_{k+1},z_{k})\) belongs to the intersection in (18). Hence
$$\varphi' \bigl(H(z_{k+1},z_{k})-H \bigl(z^{*},z^{*} \bigr) \bigr)\operatorname{dist} \bigl(0,\partial H(z_{k+1},z_{k}) \bigr) \geq1, \quad \forall k\geq k_{3}. $$
Due to the concavity of φ,
By Lemma 3.4 there exist a point \(\omega_{k+1}\in\partial H(z_{k+1},z_{k})\) defined as in (15) and a positive constant \(C_{2}>0\) such that
Thus
From Lemma 3.2 we have \(H(z_{k},z_{k-1})-H(z_{k+1},z_{k})\geq m \Vert z_{k+1}-z_{k} \Vert ^{2}\). Thus
Set \(b_{k}=\frac{C_{2}}{m}(\varphi(H(z_{k},z_{k-1})-H(z^{*},z^{*}))-\varphi (H(z_{k+1},z_{k})-H(z^{*},z^{*})))\geq0\), \(a_{k}= \Vert z_{k}-z_{k-1} \Vert \geq0\). Then (19) can be equivalently rewritten as
Since \(\varphi\geq0\), we know that
and hence \(\sum_{k=1}^{\infty}b_{k}<\infty\). Note that from (20) we have
So Lemma 2.3 gives that \(\sum_{k=1}^{\infty}a_{k}<\infty\), that is, \(\sum_{k=1}^{\infty} \Vert z_{k}-z_{k-1} \Vert <\infty\), which is equivalent to \(\sum_{k=1}^{\infty}( \Vert x_{k}-x_{k-1} \Vert + \Vert y_{k}-y_{k-1} \Vert )<\infty\). This indicates that \(\{z_{k}\}\) is a Cauchy sequence, so \(\{z_{k}\}=\{(x_{k},y_{k})\}\) is convergent, say \((x_{k},y_{k})\to(x^{*},y^{*})\) as \(k\to\infty\). According to Lemma 3.5(iv), \((x^{*},y^{*})\) is a critical point of L. □
4 Conclusion
In this paper, we present a proximal alternating minimization algorithm with inertial effect for the minimization problem of the type \(L(x,y)=f(x)+R(x,y)+g(y)\), where f and g are both nonconvex nonsmooth functions, and R is a smooth coupling function. We prove that every bounded sequence generated by the algorithm converges to a critical point of L. The key point is to construct a function H (see (11)) that satisfies the Kurdyka-Lojasiewicz inequality. It is worth mentioning that assumption (H6) requires
$$ \sigma>\max\{\alpha,\beta\}\cdot\max\{\lambda_{+},\mu_{+}\}\cdot \bigl(\sigma^{2}+1 \bigr) $$
for some \(\sigma>0\), which can be achieved by an appropriate choice of the parameters.
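For a concrete check (a numerical example of our own), take \(\sigma=1\): then (H6) reduces to \(\max\{\alpha,\beta\}\cdot\max\{\lambda_{+},\mu_{+}\}<\frac{1}{2}\), which holds, for instance, for \(\alpha=\beta=0.2\) and \(\lambda_{+}=\mu_{+}=1\).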
References
Alvarez, F: On the minimizing property of a second order dissipative system in Hilbert spaces. SIAM J. Control Optim. 38(4), 1102-1119 (2000)
Alvarez, F, Attouch, H: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 9, 3-11 (2001)
Maingé, PE, Moudafi, A: Convergence of new inertial proximal methods for DC programming. SIAM J. Optim. 19(1), 397-413 (2008)
Beck, A, Teboulle, M: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183-202 (2009)
Ochs, P, Chen, Y, Brox, T, Pock, T: iPiano: inertial proximal algorithm for non-convex optimization. SIAM J. Imaging Sci. 7, 1388-1419 (2014)
Bot, RI, Csetnek, ER, Hendrich, C: Inertial Douglas-Rachford splitting for monotone inclusion problems. Appl. Math. Comput. 256, 472-487 (2015)
Bot, RI, Csetnek, ER: An inertial alternating direction method of multipliers. Minimax Theory Appl. 1, 29-49 (2016)
Chambolle, A, Dossal, C: On the convergence of the iterates of the ‘fast iterative shrinkage/thresholding algorithm’. J. Optim. Theory Appl. 166, 968-982 (2015)
Chen, C, Ma, S, Yang, J: A general inertial proximal point algorithm for mixed variational inequality problem. SIAM J. Optim. 25, 2120-2142 (2015)
Dong, QL, Lu, YY, Yang, JF: The extragradient algorithm with inertial effects for solving the variational inequality. Optimization 65(12), 2217-2226 (2016)
Bot, RI, Csetnek, ER, Laszlo, SC: An inertial forward-backward algorithm for the minimization of the sum of two nonconvex functions. EURO J. Comput. Optim. 4(1), 1-23 (2016)
Bot, RI, Csetnek, ER: An inertial Tseng’s type proximal algorithm for nonsmooth and nonconvex optimization problems. J. Optim. Theory Appl. 171(2), 600-616 (2016)
Attouch, H, Bolte, J, Redont, P, Soubeyran, A: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Lojasiewicz inequality. Math. Oper. Res. 35(2), 428-457 (2010)
Mordukhovich, B: Variational Analysis and Generalized Differentiation, I: Basic Theory, II: Applications. Springer, Berlin (2006)
Rockafellar, RT, Wets, RJ-B: Variational Analysis. Fundamental Principles of Mathematical Sciences, vol. 317. Springer, Berlin (1998)
Bolte, J, Sabach, S, Teboulle, M: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program., Ser. A 146(1-2), 459-494 (2014)
Bauschke, HH, Combettes, PL: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer, New York (2011)
Acknowledgements
The authors would like to express their sincere thanks to the referees for their valuable comments, which notably improved the presentation of this manuscript. The authors also thank Professor Qiaoli Dong for her helpful advice.
Funding
This research was supported by National Natural Science Foundation of China (No. 61503385), and the Science Research Foundation in CAUC (No. 2011QD02S).
Author information
Contributions
All the authors contributed, read, and approved this manuscript.
Ethics declarations
Competing interests
We confirm that we have read SpringerOpen’s guidance on competing interests and that none of the authors has any financial or nonfinancial competing interests in the manuscript.
Cite this article
Zhang, Y., He, S. Inertial proximal alternating minimization for nonconvex and nonsmooth problems. J Inequal Appl 2017, 232 (2017). https://doi.org/10.1186/s13660-017-1504-y