A modified exact smooth penalty function for nonlinear constrained optimization

Bingzhuang, Liu; Wenling, Zhao

doi:10.1186/1029-242X-2012-173

Research
Open access
Published: 06 August 2012


Journal of Inequalities and Applications volume 2012, Article number: 173 (2012)

Abstract

In this paper, a modified simple penalty function is proposed for a constrained nonlinear programming problem by augmenting the dimension of the program with a variable that controls the weight of the penalty terms. This penalty function enjoys improved smoothness. Under mild conditions, it can be proved to be exact in the sense that local minimizers of the original constrained problem are precisely the local minimizers of the associated penalty problem.

MSC:47H20, 35K55, 90C30.

1 Introduction

Merit functions have always played an important role in optimization. Traditionally, a merit function for a nonlinear program is constructed by adding penalty or barrier terms for the constraints to the objective function or to a corresponding Lagrange function. It can then be minimized by software for unconstrained or bound-constrained optimization, or by sequential quadratic programming (SQP) techniques. Whatever technique is used, the merit function depends on a small parameter ε or, equivalently, on a large parameter $ρ = ε^{- 1}$. As $ε \to 0$, the minimizer of a merit function such as a barrier function or the quadratic penalty function converges to a minimizer of the original problem. With an exact penalty function such as the $l_{1}$ penalty function (see [1, 9, 20–22]), the minimizer of the corresponding penalty problem is a minimizer of the original problem once ε is sufficiently small. There are also nonsmooth penalty functions for nonsmooth optimization problems, such as the exact penalty function based on the distance function for the nonsmooth variational inequality problem in Hilbert spaces [18] and the one in [19].

The traditional exact penalty functions [8] are always nonsmooth. When such a function is used as a merit function to accept a new iterate in an SQP method, it may cause the Maratos effect [13]. On the other hand, a traditional smooth penalty function such as the quadratic penalty function cannot be exact, so a sequence of minimization subproblems must be solved as $ε \to 0$. Ill-conditioning may then occur when the penalty parameter becomes too large or too small, which makes the computation difficult. In [3] and [6], augmented Lagrangian penalty functions with improved exactness have been proposed under strong conditions. In [11], exact penalty functions via a regularized gap function for variational inequalities have also been given. All these functions enjoy some smoothness, but exploiting it requires second-order or third-order derivative information of the problem functions, which is difficult to estimate in practice. Besides, all the above kinds of penalty functions (see [2–4, 7, 16] for a summary) may be unbounded below even when the constrained problem is bounded, which may make it difficult to locate a minimizer.

In the paper [10], a new penalty function is proposed for the constrained optimization problem. By augmenting the dimension of the program with an additional variable ε that controls the weight of the penalty terms, this new penalty function enjoys properties of smoothness and exactness, and remains bounded below under reasonable conditions. Its important new idea is that the penalty function is considered as a function of variable x and the additional variable ε simultaneously. Under proper assumptions, the minimizer $(x^{*}, ε_{*})$ of the merit function satisfies $ε_{*} = 0$ , and $x^{*}$ is a minimizer of the original problem. However, the penalty function given in [10] is not smooth in a small neighborhood of $(x^{*}, 0)$ , where the minimizer of the original constrained problem lies. In this paper, we give a penalty function which enjoys the properties of the penalty function given in [10] and has improved smoothness.

The rest of this paper is organized as follows. In Section 2, a penalty function is introduced for a smooth nonlinear optimization problem with equality constraints and bound constraints. The smoothness of this penalty function is discussed, as well as other properties, including boundedness below under mild assumptions. Section 3 shows the exactness of our penalty function in the following sense: under certain conditions, every local minimizer of our penalty function has the form $(x^{*}, ε_{*})$ with $ε_{*} = 0$, where $x^{*}$ is a local minimizer of the original problem, and a converse result holds.

Notation Throughout this paper, we use the Euclidean norm $∥ x ∥ = \sqrt{\sum x_{k}^{2}}$ . The subvector of x indexed by the indices in J is denoted by $x_{J}$ . We denote sets of the form

[\underline{x}, \bar{x}] := \{ x \in R^{n} \mid \underline{x} \leq x \leq \bar{x} \},

where the lower bound $\underline{x} \in {(R \cup \{- \infty\})}^{n}$ and the upper bound $\bar{x} \in {(R \cup \{\infty\})}^{n}$ are vectors containing proper or infinite bounds on the components of x, and $[\underline{x}, \bar{x}]$ is referred to as an n-dimensional box.

2 New penalty function

We consider the smooth nonlinear optimization problem with equality constraints and bound constraints:

(P) \quad \begin{array}{l} \min f (x) \\ \text{s.t. } x \in [u, v], \; F (x) = 0, \end{array}

(2.1)

where $[u, v]$ is a box in $ℜ^{n}$ with nonempty interior, $f : D \to ℜ$ and $F : D \to ℜ^{m}$ are continuously differentiable in an open set D containing $[u, v]$ and $m < n$ . We fix $w \in ℜ^{m}$ and consider the equivalent problem:

(\bar{P}) \quad \begin{array}{l} \min f (x) \\ \text{s.t. } F_{j} (x) = ε^{γ} w_{j}, \; j = 1, \dots, m, \\ \phantom{\text{s.t. }} x \in [u, v], \; ε = 0, \end{array}

(2.2)

where $γ > 0$ .

Let $\bar{ε} > 0$ be an upper bound of the parameter ε. Then the corresponding penalty function $f_{σ}$ on $D \times [0, \bar{ε}]$ for ( $\bar{P}$ ) is given as follows:

f_{σ} (x, ε) = \begin{cases} f (x) & \text{if } ε = △ (x, ε) = 0, \\ f (x) + \frac{1}{ε^{α}} \frac{△ (x, ε)}{1 - q △ (x, ε)} + σ ε^{β} & \text{if } ε > 0, \; △ (x, ε) < q^{- 1}, \\ + \infty & \text{otherwise} \end{cases}

(2.3)

with the constraint violation measure

△ (x, ε) : = {∥ F (x) - ε^{γ} w ∥}^{2},

where, in addition, $γ > α \geq 2 β \geq 2$ and $q > 0$ are fixed numbers, $σ > 0$ is a penalty parameter, and $∥ \cdot ∥$ is the Euclidean norm, with $∥ x ∥ = \sqrt{x^{T} x}$ for any vector x.
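As an illustration of definition (2.3), the following Python sketch evaluates the violation measure △ and the penalty function $f_{σ}$ for user-supplied callables. The function names `violation` and `f_sigma`, the default values α = 2, β = 1, γ = 3, q = 1, and the toy problem in the usage lines are assumptions made for this example and are not taken from the paper. The exact floating-point comparison used for the first case of (2.3) is also a simplification.

```python
import math


def violation(F, w, x, eps, gamma):
    """Constraint violation measure: ||F(x) - eps**gamma * w||**2."""
    return sum((Fj - eps**gamma * wj) ** 2 for Fj, wj in zip(F(x), w))


def f_sigma(f, F, w, x, eps, sigma, alpha=2.0, beta=1.0, gamma=3.0, q=1.0):
    """Penalty function (2.3); returns math.inf outside its finite domain.

    The defaults satisfy gamma > alpha >= 2*beta >= 2 and q > 0 (assumed values).
    """
    delta = violation(F, w, x, eps, gamma)
    if eps == 0.0 and delta == 0.0:
        # First case of (2.3): feasible point with eps = 0.
        return f(x)
    if eps > 0.0 and delta < 1.0 / q:
        # Second case of (2.3): smooth branch.
        return f(x) + delta / (eps**alpha * (1.0 - q * delta)) + sigma * eps**beta
    return math.inf


# Hypothetical toy problem: minimize x0 + x1 subject to x0 * x1 - 1 = 0.
toy_f = lambda x: x[0] + x[1]
toy_F = lambda x: [x[0] * x[1] - 1.0]

print(f_sigma(toy_f, toy_F, [1.0], (1.0, 1.0), 0.0, sigma=10.0))  # feasible, eps = 0
print(f_sigma(toy_f, toy_F, [1.0], (1.0, 1.0), 0.5, sigma=10.0))  # smooth branch
print(f_sigma(toy_f, toy_F, [1.0], (2.0, 2.0), 0.5, sigma=10.0))  # violation too large
```

Returning `math.inf` for the third case keeps the function usable inside a generic minimizer that only compares function values.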

Obviously, $ε = △ (x, ε) = 0$ if and only if $ε = 0$ , $F_{j} (x) = 0$ , $j = 1, \dots, m$ . The corresponding penalty problem then reads

(P_{σ}) \quad \begin{array}{l} \min f_{σ} (x, ε) \\ \text{s.t. } (x, ε) \in [u, v] \times [0, \bar{ε}] . \end{array}

(2.4)

The main difference between (2.3) and the penalty function given in [10] is that in (2.3), $β (ε) = ε^{β}$ , which does not have the property that $β^{'} (ε) \to + \infty$ , as $ε \to 0^{+}$ .

It is easy to see that $f_{σ} (x, ε)$ is continuously differentiable with respect to $(x, ε)$ on

D_{q} := \{ (x, ε) \in D \times (0, \bar{ε}] \mid 0 \leq △ (x, ε) < \frac{1}{q} \} .

2.1 Boundedness of the penalty function

If $F (x) = 0$ , $(x, ε) \in D_{q}$ , then

f_{σ} (x, ε) = f (x) + \frac{ε^{2 γ - α} {∥ w ∥}^{2}}{1 - q ε^{2 γ} {∥ w ∥}^{2}} + σ ε^{β} \geq f_{σ} (x, 0) = f (x) .

(2.5)

Therefore, $f_{σ} (x, ε)$ is bounded below on the set

D^{'} = \{ x \in [u, v] \mid ∥ F (x) ∥ \leq q^{- 1 / 2} + {\bar{ε}}^{γ} ∥ w ∥ \},

whenever $f (x)$ is bounded below on the set $D^{'}$ . This is a reasonable condition since it usually holds when f is bounded below on the feasible set, $\bar{ε}$ is small enough, and q is large enough.

The denominator $1 - q △ (x, ε)$ is included because it forces the level sets of $f_{σ}$ to remain in the set $\{ (x, ε) \in ℜ^{n + 1} \mid △ (x, ε) < q^{- 1} \}$, so that, in some sense, they do not move far away from the feasible set of (P).

Now we see a simple example:

\begin{array}{l} min x^{7} \\ s.t. x^{2} - 1 = 0, \\ x \in ℜ . \end{array}

It has a bounded feasible domain, a global minimizer at $x^{*} = - 1$ with $f (x^{*}) = - 1$ , and a local minimizer $x = 1$ . The traditional quadratic penalty function for this problem

P (x) = x^{7} + \frac{1}{ε} {(x^{2} - 1)}^{2},

is unbounded below for all penalty parameters $ε > 0$ since, e.g., $P (x) \to - \infty$ for $x = - s$, $s \to + \infty$. The same holds for other traditional penalty functions, including multiplier penalty functions that use an additional term $+ λ (x^{2} - 1)$. On the other hand, our new penalty function is bounded below. With $w = 1$, it reads

f_{σ} (x, ε) = \begin{cases} x^{7} & \text{if } ε = r = 0, \\ x^{7} + \frac{1}{ε^{α}} \frac{r^{2}}{1 - q r^{2}} + σ ε^{β} & \text{if } ε > 0, \; | r | < q^{- 1 / 2}, \\ + \infty & \text{otherwise}, \end{cases}

where $r = 1 + ε^{γ} - x^{2}$. Since $f_{σ} (x, ε) = + \infty$ whenever $| x | \geq \sqrt{q^{- 1 / 2} + 1 + ε^{γ}}$, our penalty function is bounded below. See Figure 1 for a contour plot of the penalty function for this example.
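The contrast between the two penalty functions in this example can be checked numerically. The sketch below assumes the parameter values α = 2, β = 1, γ = 3, q = 1 and w = 1 (chosen here only to satisfy $γ > α \geq 2 β \geq 2$), and the function names are hypothetical. Under these assumptions and with $\bar{ε} = 1$, finite values of $f_{σ}$ occur only for $x^{2} < 3$, so $f_{σ} \geq - 3^{7 / 2}$.

```python
import math

# Assumed parameters: gamma > alpha >= 2*beta >= 2, q > 0, w = 1.
ALPHA, BETA, GAMMA, Q = 2.0, 1.0, 3.0, 1.0


def quadratic_penalty(x, eps):
    """Traditional quadratic penalty P(x) = x^7 + (x^2 - 1)^2 / eps."""
    return x**7 + (x**2 - 1.0) ** 2 / eps


def f_sigma(x, eps, sigma):
    """New penalty function for min x^7 s.t. x^2 - 1 = 0, with w = 1."""
    r = 1.0 + eps**GAMMA - x**2
    if eps == 0.0 and r == 0.0:
        return x**7
    if eps > 0.0 and abs(r) < Q ** -0.5:
        return x**7 + r**2 / (eps**ALPHA * (1.0 - Q * r**2)) + sigma * eps**BETA
    return math.inf


# Far from the feasible set, P decreases without bound while f_sigma is +inf.
for s in (10.0, 100.0):
    print(s, quadratic_penalty(-s, 0.1), f_sigma(-s, 0.1, sigma=10.0))
```

The loop prints increasingly negative values for the quadratic penalty and `inf` for the new penalty function, which is the behaviour described in the text.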

3 Exactness of the penalty function

In this section, we show that our penalty function is exact in the following sense: under certain conditions, every local minimizer of our penalty function has the form $(x^{*}, ε_{*})$ with $ε_{*} = 0$, where $x^{*}$ is a local minimizer of the original problem, and a converse result holds.

Firstly, recall the Mangasarian-Fromovitz condition. We say that the Mangasarian-Fromovitz condition (see [12]) for Problem (P) holds at $x \in [u, v]$ if $F^{'} (x)$ has full rank and there is a vector $p \in ℜ^{n}$ with $F^{'} (x) p = 0$ and

p_{i} \begin{cases} > 0 & \text{if } x_{i} = u_{i}, \\ < 0 & \text{if } x_{i} = v_{i} . \end{cases}

(3.1)

Theorem 3.1 Assume that the set $D^{'}$ defined in Section 2 is bounded, and the Mangasarian-Fromovitz condition holds at each $x^{'} \in D^{'}$ . Let $γ > α \geq 2 β \geq 2$ in ( $P_{σ}$ ). If σ is sufficiently large, then there is no Kuhn-Tucker point $(x, ε)$ of ( $P_{σ}$ ) with $ε > 0$ .

Proof Let the Lagrangian function of ( $P_{σ}$ ) for $σ > 0$ be

L (x, ε, y, z) = f_{σ} (x, ε) + \sum_{i = 1}^{n} y_{i} (u_{i} - x_{i}) + \sum_{i = 1}^{n} z_{i} (x_{i} - v_{i}) + y_{n + 1} (- ε) + z_{n + 1} (ε - \bar{ε}),

where $y_{i}, z_{i} \in ℜ$ , $i = 1, \dots, n + 1$ are the Lagrangian multipliers. If $(x, ε)$ is a Kuhn-Tucker point of ( $P_{σ}$ ) with $ε > 0$ , then there exist vectors $y, z \in ℜ^{n + 1}$ such that

\begin{matrix} \nabla_{(x, ε)} L (x, ε, y, z) = 0, \\ u_{i} - x_{i} \leq 0, x_{i} - v_{i} \leq 0, i = 1, \dots, n, \\ - ε < 0, ε - \bar{ε} \leq 0, \\ y_{i} (u_{i} - x_{i}) = 0, z_{i} (v_{i} - x_{i}) = 0, z_{n + 1} (ε - \bar{ε}) = 0, y_{n + 1} ε = 0, \end{matrix}

then

\nabla_{(x, ε)} f_{σ} (x, ε) = y - z,

and

\begin{matrix} \inf (y_{i}, x_{i} - u_{i}) = \inf (z_{i}, v_{i} - x_{i}) = 0, \quad i = 1, \dots, n, \\ y_{n + 1} = \inf (z_{n + 1}, \bar{ε} - ε) = 0, \end{matrix}

where $\nabla_{(x, ε)} f_{σ} (x, ε)$ is the gradient of $f_{σ}$ with respect to $(x, ε)$. We prove the assertion of the theorem by contradiction.

Assume that there exists a sequence ${(x^{k}, ε_{k}, σ_{k})}$ with $ε_{k} > 0$ for all k, $σ_{k} \to + \infty$ as $k \to \infty$ , where $(x^{k}, ε_{k})$ is a Kuhn-Tucker point of ( $P_{σ_{k}}$ ). We use the abbreviation $△_{k} : = △ (x^{k}, ε_{k})$ . The point $x^{k}$ satisfies

∥ F (x^{k}) ∥ \leq △_{k}^{1 / 2} + ε_{k}^{γ} ∥ w ∥ \leq q^{- 1 / 2} + {\bar{ε}}^{γ} ∥ w ∥,

hence, $x^{k} \in D^{'} = \{ x \in [u, v] \mid ∥ F (x) ∥ \leq q^{- 1 / 2} + {\bar{ε}}^{γ} ∥ w ∥ \}$. Since $D^{'}$ is closed and bounded, we may pass to a subsequence if necessary and assume that

\lim_{k \to \infty} ε_{k} = ε_{*} \in [0, \bar{ε}] \quad \text{and} \quad \lim_{k \to \infty} x^{k} = x^{*} \in D^{'} .

The condition $\frac{\partial}{\partial ε} f_{σ} (x^{k}, ε_{k}) \leq 0$ yields

\begin{aligned} α q △_{k}^{2} + (2 γ - α) ε_{k}^{2 γ} {∥ w ∥}^{2} + 2 (α - γ) ε_{k}^{γ} \sum_{j = 1}^{m} F_{j} (x^{k}) w_{j} \\ + β ε_{k}^{α + β} {(1 - q △_{k})}^{2} σ_{k} \leq α \sum_{j = 1}^{m} F_{j}^{2} (x^{k}) \end{aligned}

(3.2)

with equality in the case $ε_{k} \neq \bar{ε}$. Since $σ_{k} \to \infty$ while the right-hand side $α \sum_{j = 1}^{m} F_{j}^{2} (x^{k})$ of (3.2) and the remaining terms on the left-hand side stay bounded on $D^{'}$, the factor $ε_{k}^{α + β} {(1 - q △_{k})}^{2}$ must tend to zero, so that

\lim_{k \to \infty} ε_{k} = ε_{*} = 0 \quad \text{or} \quad \lim_{k \to \infty} △ (x^{k}, ε_{k}) = △^{*} = q^{- 1},

(3.3)

where $△^{*} = △ (x^{*}, ε_{*})$. On the other hand, the derivative of the penalty function $f_{σ}$ with respect to $x_{i}$ is given by

\begin{array}{rcl} \frac{\partial}{\partial x_{i}} f_{σ} (x^{k}, ε_{k}) & = & \frac{\partial}{\partial x_{i}} f (x^{k}) + \frac{1}{ε_{k}^{α}} \frac{\frac{\partial}{\partial x_{i}} △ (x^{k}, ε_{k})}{1 - q △ (x^{k}, ε_{k})} + \frac{1}{ε_{k}^{α}} \frac{q △ (x^{k}, ε_{k}) \frac{\partial}{\partial x_{i}} △ (x^{k}, ε_{k})}{{(1 - q △ (x^{k}, ε_{k}))}^{2}} \\ & = & \frac{\partial}{\partial x_{i}} f (x^{k}) + \frac{1}{ε_{k}^{α}} \frac{\frac{\partial}{\partial x_{i}} △ (x^{k}, ε_{k})}{{(1 - q △ (x^{k}, ε_{k}))}^{2}} \begin{cases} \geq 0 & \text{if } x_{i}^{k} = u_{i}, \\ = 0 & \text{if } u_{i} < x_{i}^{k} < v_{i}, \\ \leq 0 & \text{if } x_{i}^{k} = v_{i} \end{cases} \end{array}

(3.4)

(3.4) is equivalent to

\begin{aligned} & ε_{k}^{α} {(1 - q △ (x^{k}, ε_{k}))}^{2} \frac{\partial}{\partial x_{i}} f (x^{k}) + \frac{\partial}{\partial x_{i}} △ (x^{k}, ε_{k}) \\ & \quad = ε_{k}^{α} {(1 - q △ (x^{k}, ε_{k}))}^{2} \frac{\partial}{\partial x_{i}} f (x^{k}) + 2 {(F^{'} {(x^{k})}^{T} (F (x^{k}) - ε_{k}^{γ} w))}_{i} \begin{cases} \geq 0 & \text{if } x_{i}^{k} = u_{i}, \\ = 0 & \text{if } u_{i} < x_{i}^{k} < v_{i}, \\ \leq 0 & \text{if } x_{i}^{k} = v_{i} . \end{cases} \end{aligned}

(3.5)
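The simplification in (3.4) and the gradient expression in (3.5) can be sanity-checked with finite differences. The sketch below does this for the one-dimensional constraint $F (x) = x^{2} - 1$ from the example in Section 2, with assumed values α = 2, γ = 3, q = 1, w = 1; the function names and the test point are chosen only for illustration.

```python
# Assumed parameters for the check (not prescribed by the paper).
ALPHA, GAMMA, Q, W = 2.0, 3.0, 1.0, 1.0


def delta(x, eps):
    """Violation measure for F(x) = x^2 - 1."""
    return (x**2 - 1.0 - eps**GAMMA * W) ** 2


def penalty_term(x, eps):
    """The term delta / (eps^alpha * (1 - q * delta)) of the penalty function."""
    d = delta(x, eps)
    return d / (eps**ALPHA * (1.0 - Q * d))


def penalty_term_dx(x, eps):
    """Closed form from (3.4) and (3.5): delta_x / (eps^alpha * (1 - q*delta)^2)."""
    d = delta(x, eps)
    # delta_x = 2 * F'(x) * (F(x) - eps^gamma * w), with F'(x) = 2x.
    d_dx = 2.0 * (2.0 * x) * (x**2 - 1.0 - eps**GAMMA * W)
    return d_dx / (eps**ALPHA * (1.0 - Q * d) ** 2)


def central_difference(g, x, h=1e-6):
    """Second-order finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


x0, eps0 = 1.2, 0.5
numeric = central_difference(lambda t: penalty_term(t, eps0), x0)
print(numeric, penalty_term_dx(x0, eps0))
```

The two printed numbers should agree up to the finite-difference error.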

Letting $k \to + \infty$ in (3.5), the first term vanishes because $ε_{*} = 0$ or $△^{*} = q^{- 1}$, and we obtain

{(F^{'} {(x^{*})}^{T} (F (x^{*}) - ε_{*}^{γ} w))}_{i} \begin{cases} \geq 0 & \text{if } x_{i}^{*} = u_{i}, \\ = 0 & \text{if } u_{i} < x_{i}^{*} < v_{i}, \\ \leq 0 & \text{if } x_{i}^{*} = v_{i} . \end{cases}

(3.6)

Because $x^{*} \in D^{'}$, the Mangasarian-Fromovitz condition holds at $x^{*}$, so there exists a vector $p \in ℜ^{n}$ satisfying (3.1) with $F^{'} (x^{*}) p = 0$. Let $I_{1} := \{ i \mid x_{i}^{*} = u_{i} \}$, $I_{2} := \{ i \mid x_{i}^{*} = v_{i} \}$ and $\bar{△^{*}} := F (x^{*}) - ε_{*}^{γ} w$. Then by (3.6), we have

0 = {(F^{'} (x^{*}) p)}^{T} \bar{△^{*}} = \sum_{i \in I_{1}} p_{i} {(F^{'} {(x^{*})}^{T} \bar{△^{*}})}_{i} + \sum_{i \in I_{2}} p_{i} {(F^{'} {(x^{*})}^{T} \bar{△^{*}})}_{i} .

(3.7)

By (3.1) and (3.6), every summand in (3.7) is nonnegative, so (3.7) implies ${(F^{'} {(x^{*})}^{T} \bar{△^{*}})}_{i} = 0$ for $i \in I_{1} \cup I_{2}$. Together with (3.6) for the remaining indices, this gives $F^{'} {(x^{*})}^{T} \bar{△^{*}} = 0$. Now the fact that $F^{'} (x^{*})$ has full rank yields $\bar{△^{*}} = 0$, i.e.,

F (x^{*}) - ε_{*}^{γ} w = 0,

and since $\bar{△^{*}} = \lim_{k \to \infty} (F (x^{k}) - ε_{k}^{γ} w) = 0$, it follows that

\lim_{k \to \infty} △ (x^{k}, ε_{k}) = \lim_{k \to \infty} {∥ F (x^{k}) - ε_{k}^{γ} w ∥}^{2} = △^{*} = 0 .

Since $△^{*} = 0 \neq q^{- 1}$, (3.3) gives

\lim_{k \to \infty} ε_{k} = ε_{*} = 0 .

Thus, $\lim_{k \to \infty} F (x^{k}) = F (x^{*}) = 0$.
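Inequality (3.2) is obtained by multiplying $\frac{\partial}{\partial ε} f_{σ} \leq 0$ by the positive factor $ε^{α + 1} {(1 - q △)}^{2}$. As a sanity check of that algebra, the following sketch compares a finite-difference value of the scaled derivative with the difference between the left-hand and right-hand sides of (3.2). The function names, the values of $F (x)$, w, and all parameters are assumptions chosen only so that $△ < q^{- 1}$ holds.

```python
def delta(Fx, w, eps, gamma):
    """Violation measure ||F(x) - eps^gamma * w||^2 for a given value Fx = F(x)."""
    return sum((Fj - eps**gamma * wj) ** 2 for Fj, wj in zip(Fx, w))


def penalty_terms(Fx, w, eps, sigma, alpha, beta, gamma, q):
    """The eps-dependent part of f_sigma: delta/(eps^alpha (1 - q delta)) + sigma eps^beta."""
    d = delta(Fx, w, eps, gamma)
    return d / (eps**alpha * (1.0 - q * d)) + sigma * eps**beta


def residual_32(Fx, w, eps, sigma, alpha, beta, gamma, q):
    """Left-hand side minus right-hand side of (3.2)."""
    d = delta(Fx, w, eps, gamma)
    w2 = sum(wj * wj for wj in w)
    Fw = sum(Fj * wj for Fj, wj in zip(Fx, w))
    F2 = sum(Fj * Fj for Fj in Fx)
    lhs = (alpha * q * d**2
           + (2.0 * gamma - alpha) * eps ** (2.0 * gamma) * w2
           + 2.0 * (alpha - gamma) * eps**gamma * Fw
           + beta * eps ** (alpha + beta) * (1.0 - q * d) ** 2 * sigma)
    return lhs - alpha * F2


def scaled_eps_derivative(Fx, w, eps, sigma, alpha, beta, gamma, q, h=1e-6):
    """eps^(alpha+1) (1 - q delta)^2 times a central-difference d f_sigma / d eps."""
    up = penalty_terms(Fx, w, eps + h, sigma, alpha, beta, gamma, q)
    down = penalty_terms(Fx, w, eps - h, sigma, alpha, beta, gamma, q)
    d = delta(Fx, w, eps, gamma)
    return eps ** (alpha + 1.0) * (1.0 - q * d) ** 2 * (up - down) / (2.0 * h)


params = dict(w=[1.0, -0.5], sigma=7.0, alpha=2.0, beta=1.0, gamma=3.0, q=1.0)
Fx = [0.14, 0.55]  # hypothetical value of F at some point x
print(scaled_eps_derivative(Fx, eps=0.6, **params), residual_32(Fx, eps=0.6, **params))
```

Both printed values should coincide up to the finite-difference error, so the sign condition on the derivative is equivalent to (3.2) at such a point.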

Furthermore, by (3.2), it holds that

\begin{aligned} \frac{q}{ε_{k}^{α + β}} △^{2} (x^{k}, ε_{k}) + \frac{2 \frac{γ}{α} - 1}{ε_{k}^{α + β - 2 γ}} {∥ w ∥}^{2} + 2 (1 - \frac{γ}{α}) ε_{k}^{γ - (α + β)} \sum_{i = 1}^{m} F_{i} (x^{k}) w_{i} \\ + \frac{β}{α} {(1 - q △ (x^{k}, ε_{k}))}^{2} σ_{k} \leq \frac{\sum_{i = 1}^{m} F_{i}^{2} (x^{k})}{ε_{k}^{α + β}} . \end{aligned}

Let $k \to \infty$; the last term on the left-hand side tends to $+ \infty$. Thus, the vectors $y^{k} = \frac{F (x^{k})}{ε_{k}^{\frac{α + β}{2}}}$ satisfy $∥ y^{k} ∥ \to + \infty$. The vectors $z^{k} = \frac{y^{k}}{∥ y^{k} ∥}$ have norm 1, and (3.5) implies that the numbers $μ_{i}^{k}$ ($i = 1, \dots, n$), defined by

\begin{array}{rcl} μ_{i}^{k} & = & \frac{1}{∥ y^{k} ∥} \frac{\partial}{\partial x_{i}} f (x^{k}) + \frac{1}{∥ y^{k} ∥} \frac{2}{ε_{k}^{α} {(1 - q △ (x^{k}, ε_{k}))}^{2}} {(F^{'} {(x^{k})}^{T} (F (x^{k}) - ε_{k}^{γ} w))}_{i} \\ & = & \frac{1}{∥ y^{k} ∥} \frac{\partial}{\partial x_{i}} f (x^{k}) + \frac{2}{ε_{k}^{\frac{α - β}{2}} {(1 - q △ (x^{k}, ε_{k}))}^{2}} {(F^{'} {(x^{k})}^{T} (z^{k} - \frac{ε_{k}^{γ - \frac{α + β}{2}} w}{∥ y^{k} ∥}))}_{i}, \end{array}

satisfy

μ_{i}^{k} \begin{cases} \geq 0 & \text{if } x_{i}^{k} = u_{i}, \\ = 0 & \text{if } u_{i} < x_{i}^{k} < v_{i}, \\ \leq 0 & \text{if } x_{i}^{k} = v_{i} . \end{cases}

If we multiply $μ_{i}^{k}$ by $\frac{1}{2} ε_{k}^{\frac{α - β}{2}} {(1 - q △ (x^{k}, ε_{k}))}^{2} > 0$, pick a convergent subsequence $z^{n_{k}}$ with limit $z^{*}$, and pass to the limit, we obtain

{(F^{'} {(x^{*})}^{T} z^{*})}_{i} \begin{cases} \geq 0 & \text{if } x_{i}^{*} = u_{i}, \\ = 0 & \text{if } u_{i} < x_{i}^{*} < v_{i}, \\ \leq 0 & \text{if } x_{i}^{*} = v_{i} . \end{cases}

Arguing as above, this yields $z^{*} = 0$, which contradicts $∥ z^{*} ∥ = 1$. Thus such a sequence ${(x^{k}, ε_{k}, σ_{k})}$ cannot exist, and for sufficiently large $σ > 0$, all Kuhn-Tucker points of ( $P_{σ}$ ) are of the form $(x, 0)$. □

Theorem 3.2 Assume that $(x^{*}, ε_{*})$ is a local minimizer of ( $P_{σ}$ ) with finite $f_{σ} (x^{*}, ε_{*})$, where $σ > 0$ is sufficiently large. If the hypotheses of Theorem 3.1 are fulfilled, then $x^{*}$ is a local minimizer of (P).

Proof Let $(x^{*}, ε_{*})$ be a local minimizer of ( $P_{σ}$ ) with finite $f_{σ} (x^{*}, ε_{*})$, where $σ > 0$ is sufficiently large. If $ε_{*} > 0$, then $(x^{*}, ε_{*})$ must be a Kuhn-Tucker point of ( $P_{σ}$ ), which contradicts Theorem 3.1. Therefore, $ε_{*} = 0$, and since $f_{σ} (x^{*}, ε_{*})$ is finite, $△ (x^{*}, ε_{*}) = 0$. This implies that $F (x^{*}) = 0$, and by (2.5) there is a neighborhood $N (x^{*})$ of $x^{*}$ in which $f (x) \geq f (x^{*})$ for all feasible x. Therefore, $x^{*}$ is a local minimizer of (P). □

We now show a converse result of Theorem 3.2, which will use the following lemmas.

Lemma 3.1 Suppose $F (x^{*}) = 0$, and $F^{'} (x^{*})$ has full rank. Then there exist a neighborhood $N_{0} (x^{*})$ of $x^{*}$ and a constant $κ_{0} > 0$ such that for each $x \in N_{0} (x^{*})$ and each subset J of $\{ 1, 2, \dots, m \}$, there exists a vector $y = y (x) \in N_{0} (x^{*})$ with $F_{i} (y) = 0$ for $i \in J$ and $F_{i} (y) = F_{i} (x)$ for $i \in K = \{ 1, 2, \dots, m \} ∖ J$, such that

∥ x - y ∥ \leq κ_{0} ∥ F_{J} (x) ∥ .

Proof Since $F (x^{*}) = 0$ and $F^{'} (x^{*})$ has full rank, there exists a matrix $B \in ℜ^{(n - m) \times n}$ such that the augmented matrix $(\begin{array}{c} F^{'} (x^{*}) \\ B \end{array})$ is nonsingular. By the continuity of $F^{'} (\cdot)$ at $x^{*}$, there exists a neighborhood $N_{1} (x^{*}) \subset D$ of $x^{*}$ such that $(\begin{array}{c} F^{'} (x) \\ B \end{array})$ is nonsingular for any $x \in N_{1} (x^{*})$. Let $\mathcal{A}$ be the closed convex hull of $\{ F^{'} (x) \mid x \in N_{1} (x^{*}) \}$; then for all $A \in \mathcal{A}$, the matrix $(\begin{array}{c} A \\ B \end{array})$ is nonsingular. We now show that for any $x, y \in N_{1} (x^{*})$, there exists a matrix $A \in \mathcal{A}$ such that

F (x) - F (y) = A (x - y) .

(3.8)

In fact, given $x, y \in N_{1} (x^{*})$ , it follows from the mean value theorem that

\begin{aligned} F (x) - F (y) & = \int_{0}^{1} F^{'} (y + s (x - y)) (x - y) d s \\ & = A_{x, y} (x - y), \end{aligned}

where $A_{x, y} = \int_{0}^{1} F^{'} (y + s (x - y)) d s \in \mathcal{A}$, so (3.8) holds. Define the mapping $H (z) := (\begin{array}{c} F (z) \\ B (z - x^{*}) \end{array})$ for $z \in N_{1} (x^{*})$. By the proof of [10, Theorem 4.5], there exists a neighborhood $N_{0} (x^{*}) \subset N_{1} (x^{*})$ of $x^{*}$ such that for each $x \in N_{0} (x^{*})$ and each subset J of $\{ 1, 2, \dots, m \}$, there exists a vector $y = y (x) \in N_{0} (x^{*})$ with

H (y) = (\begin{array}{c} F (y) \\ B (y - x^{*}) \end{array}) = (\begin{array}{c} 0 \\ F_{K} (x) \\ B (x - x^{*}) \end{array}),

so $F_{i} (y) = 0$ , for $i \in J$ and $F_{i} (y) = F_{i} (x)$ , $i \in K$ .

For $x, y \in N_{0} (x^{*})$ , we have

H (x) - H (y) = (\begin{array}{c} A \\ B \end{array}) (x - y),

for some $A \in \mathcal{A}$. On the other hand, we have

∥ H (x) - H (y) ∥ = ∥ F_{J} (x) ∥ .

(3.9)

Therefore, combining (3.8) with (3.9), we have

∥ x - y ∥ \leq ∥ {(\begin{array}{c} A \\ B \end{array})}^{- 1} ∥ ∥ F_{J} (x) ∥ \leq κ_{0} ∥ F_{J} (x) ∥,

where

κ_{0} := \sup_{A \in \mathcal{A}} ∥ {(\begin{array}{c} A \\ B \end{array})}^{- 1} ∥ < + \infty,

and this completes the proof. □
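The error bound of Lemma 3.1 can be made concrete on the constraint $F (x) = x^{2} - 1$ from the example in Section 2. This is a degenerate illustration with $n = m = 1$ (so the matrix B is empty), $x^{*} = 1$, $J = \{ 1 \}$ and $y = 1$. The neighborhood $N_{0} = [0.9, 1.1]$ and the constant $κ_{0} = 1 / 1.9$ are assumptions derived here from $| x - 1 | = | F (x) | / | x + 1 |$, not values given in the paper.

```python
def F(x):
    """Constraint function of the example in Section 2."""
    return x * x - 1.0


# Assumed constant: 1/|x + 1| <= 1/1.9 for all x in N0 = [0.9, 1.1].
KAPPA0 = 1.0 / 1.9


def bound_holds(x, tol=1e-12):
    """Check |x - y| <= kappa0 * |F(x)| with y = 1, the zero of F in N0."""
    return abs(x - 1.0) <= KAPPA0 * abs(F(x)) + tol


samples = [0.9 + 0.002 * i for i in range(101)]
print(all(bound_holds(x) for x in samples))
```

The small tolerance `tol` only absorbs rounding at the left endpoint, where the bound holds with equality.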

Lemma 3.2 Assume that $x^{*}$ is a local minimizer of problem (2.1) with $u_{i} \leq x_{i}^{*} \leq v_{i}$, $i \in \{ 1, \dots, p \}$, where $m \leq p \leq n$. Suppose that $F^{'} (x^{*})$ has full rank. Then there exists a constant $κ_{1} > 0$ such that

f (x) \geq f (x^{*}) - κ_{1} ∥ F (x) ∥ \quad \text{for all } x \in N_{0} (x^{*}),

where $N_{0} (x^{*})$ is defined in Lemma 3.1.

Proof Let $N_{0} (x^{*})$ and $κ_{0}$ be as in Lemma 3.1, and let $x \in N_{0} (x^{*})$. By Lemma 3.1 with $J = \{ 1, \dots, m \}$, there exists a $y = y (x)$ with $F (y) = 0$ and $y_{i} = x_{i}^{*}$, $i = m + 1, \dots, n$, such that

∥ x - y ∥ \leq κ_{0} ∥ F (x) ∥ .

(3.10)

So y is a feasible point of problem (2.1), and $f (y) \geq f (x^{*})$ . Since f is continuously differentiable, for any $x, y \in N_{0} (x^{*})$ , there exists a vector $ξ \in ℜ^{n}$ such that

f (x) - f (y) = \nabla f {(ξ)}^{T} (x - y),

where ξ lies on the segment between x and y. Setting $L := \sup_{z \in N_{0} (x^{*})} ∥ \nabla f (z) ∥$, we have

\begin{aligned} | f (x) - f (y) | & \leq ∥ \nabla f (ξ) ∥ ∥ x - y ∥ \\ & \leq L ∥ x - y ∥ \\ & \leq L κ_{0} ∥ F (x) ∥, \end{aligned}

where the last inequality follows from (3.10).

Let $κ_{1} = L κ_{0}$ , then

f (x) = f (x) - f (y) + f (y) \geq f (x^{*}) - κ_{1} ∥ F (x) ∥,

which completes the proof. □

Theorem 3.3 If $x^{*}$ is a local minimizer of problem (P) with $u_{i} \leq x_{i}^{*} \leq v_{i}$, $i \in \{ 1, \dots, p \}$, where $m \leq p \leq n$, and $F^{'} (x^{*})$ has full rank, then for sufficiently large $σ > 0$, there are a neighborhood $N (x^{*})$ of $x^{*}$ and an $ε^{'} \in (0, \bar{ε}]$ such that

f_{σ} (x, ε) > f_{σ} (x^{*}, 0) = f (x^{*}) \quad \text{for all } (x, ε) \in N (x^{*}) \times (0, ε^{'}] .

In particular, $(x^{*}, 0)$ is a local minimizer of $f_{σ}$ .

Proof Let $N (x^{*}) \subset N_{0} (x^{*})$ be a neighborhood of $x^{*}$ such that

\sup_{x \in N (x^{*})} (f (x^{*}) - f (x)) < 1,

(3.11)

where $N_{0} (x^{*})$ is defined in Lemma 3.1.

For $(x, ε) \in N (x^{*}) \times (0, ε^{'}]$ , where $ε^{'} \in (0, \bar{ε}]$ and $ε^{'} \leq 1$ , we distinguish two cases.

Case 1. $△ (x, ε) = {∥ F (x) - ε^{γ} w ∥}^{2} \geq ε^{α}$ . For this case, we have

\begin{aligned} f_{σ} (x, ε) & \geq f (x) + 1 + σ ε^{β} \\ & \geq f (x^{*}) + σ ε^{β} \\ & > f (x^{*}) = f_{σ} (x^{*}, 0), \end{aligned}

where the second inequality is by (3.11).

Case 2. $△ (x, ε) < ε^{α}$. Then $∥ F (x) ∥ \leq △^{\frac{1}{2}} + ε^{γ} ∥ w ∥ \leq ε^{\frac{α}{2}} + ε^{γ} ∥ w ∥$, and

\begin{aligned} f_{σ} (x, ε) & \geq f (x) + σ ε^{β} \\ & \geq f (x^{*}) - κ_{1} ∥ F (x) ∥ + σ ε^{β} \\ & \geq f (x^{*}) - κ_{1} (ε^{\frac{α}{2}} + ε^{γ} ∥ w ∥) + σ ε^{β} \\ & \geq f (x^{*}) + (σ - κ_{1} (1 + ε^{γ - \frac{α}{2}} ∥ w ∥)) ε^{\frac{α}{2}} . \end{aligned}

(3.12)

The second inequality follows from Lemma 3.2, and the last inequality holds since $β \leq \frac{α}{2}$ and $ε \leq 1$.

Since $ε \leq 1$ and $γ > \frac{α}{2}$, it suffices to take $σ > κ_{1} (1 + ∥ w ∥)$; then $σ > κ_{1} (1 + ε^{γ - \frac{α}{2}} ∥ w ∥)$, and because $ε > 0$ we get

f_{σ} (x, ε) > f (x^{*}) = f_{σ} (x^{*}, 0) .

From Case 1 and Case 2, we obtain the conclusion. □
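The statement of Theorem 3.3 can be illustrated on the example of Section 2, where $x^{*} = - 1$ minimizes $x^{7}$ subject to $x^{2} - 1 = 0$. The sketch below samples $f_{σ} (x, ε) - f (x^{*})$ on a grid over $[x^{*} - 0.1, x^{*} + 0.1] \times (0, 0.5]$. The parameters α = 2, β = 1, γ = 3, q = 1, w = 1, the grid, and the values of σ are assumptions; a grid check of this kind is only an illustration and not a proof.

```python
import math

# Assumed parameters: gamma > alpha >= 2*beta >= 2, q > 0, w = 1.
ALPHA, BETA, GAMMA, Q = 2.0, 1.0, 3.0, 1.0


def f_sigma(x, eps, sigma):
    """Penalty function for min x^7 s.t. x^2 - 1 = 0, with w = 1."""
    r = 1.0 + eps**GAMMA - x**2
    if eps == 0.0 and r == 0.0:
        return x**7
    if eps > 0.0 and abs(r) < Q ** -0.5:
        return x**7 + r**2 / (eps**ALPHA * (1.0 - Q * r**2)) + sigma * eps**BETA
    return math.inf


def min_gap(x_star, sigma, radius=0.1, eps_max=0.5, n=200):
    """Smallest sampled value of f_sigma(x, eps) - f(x_star) over the grid, eps > 0."""
    best = math.inf
    for i in range(n + 1):
        x = x_star - radius + 2.0 * radius * i / n
        for j in range(1, n + 1):
            eps = eps_max * j / n
            best = min(best, f_sigma(x, eps, sigma) - x_star**7)
    return best


print(min_gap(-1.0, sigma=50.0))  # positive on this grid
print(min_gap(-1.0, sigma=0.1))   # negative on this grid: sigma is too small here
```

With the larger σ every sampled gap is positive, in line with the theorem, while the smaller σ is not large enough for this particular neighborhood.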

4 Concluding remarks

In this paper, a modified exact penalty function for equality-constrained nonlinear programming problems is constructed by introducing a new variable that controls the constraint violation. This function enjoys improved smoothness, and under mild conditions it is proved to be an exact penalty function.

Since many applied problems are nonsmooth in practice, it is worthwhile to extend the results of this paper to the nonsmooth case. By using the limiting subgradients presented in the two books by Mordukhovich [14, 15], as well as Clarke’s generalized gradients [5], we can extend the penalty function with the above properties to nonsmooth optimization problems, as has been done in [17–19]. This will be our future research direction.

References

[1] Antczak T: Exact penalty functions method for mathematical programming problems involving invex functions. Eur. J. Oper. Res. 2009, 198: 29–36. doi:10.1016/j.ejor.2008.07.031
[2] Bazaraa MS, Sherali HD, Shetty CM: Nonlinear Optimization Theory and Algorithms. 2nd edition. Wiley, New York; 1993.
[3] Bertsekas DP: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York; 1982.
[4] Boukari D, Fiacco AV: Survey of penalty, exact-penalty and multiplier methods from 1968 to 1993. Optimization 1995, 32: 301–334. doi:10.1080/02331939508844053
[5] Clarke FH: Optimization and Nonsmooth Analysis. Wiley-Interscience, New York; 1983.
[6] Di Pillo G: Exact penalty methods. In Algorithms for Continuous Optimization. Edited by: Spedicato E. Kluwer Academic, Dordrecht; 1994:209–253.
[7] Fletcher R: Practical Methods of Optimization. 2nd edition. Wiley, New York; 1987.
[8] Han SP, Mangasarian OL: Exact penalty functions in nonlinear programming. Math. Program. 1979, 17: 251–269. doi:10.1007/BF01588250
[9] Hoheisel T, Kanzow C, Outrata J: Exact penalty results for mathematical programs with vanishing constraints. Nonlinear Anal. 2010, 72: 2514–2526. doi:10.1016/j.na.2009.10.047
[10] Huyer W, Neumaier A: A new exact penalty function. SIAM J. Optim. 2003, 13: 1141–1158. doi:10.1137/S1052623401390537
[11] Li W, Peng J: Exact penalty functions for constrained minimization problems via regularized gap function for variational inequality. J. Glob. Optim. 2007, 37: 85–94.
[12] Mangasarian OL, Fromovitz S: The Fritz John necessary optimality conditions in the presence of equality and inequality constraints. J. Math. Anal. Appl. 1967, 17: 37–47. doi:10.1016/0022-247X(67)90163-1
[13] Maratos N: Exact penalty function algorithms for finite dimensional and control optimization problems. PhD thesis, University of London; 1978.
[14] Mordukhovich BS: Variational Analysis and Generalized Differentiation, I: Basic Theory. Grundlehren Series (Fundamental Principles of Mathematical Sciences) 330. Springer, Berlin; 2006.
[15] Mordukhovich BS: Variational Analysis and Generalized Differentiation, II: Applications. Grundlehren Series (Fundamental Principles of Mathematical Sciences) 331. Springer, Berlin; 2006.
[16] Nocedal J, Wright SJ: Numerical Optimization. Springer, New York; 1999.
[17] Soleimani-damaneh M: The gap function for optimization problems in Banach spaces. Nonlinear Anal. 2008, 69: 716–723. doi:10.1016/j.na.2007.06.008
[18] Soleimani-damaneh M: Penalization for variational inequalities. Appl. Math. Lett. 2009, 22: 347–350. doi:10.1016/j.aml.2008.03.029
[19] Soleimani-damaneh M: Nonsmooth optimization using Mordukhovich’s subdifferential. SIAM J. Control Optim. 2010, 48: 3403–3432. doi:10.1137/070710664
[20] Zangwill WI: Nonlinear programming via penalty function. Manag. Sci. 1967, 13: 344–358. doi:10.1287/mnsc.13.5.344
[21] Zaslavski AJ: A sufficient condition for exact penalty in constrained optimization. SIAM J. Optim. 2005, 16: 250–262. doi:10.1137/040612294
[22] Zaslavski AJ: Stability of exact penalty for nonconvex inequality-constrained minimization problems. Taiwan. J. Math. 2010, 14: 1–19.

Acknowledgements

The authors wish to thank the anonymous referees for their endeavors and valuable comments. The authors would also like to thank Professor Zhang Liansheng for some very helpful comments on a preliminary version of this paper. This research was supported by the National Natural Science Foundation of China under Grants 10971118, 11101248, Natural Science Foundation of Shandong Province under Grants ZR2012AM016, and the foundation 4041-409012 of Shandong University of Technology.

Author information

Authors and Affiliations

School of Science, Shandong University of Technology, Zibo, Shandong, 255049, P.R. China
Liu Bingzhuang & Zhao Wenling

Corresponding author

Correspondence to Liu Bingzhuang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Bingzhuang, L., Wenling, Z. A modified exact smooth penalty function for nonlinear constrained optimization. J Inequal Appl 2012, 173 (2012). https://doi.org/10.1186/1029-242X-2012-173

Received: 13 February 2012
Accepted: 26 July 2012
Published: 06 August 2012
DOI: https://doi.org/10.1186/1029-242X-2012-173
