Second-order optimality conditions for nonlinear programs and mathematical programs
- Ikram Daidai
https://doi.org/10.1186/s13660-017-1487-8
© The Author(s) 2017
Received: 2 June 2017
Accepted: 25 August 2017
Published: 8 September 2017
Abstract
It is well known that second-order information is a basic tool, notably in optimality conditions and numerical algorithms. In this work, we generalize optimality conditions to strongly convex functions of order γ with the help of first- and second-order approximations derived from (Optimization 40(3):229-246, 1997), and we study their characterization. Further, we give an example of such a function that arises quite naturally in nonlinear analysis and optimization. An extension of Newton’s method is also given and proved to solve the Euler equation with second-order approximation data.
1 Introduction
The concept of approximations of mappings was introduced by Thibault [2]. Sweetser [3] considered approximations by subsets of the space of continuous linear maps \(L(X,Y)\), where X and Y are Banach spaces, and Ioffe [4] by the so-called fans. This approach was revised by Jourani and Thibault [5]. Another approach is due to Allali and Amahroq [1]. Following the same ideas, Amahroq and Gadhi [6, 7] established optimality conditions for some optimization problems under set-valued mapping constraints.
The rest of the paper is organized as follows. Section 2 contains basic definitions and preliminary results. Section 3 is devoted to the main results. In Section 4, we present an extension of Newton’s method and prove its local convergence.
2 Preliminaries
Let X and Y be two Banach spaces. We denote by \(\mathcal{L}(X,Y)\) the set of all continuous linear mappings from X into Y, by \(\mathcal{B}(X\times X,Y)\) the set of all continuous bilinear mappings from \(X\times X\) into Y, and by \(\mathbb{B}_{Y}\) the closed unit ball of Y centered at the origin.
Throughout this paper, \(X^{*}\) and \(Y^{*}\) denote the continuous duals of X and Y, respectively, and we write \(\langle\cdot,\cdot\rangle\) for the canonical bilinear forms with respect to the dualities \(\langle X^{*},X\rangle\) and \(\langle Y^{*},Y\rangle\).
Definition 1
[1]
Remark 1
If \(\mathcal{A}_{f}(\bar{x})\) is norm-bounded (resp. compact), then it is called a bounded (resp. compact) first-order approximation. Recall that \(\mathcal{A}_{f}(\bar{x})\) is a singleton if and only if f is Fréchet differentiable at x̄.
The following proposition, proved by Allali and Amahroq [1], plays an important role in the sequel in the finite-dimensional setting.
Proposition 1
[1]
In [6], it is also shown that when f is a continuous function, it admits as an approximation the symmetric subdifferential defined and studied in [16].
The next proposition shows that Proposition 1 holds also when f is a vector-valued function. Let us first recall the definition of the generalized Jacobian for a vector-valued function (see [17, 18] for more details) and the definition of upper semicontinuity.
Definition 2
Definition 3
Proposition 2
Let \(g: \mathbb{R}^{p} \to\mathbb{R}^{q}\) be a locally Lipschitz function at x̄. Then the generalized Jacobian \(\partial_{c} g(\bar {x})\) of g at x̄ is a first-order approximation of g at x̄.
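As a rough numerical illustration of Proposition 2, consider the hypothetical scalar function \(g(x)=|x|+x^{2}\) at \(\bar{x}=0\), whose Clarke generalized Jacobian at 0 is the interval \([-1,1]\). Definition 1 is not reproduced in this text, so the sketch assumes that the first-order condition is the analogue of Definition 4(ii): for every \(\varepsilon>0\) there is \(\delta>0\) with \(g(x)-g(\bar{x})\in\partial_{c} g(\bar{x})(x-\bar{x})+\varepsilon \Vert x-\bar{x} \Vert \mathbb{B}\) whenever \(\Vert x-\bar{x} \Vert \leq\delta\). The function names and the choice \(\delta=\varepsilon/2\) are illustrative only.

```python
def g(x):
    # Hypothetical locally Lipschitz function with a kink at 0.
    return abs(x) + x * x


def dist_to_first_order_image(v, x):
    # The Clarke generalized Jacobian of g at 0 is [-1, 1], so the set
    # {a * x : a in [-1, 1]} is the interval [-|x|, |x|].
    return max(0.0, abs(v) - abs(x))


def check_first_order(eps, samples=1001):
    # Assumed condition: dist(g(x) - g(0), [-1, 1] * x) <= eps * |x|
    # for all |x| <= delta. Here the distance equals x**2, so
    # delta = eps / 2 is enough.
    delta = eps / 2
    for i in range(samples):
        x = -delta + 2 * delta * i / (samples - 1)
        residual = dist_to_first_order_image(g(x) - g(0.0), x)
        if residual > eps * abs(x):
            return False
    return True


for eps in (0.5, 0.1, 1e-3):
    print(eps, check_first_order(eps))
```

The check only samples a grid, so it supports the inclusion on that grid rather than proving it.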
Proof
Recall that a mapping \(f: X \to Y\) is said to be \(C^{1,1}\) at x̄ if it is Fréchet differentiable in a neighborhood of x̄ and if its Fréchet derivative \(\nabla f(\cdot)\) is Lipschitz at x̄.
Corollary 1
Let \(\bar{x}\in\mathbb{R}^{p}\), and \(f: \mathbb{R}^{p} \rightarrow \mathbb{R}\) be a \(C^{1,1}\) function at x̄. Then, ∇f admits \(\partial^{2}_{H} f(\bar{x})\) as a first-order approximation at x̄.
Definition 4
[1]
- (i) \(\mathcal{A}_{f} (\bar{x})\) is a first-order approximation of f at x̄;
- (ii) For all \(\varepsilon>0\), there exists \(\delta>0\) such that
$$f(x)-f(\bar{x})\in\mathcal{A}_{f} (\bar{x}) (x-\bar{x})+ \mathcal{B}_{f} (\bar{x}) (x-\bar{x}) (x-\bar{x})+\varepsilon \Vert x- \bar{x} \Vert ^{2}\mathbb{B}_{Y} $$
for all \(x\in\bar{x}+\delta\mathbb{B}_{X}\).
In this case the pair \((\mathcal{A}_{f} (\bar{x}),\mathcal{B}_{f} (\bar {x}))\) is called a second-order approximation of f at x̄. It is called a compact second-order approximation if \(\mathcal{A}_{f} (\bar {x})\) and \(\mathcal{B}_{f} (\bar{x})\) are compact.
Every mapping \(f: X \to Y\) that is \(C^{2}\) at x̄ admits \((\nabla f(\bar {x}), \frac{1}{2}\nabla^{2} f(\bar{x}))\) as a second-order approximation, where \(\nabla f(\bar{x})\) and \(\nabla^{2} f(\bar{x})\) are, respectively, the first- and second-order Fréchet derivatives of f at x̄. The factor \(\frac{1}{2}\) comes from Taylor's formula and matches the inclusion in Definition 4(ii).
Proposition 3
[1]
Let \(f: \mathbb{R}^{p} \rightarrow\mathbb{R}\) be a \(C^{1,1}\) function at x̄. Then f admits \((\nabla f(\bar{x}),\frac {1}{2}\partial ^{2}_{H} f(\bar{x}))\) as a second-order approximation at x̄.
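To make Proposition 3 concrete, the following sketch checks the inclusion of Definition 4(ii) on a grid for the hypothetical function \(f(x)=\frac{1}{2}\max(x,0)^{2}+x^{3}\) at \(\bar{x}=0\). Its derivative \(\max(x,0)+3x^{2}\) is locally Lipschitz but not differentiable at 0. The sketch assumes the usual definition of the generalized Hessian from [19] (the convex hull of limits of Hessians at nearby points of twice differentiability), which gives \(\partial^{2}_{H} f(0)=[0,1]\). All names and the choice \(\delta=\varepsilon/2\) are illustrative.

```python
def f(x):
    # Hypothetical C^{1,1} function: the gradient has a kink at 0.
    return 0.5 * max(x, 0.0) ** 2 + x ** 3


def grad_f(x):
    return max(x, 0.0) + 3.0 * x ** 2


def dist_to_second_order_image(v, x):
    # With B ranging over (1/2) * [0, 1], the values B * x * x fill
    # the interval [0, 0.5 * x**2].
    lo, hi = 0.0, 0.5 * x * x
    return max(lo - v, v - hi, 0.0)


def check_second_order(eps, samples=1001):
    # Definition 4(ii) at x_bar = 0: the distance from
    # f(x) - f(0) - grad_f(0) * x to the interval above must not
    # exceed eps * x**2. Here the distance is |x|**3, so
    # delta = eps / 2 is enough.
    delta = eps / 2
    for i in range(samples):
        x = -delta + 2 * delta * i / (samples - 1)
        v = f(x) - f(0.0) - grad_f(0.0) * x
        if dist_to_second_order_image(v, x) > eps * x * x:
            return False
    return True


for eps in (0.5, 0.1, 1e-3):
    print(eps, check_second_order(eps))
```

No single matrix B works on both sides of 0 in this example (B = 1 is needed for x > 0 and B = 0 for x < 0), which is why a set-valued second-order term is used.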
Proposition 4
To derive some results for γ-strongly convex functions, the following notions are needed.
Definition 5
[8]
Definition 6
[20]
3 Main results
In this section, we obtain the main results of the paper related to strongly convex functions of order γ defined by (7)-(8). We begin with some facts about functions that admit a first-order approximation.
Theorem 1
Proof
Proposition 5
Let \(f: X \to\mathbb{R}\cup\{+\infty\}\) be a γ-strongly convex function. Assume that \(\mathcal{A}_{f}(\bar{x})\) is a compact approximation at x̄. Then \(\mathcal{A}_{f}(\bar{x})\cap \partial _{(\gamma,c)}f(\bar{x})\neq \emptyset\).
Proof
Using Rademacher's theorem, which states that a locally Lipschitz function between finite-dimensional spaces is differentiable Lebesgue-almost everywhere, we can prove the following result.
Proposition 6
Let \(\gamma\geq1\), \(\bar{x}\in\mathbb{R}^{p}\), and let \(f: \mathbb {R}^{p} \to\mathbb{R}\) be continuous at x̄. Assume that f is a γ-strongly convex function. Then \(\partial_{c} f (\bar{x})= \partial_{(\gamma,c)}f(\bar{x})\).
Proof
Corollary 2
Proof
It is clear that \(\partial_{c} f (\bar{x})\) is a first-order approximation of f at x̄. We end the proof by Propositions 1 and 6. □
The converse of Proposition 5 holds if (16) is valid for any \(A\in\mathcal{A}_{f}(x)\) and \(x\in X\).
Proposition 7
Let \(\gamma\geq1\) and \(f:X\to\mathbb{R}\cup\{+\infty\}\). Assume that, for each \(x\in X\), f admits a first-order approximation \(\mathcal{A}_{f}(x)\) such that \(\mathcal{A}_{f}(x)\subset\partial _{(\gamma ,c)} f(x)\). Then f is γ-strongly convex.
Proof
The next results are devoted to presenting some useful properties of the generalized Hessian matrix for a \(C^{1,1}\) function in the finite-dimensional setting and a characterization of γ-strongly convex functions with the help of a second-order approximation.
Proposition 8
Proof
When X is a finite-dimensional space, we get the following essential result.
Proposition 9
Proof
The preceding result shows that γ-strongly convex functions enjoy a very desirable property for generalized Hessian matrices. In fact, in this case, any matrix \(B\in\partial^{2}_{H} f(\bar{x})\) is invertible. The next result proves the converse of Proposition 9. Let us first recall the following characterization of l.s.c. γ-strongly convex functions.
Theorem 2
Amahroq et al. [8]
We are now in a position to state our second main result.
Theorem 3
Let \(f: \mathbb{R}^{p} \rightarrow\mathbb{R}\) be a \(C^{1,1}\) function. Assume that \(\partial^{2}_{H} f(\cdot)\) satisfies relation (20) at any \(x\in\mathbb{R}^{p}\). Then f is γ-strongly convex.
Proof
Hiriart-Urruty et al. [19] have presented many examples of \(C^{1,1}\) functions. The next theorem gives another example of a \(C^{1,1}\) function.
Theorem 4
Proof
4 Newton’s method
Theorem 5
Let \(f: \mathbb{R}^{p} \rightarrow\mathbb{R}\) be a Fréchet-differentiable function, and x̄ be a solution of (25). Let \(\varepsilon, r, K >0\) be such that \(\nabla f(\cdot)\) admits \(\beta_{f}(\bar{x})\) as a first-order approximation at x̄ such that, for each \(x\in\mathbb{B}_{\mathbb{R}^{p}} (\bar{x},r)\), there exists an invertible element \(B(x) \in\mathcal{B}_{f}(x)\) satisfying \(\Vert B(x)^{-1} \Vert \leq K\) and \(\xi:= \varepsilon K<1\). Then the sequence \((x_{k})\) generated by Algorithm \((\mathcal {M})\) is well defined for every \(x_{0} \in\mathbb{B}_{\mathbb{R}^{p}}(\bar {x},r)\) and converges linearly to x̄ with rate ξ.
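Algorithm \((\mathcal{M})\) and equation (25) are not reproduced in this text. The sketch below therefore rests on two assumptions: that (25) is the Euler equation \(\nabla f(x)=0\), and that the iteration has the Newton form \(x_{k+1}=x_{k}-B(x_{k})^{-1}\nabla f(x_{k})\) with \(B(x_{k})\) an invertible element selected from the approximation. The test function \(f(x)=\frac{1}{2}x^{2}+\frac{1}{2}\max(x,0)^{2}-0.1\cos x\), with solution \(\bar{x}=0\), and all function names are hypothetical.

```python
import math


def grad_f(x):
    # Gradient of the hypothetical f; it is Lipschitz with a kink at 0.
    return x + max(x, 0.0) + 0.1 * math.sin(x)


def select_b(x):
    # One element of the generalized Hessian at x. At the kink x = 0
    # any slope in [1, 2] is admissible for the max term; 1 is chosen.
    kink_slope = 1.0 if x > 0 else 0.0
    return 1.0 + kink_slope + 0.1 * math.cos(x)


def newton(grad, select, x0, tol=1e-12, max_iter=50):
    # Assumed iteration: x_{k+1} = x_k - B(x_k)^{-1} * grad(x_k).
    iterates = [x0]
    x = x0
    for _ in range(max_iter):
        gx = grad(x)
        if abs(gx) <= tol:
            break
        x = x - gx / select(x)
        iterates.append(x)
    return iterates


iterates = newton(grad_f, select_b, x0=0.5)
errors = [abs(x) for x in iterates]  # distance to the solution x_bar = 0
for k, err in enumerate(errors):
    print(k, err)
```

Every selected element satisfies \(B(x)\geq0.9\) here, so \(K=1/0.9\) is one admissible bound on \(\Vert B(x)^{-1} \Vert \) for this example.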
Proof
Theorem 6
Let U be an open set of \(\mathbb{R}^{p}\), \(x_{0}\in U\), and \(f: \mathbb {R}^{p} \rightarrow\mathbb{R}\) be a Fréchet-differentiable function on U. Let \(\varepsilon, r, K >0\) be such that \(\nabla f(\cdot)\) admits \(\beta_{f}(x_{0})\) as a strict first-order approximation at \(x_{0}\) such that, for each \(x\in\mathbb{B}_{\mathbb{R}^{p}} (x_{0},r)\), there exists a right inverse of \(B(x)\in\beta_{f}(x_{0})\), denoted by \(\tilde {B}(x)\), satisfying \(\Vert \tilde{B}(x)(\cdot) \Vert \leq K \Vert \cdot \Vert \) and \(\xi:= \varepsilon K<1\).
If \(\Vert \nabla f(x_{0}) \Vert \leq K^{-1}(1-\xi)r \) and ∇f is continuous, then the sequence \((x_{k})\) generated by Algorithm \((\mathcal {M}')\) is well defined and converges to a solution x̄ of (25). Moreover, we have \(\Vert x_{k}-\bar {x} \Vert \leq r\xi^{k}\) for all \(k\in\mathbb{N}\) and \(\Vert \bar {x}-x_{0} \Vert \leq \Vert \nabla f(x_{0}) \Vert K(1-\xi)^{-1}< r\).
Proof
5 Conclusions
In this paper, we investigate the concept of first- and second-order approximations to generalize some results such as optimality conditions for a subclass of convex functions called strongly convex functions of order γ. We also present an extension of Newton’s method to solve the Euler equation under weak assumptions.
Declarations
Acknowledgements
The author wishes to express his heartfelt thanks to the referees for their detailed and helpful suggestions for revising the manuscript.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- Allali, K, Amahroq, T: Second order approximations and primal and dual necessary optimality conditions. Optimization 40(3), 229-246 (1997)
- Thibault, L: Subdifferentials of compactly Lipschitzian vector-valued functions. Ann. Mat. Pura Appl. (4) 125, 157-192 (1980)
- Sweetser, TH: A minimal set-valued strong derivative for set-valued Lipschitz functions. J. Optim. Theory Appl. 23, 539-562 (1977)
- Ioffe, AD: Nonsmooth analysis: differential calculus of nondifferentiable mappings. Trans. Am. Math. Soc. 266, 1-56 (1981)
- Jourani, A, Thibault, L: Approximations and metric regularity in mathematical programming in Banach space. Math. Oper. Res. 18(2), 390-401 (1993)
- Amahroq, T, Gadhi, N: On the regularity condition for vector programming problems. J. Glob. Optim. 21(4), 435-443 (2001)
- Amahroq, T, Gadhi, N: Second order optimality conditions for the extremal problem under inclusion constraints. J. Math. Anal. Appl. 285(1), 74-85 (2003)
- Amahroq, T, Daidai, I, Syam, A: γ-strongly convex functions, γ-strong monotonicity of their presubdifferential and γ-subdifferentiability, application to nonlinear PDE. J. Nonlinear Convex Anal. (2017, to appear)
- Crouzeix, JP, Ferland, JA, Zalinescu, C: α-convex sets and strong quasiconvexity. SIAM J. Control Optim. 22, 998-1022 (1997)
- Lin, GH, Fukushima, M: Some exact penalty results for nonlinear programs and mathematical programs with equilibrium constraints. J. Optim. Theory Appl. 118, 67-80 (2003)
- Polyak, BT: Introduction to Optimization. Optimization Software, New York (1987). Translated from the Russian, with a foreword by Dimitri P. Bertsekas
- Rockafellar, RT: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877-898 (1976)
- Vial, J-P: Strong convexity of sets and functions. J. Math. Econ. 9(1-2), 187-205 (1982). doi:10.1016/0304-4068(82)90026-X
- Vial, J-P: Strong and weak convexity of sets and functions. Math. Oper. Res. 8(2), 231-259 (1983)
- Zălinescu, C: On uniformly convex functions. J. Math. Anal. Appl. 95(2), 344-374 (1983)
- Mordukhovich, BS, Shao, YH: On nonconvex subdifferential calculus in Banach spaces. J. Convex Anal. 2(1-2), 211-227 (1995)
- Clarke, FH: Optimization and Nonsmooth Analysis. Wiley, New York (1983)
- Clarke, FH: On the inverse function theorem. Pac. J. Math. 64(1), 97-102 (1976)
- Hiriart-Urruty, J-B, Strodiot, J-J, Nguyen, VH: Generalized Hessian matrix and second-order optimality conditions for problems with \(C^{1,1}\) data. Appl. Math. Optim. 11(1), 43-56 (1984)
- Jourani, A: Subdifferentiability and subdifferential monotonicity of γ-paraconvex functions. Control Cybern. 25(4), 721-737 (1996)
- Phelps, RR: Convex Functions, Monotone Operators and Differentiability. Lecture Notes in Mathematics, vol. 1364. Springer, Berlin (1989)
- Lebourg, G: Valeur moyenne pour gradient généralisé. C. R. Acad. Sci. Paris 281, 795-797 (1975)
- Cominetti, R, Correa, R: A generalized second-order derivative in nonsmooth optimization. SIAM J. Control Optim. 28(4), 789-809 (1990)