An axiomatic integral and a multivariate mean value theorem
Milan Merkle
https://doi.org/10.1186/s13660-015-0866-2
© Merkle 2015
Received: 14 July 2015
Accepted: 9 October 2015
Published: 29 October 2015
Abstract
In order to investigate minimal sufficient conditions for an abstract integral to belong to the convex hull of the integrand, we propose a system of axioms under which it happens. If the integrand is a continuous \(\mathbf {R}^{n}\)-valued function over a path-connected topological space, we prove that any such integral can be represented as a convex combination of values of the integrand in at most n points, which yields an ultimate multivariate mean value theorem.
1 Introduction and motivation
It turns out that the difference of one extra point for noncontinuous functions remains in a much more general case and for a very broad class of integrals in higher dimensions. This is the topic of this article.
In the multivariate case, with \(X\in \mathbf {R}^{n}\), \(n\geq1\), there is an old and seemingly forgotten mean value theorem by Kowalewski [1, 2] as follows.
Theorem A
[1]
A recent generalization of Theorem A is proved in [3] using the following modification of classical Carathéodory’s convex hull theorem.
Theorem B
[3]
Let \(C: s\mapsto X(s)\), \(s\in I\), be a continuous curve in \(\mathbf {R}^{n}\), where \(I\subset \mathbf {R}\) is an interval, and let K be the convex hull of the curve C. Then each \(v\in K\) can be represented as a convex combination of n or fewer points of the curve C.
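To make the statement concrete for \(n=2\), the following minimal sketch takes the unit circle \(X(s)=(\cos s,\sin s)\), \(s\in[0,2\pi]\), as the curve C, so that K is the closed unit disk. The function name `chord_representation` and the chord construction are illustrative choices, not taken from [3]: a point v of the disk is written as the midpoint of the chord through v orthogonal to v, that is, as a convex combination of two points of the curve.

```python
import math

def chord_representation(v):
    """Return (p, q, lam) with p, q on the unit circle and v = lam*p + (1-lam)*q."""
    x, y = v
    r = math.hypot(x, y)
    if r > 1.0:
        raise ValueError("v must lie in the closed unit disk")
    if r == 0.0:
        # The centre is the midpoint of any diameter.
        return (1.0, 0.0), (-1.0, 0.0), 0.5
    h = math.sqrt(1.0 - r * r)   # half-length of the chord through v
    ux, uy = -y / r, x / r       # unit vector orthogonal to v
    p = (x + h * ux, y + h * uy)
    q = (x - h * ux, y - h * uy)
    return p, q, 0.5

p, q, lam = chord_representation((0.3, 0.4))
```

Carathéodory's theorem alone would allow three points here; the chord shows that two points of the curve suffice in this example, in line with Theorem B.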
In this way, a more general mean value theorem for n-dimensional functions is obtained directly from the fact that the normalized integral should belong to the convex hull of the set of values of the integrand. This is proved in the following theorem for the Lebesgue integral with a probability measure.
Theorem C
([3], Lemma 1)
Let \((S,{\mathcal{F}},\mu)\) be a probability space, and let \(X_{i}: S\rightarrow \mathbf {R}\), \(i=1,\ldots, n\), be μ-integrable functions. Let \(X(t)=(X_{1}(t),\ldots,X_{n}(t))\) for every \(t\in S\). Then \(\int_{S} X(t)\,\mathrm {d}\mu(t)\in \mathbf {R}^{n}\) is in the convex hull of the set \(X(S)=\{ X(t) \mid t\in S\}\subset \mathbf {R}^{n}\).
Finally, the main result of [3] reads as follows.
Theorem D
([3], Theorem 2)
Let us note that without the continuity assumption we still may use Carathéodory’s convex hull theorem which would yield (4) with \(n+1\) points \(t_{i}\) and the same number of \(\lambda _{i}\)’s, which shows that the example at the beginning of the text well describes the general situation in \(\mathbf {R}^{n}\).
In this paper we give a more general theorem of the type (4), tracing the steps of the proof in [3] in a much more general context. In Section 3 we show that a result like (4) holds if the integral over I is replaced with a general linear functional on some function space, under a system of axioms, whereas the interval I can be replaced by a topological space which is path connected.
To reach this goal, we need to extend Theorem C. In Section 2 we show that Theorem C holds for any linear functional which satisfies a condition slightly stronger than positivity, and where functions \(X_{k}\) are defined over an arbitrary nonempty set.
Applications of such a very general mean value theorem are numerous, and we do not discuss particular applications in this paper. Let us just mention that, as shown in [3], theorems of this type can be considered as an aid to construct quadrature rules or their approximate versions. See also a recent paper [4] for another application related to integrals. Another advantage of the approach presented in this paper is that the results are widely applicable to different kinds of integrals treated as linear functionals over some space of functions.
2 Axioms and their consequences
We start with an arbitrary nonempty set S with an algebra \({\mathcal{F}}\) (may be a sigma algebra as well) of its subsets. Therefore, \({\mathcal{F}}\) contains S and if a set A is a result of finitely many set operations over sets in \({\mathcal{F}}\), then \(A\in {\mathcal{F}}\).
(C1): If \(X_{1},X_{2} \in {\mathcal{S}}\), then \(aX_{1}+bX_{2}\in {\mathcal{S}}\) for all \(a,b \in \mathbf {R}\).
(C2): For \(B\in {\mathcal{F}}\), the indicator function \(I_{B}(\cdot)\) belongs to \({\mathcal{S}}\).
(C3): For \(X\in {\mathcal{S}}\) and any interval J, the set \(\{s\in S: X(s)\in J\}\) is in \({\mathcal{F}}\).
(C4): For \(X\in {\mathcal{S}}\) such that \(X(s)\geq0\) for all \(s\in S\), it holds that \(X\cdot I_{X\in J}\in {\mathcal{S}}\) for any interval J.
Note that from (C1) and (C2) it follows that all constants are in \({\mathcal{S}}\). Let us also note that functions in \({\mathcal{S}}\) are not assumed to be bounded.
(A1): \(\mathrm {E}c = c\) for any constant c;
(A2): \(\mathrm {E}( \sum_{i=1}^{m} \alpha_{i} X_{i} ) = \sum_{i=1}^{m} \alpha_{i} \mathrm {E}(X_{i})\) for \(\alpha_{i}\in \mathbf {R}\) and \(X_{i} \in {\mathcal{S}}\), \(i=1,\ldots, m\).
(C5): If \(X\in {\mathcal{S}}\) and \(P(N)=0\), then \(X\cdot I_{N}\in {\mathcal{S}}\), and \(\mathrm {E}(X\cdot I_{N})=0\).
(A3): For \(X\in {\mathcal{S}}\), if \(\mathrm {E}X =0\), then either \(P(X=0)=1\) or there exist \(s_{1},s_{2}\in S\) such that \(X(s_{1})X(s_{2})<0\).
(A3′): For \(X\in {\mathcal{S}}\), if \(X(s)\geq0\) for all \(s\in S\) and \(\mathrm {E}X =0\), then \(P(X=0)=1\).
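The role of the third axiom can be illustrated on a finite set. The sketch below is a toy model under stated assumptions: S has three points, E is a weighted sum with hypothetical weights, and P is taken to be \(P(B)=\mathrm{E}(I_{B})\). Both functionals satisfy (A1) and (A2) because their weights sum to one, but the one with a negative weight fails (A3′) for the chosen nonnegative X.

```python
from fractions import Fraction as F

S = ("a", "b", "c")

def make_E(weights):
    """Linear functional E X = sum_s weights[s] * X(s) on functions S -> R."""
    return lambda X: sum(weights[s] * X(s) for s in S)

def indicator(B):
    return lambda s: 1 if s in B else 0

def satisfies_A3_prime(E, X):
    """Check (A3') for one nonnegative X: E X = 0 must force P(X = 0) = 1."""
    assert all(X(s) >= 0 for s in S)
    if E(X) != 0:
        return True                      # premise of (A3') is not met
    zero_set = {s for s in S if X(s) == 0}
    return E(indicator(zero_set)) == 1   # P(B) := E(I_B), assumed definition

# Hypothetical weights; both sets sum to 1, so (A1) and (A2) hold for both.
E_good = make_E({"a": F(1, 2), "b": F(1, 3), "c": F(1, 6)})
E_bad = make_E({"a": F(1, 2), "b": F(1), "c": F(-1, 2)})

X = lambda s: {"a": 0, "b": 1, "c": 2}[s]
```

Here `E_bad(X)` equals 0 although X is nonnegative and vanishes only on a set of P-measure 1/2, so (A3′) is violated. The check covers a single function X; it illustrates the axiom and does not verify it for all of \({\mathcal{S}}\).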
The central result of this section is Theorem 2.10, where we show that under (A1)-(A2)-(A3), \(\mathrm {E}(X)\) belongs to the convex hull of \(X(S)\).
A similar axiomatic approach is applied in Daniell’s integral, and there are other axiomatic systems in the literature for different purposes like in [5] for Riemann integrals in connection to evaluating the length of a curve, general means in [6], finitely additive probabilities (FAPs) in [7] and applications in a recent article [8]. The system of axioms applied in this article differs from others in the literature in the conditions that allow nonabsolute integrals, as well as in axiom (A3), which is a slightly stronger condition than the usual positivity. The reason for introducing this system of axioms is that it provides conditions under which EX belongs to the convex hull of \(X(S)\) (Theorem 2.10) independently of the kind of integrals that are considered.
Now we are going to derive some additional properties as consequences from the axioms.
Lemma 2.1
Under the system of axioms (A1)-(A2)-(A3) or (A1)-(A2)-(A3′), assuming also conditions (C1)-(C2), the set function P defined on \({\mathcal{F}}\) with (5) is a finitely additive probability on \((S,{\mathcal{F}})\).
Proof
Let us now show that \(P(B)\geq0\) for all \(B\in {\mathcal{F}}\). Indeed, suppose that \(P(B)=-\varepsilon \) for some \(\varepsilon >0\). This implies (by (A1) and (A2)) that \(\mathrm {E}(I_{B} +\varepsilon )=0\); on the other hand, \(I_{B}(s) + \varepsilon >0\) for all \(s\in S\), which contradicts both (A3) and (A3′), and this ends the proof. □
From now on, the quintuplet \((S,{\mathcal{F}},{\mathcal{S}},\mathrm {E},P)\) will be assumed to be as defined above in the framework of axioms (A1)-(A2)-(A3) and conditions (C1)-(C5) (if not specified differently). The letter X will be reserved for elements of \({\mathcal{S}}\), and P is a set function derived from E as in (5).
Lemma 2.2
(Positivity)
If for all \(s\in S\), \(X(s)\geq0\), then \(\mathrm {E}X \geq0\).
Proof
Suppose \(X(s)\geq0\) for all \(s\in S\), and \(\mathrm {E}X = -\varepsilon \) for some \(\varepsilon >0\). Then, by (A1) and (A2), \(\mathrm {E}(X+\varepsilon ) =0\), whereas \(X+\varepsilon >0\). This contradicts (A3). Therefore, if \(X\geq0\), then \(\mathrm {E}X \geq0\). □
Lemma 2.3
Assuming that (A1) and (A2) hold, axioms (A3) and (A3′) are equivalent.
Proof
In Lemma 2.1 we already proved that P is a FAP under either (A3) or (A3′), so we may use that property in both parts of the present proof.
Assume that (A1)-(A2)-(A3) hold and suppose that \(X(s)\geq 0\) for all \(s\in S\) and \(\mathrm {E}X =0\). Since X is nonnegative, there are no \(s_{1},s_{2}\in S\) with \(X(s_{1})X(s_{2})<0\), so by (A3) it follows that \(P(X=0)=1\). Therefore, (A3′) holds.
Now assume that (A1)-(A2)-(A3′) hold, but not (A3). Then there exists \(X\in {\mathcal{S}}\) such that: (a) \(\mathrm {E}X =0\), \(P(X=0)<1\), \(X\geq0\), or (b) \(\mathrm {E}X =0\), \(P(X=0)<1\), \(X\leq0\). In the case (a), using (A3′) we find that \(P(X=0)=1\), which is a contradiction to \(P(X=0)<1\). The case (b) can be reduced to (a) with the function \(Y=-X\). □
Due to the equivalence established in Lemma 2.3, in the rest of the article we refer to (A3) as being either (A3) or (A3′).
Remark 2.4
In this way we get a new quintuplet \((S^{\ast}, {\mathcal{F}}^{\ast}, {\mathcal{S}}^{\ast}, \mathrm {E}^{\ast},P^{\ast})\), and it is not difficult to see that the new quintuplet inherits conditions (C1)-(C5) and axioms (A1)-(A2), as well as (A3) if it is satisfied by the original quintuplet.
In the next lemma we prove Markov’s inequality from the axioms.
Lemma 2.5
Proof
Lemma 2.6
Assuming (A1)-(A2), (C1)-(C5) and Markov’s inequality, (A3) holds if P is a countably additive probability.
Proof
Remark 2.7
Consider the case where P is a countably additive probability on \((S,{\mathcal{F}})\), \({\mathcal{F}}\) is a sigma algebra, and \(\mathrm {E}X = \int_{S} X(s)\,\mathrm {d}P(s)\). Axioms (A1)-(A2) and conditions (C1)-(C5) are clearly satisfied, and Markov’s inequality can be proved from properties of the integral, so by Lemma 2.6, axiom (A3) also holds.
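As a small sanity check of this setting, the sketch below verifies Markov's inequality on a hypothetical discrete probability space. It assumes the standard form \(P(X\geq\varepsilon)\leq \mathrm{E}X/\varepsilon\) for \(X\geq0\) and \(\varepsilon>0\), which may differ in detail from the formulation of Lemma 2.5; the weights and the function X are illustrative.

```python
from fractions import Fraction as F

# Hypothetical discrete probability on S = {0, 1, ..., 5}: P({s}) = weights[s].
weights = [F(1, 4), F(1, 4), F(1, 8), F(1, 8), F(1, 8), F(1, 8)]

def E(X):
    """Integral of X with respect to the discrete probability above."""
    return sum(p * X(s) for s, p in enumerate(weights))

def P(event):
    """Probability of {s : event(s)}, computed as E of the indicator."""
    return E(lambda s: 1 if event(s) else 0)

X = lambda s: s * s   # a nonnegative function on S

def markov_holds(eps):
    """Standard Markov inequality (assumed form): P(X >= eps) <= E X / eps."""
    return P(lambda s: X(s) >= eps) <= E(X) / eps
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.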
For every probability P on \({\mathcal{F}}\) there exists a countably additive probability \(P_{c}\) and a purely finitely additive probability \(P_{d}\) such that \(P=\lambda P_{c} + (1-\lambda) P_{d}\) for some \(\lambda\in[0,1]\). This decomposition is unique (except for \(\lambda=0\) or \(\lambda=1\), when it is trivially nonunique). For more details see [9].
Lemma 2.8
Assuming axioms (A1)-(A2), conditions (C1)-(C5) and positivity, if P is a SFAP, the condition of axiom (A3) is not satisfied.
Proof
Example 2.9
Let \(S= [0,+\infty)\) and let P be the probability defined by the nonprincipal ultrafilter of Banach limit as \(s\rightarrow +\infty\). Let \(X(s)=e^{-s}\). Then \(X\geq0\) and \(\mathrm {E}X =0\), but \(P(X=0)=0\). In this case the convex hull is \(K(X) = (0,1]\) and \(\mathrm {E}X \notin K(X)\).
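A generalized limit of this kind cannot be evaluated numerically, but for a function with an ordinary limit at infinity it agrees with that limit. The following sketch, which takes \(X(s)=e^{-s}\) and a hypothetical list of sample points, only illustrates why the value 0 lies outside the set of values of X: every sampled value is strictly positive, while the values decrease towards 0.

```python
import math

def X(s):
    return math.exp(-s)

# For functions with an ordinary limit at infinity, a generalized (Banach)
# limit agrees with that limit, so here E X = lim X(s) = 0.
sample_points = (1, 10, 100, 500)   # illustrative choice of large arguments
tail_values = [X(T) for T in sample_points]

# X(0) = 1 is the maximum; no sampled value reaches 0.
largest = X(0)
```

The sample points are kept below the range where double precision underflows to zero, so the strict positivity seen here reflects the function and not the arithmetic.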
Theorem 2.10
Let \((S,{\mathcal{F}},{\mathcal{S}},\mathrm {E},P)\) be a quintuplet as defined above, and let \(X = (X_{1},\ldots, X_{n})\), where \(X_{i} \in {\mathcal{S}}\) for all i. Assuming that axioms (A1)-(A2)-(A3) and conditions (C1)-(C5) hold, \(\mathrm {E}X\) belongs to the convex hull of the set \(X(S)=\{ X(t) \mid t\in S\}\subset \mathbf {R}^{n}\).
Proof
Without loss of generality we may assume that for all i, \(\mathrm {E}X_{i}=0\) (otherwise, if \(\mathrm {E}X_{i} =c_{i}\), we can consider \(X_{i}-c_{i}\), for which \(\mathrm {E}(X_{i}-c_{i}) =0\)). Let K denote the convex hull of the set \(X(S)\subset \mathbf {R}^{n}\). We now prove that \(0\in K\) by induction on n. Let \(n=1\). By (A3), \(\mathrm {E}X=0\) implies that either \(X(s)=0\) for some \(s\in S\) or there are \(s_{1},s_{2} \in S\) such that \(X(s_{1})>0\) and \(X(s_{2})<0\). In both cases it follows that \(0\in K\).
Now assume that the statement of the theorem is valid for all dimensions from 1 to \(n-1\) for all quintuplets \((S,{\mathcal{F}},{\mathcal{S}},\mathrm {E},P)\) that satisfy the conditions mentioned in the statement of the theorem. Let now X be a vector function with values in \(\mathbf {R}^{n}\).
In other words, a separating hyperplane exists only if there exists a linear relation among n given functions with probability one. In order to eliminate \(X_{n}\) and to reduce the system to \(n-1\) functions, we consider functions \(X^{\ast}_{i}(s)=X_{i}(s)\) on the restricted domain \(S^{\ast}=S\setminus N\) (\(i=1,\ldots, n-1\)) and the corresponding functional \(\mathrm {E}^{\ast} \). Let \(K^{\ast}\) be the convex hull of \(X^{\ast} (S^{\ast})\subset \mathbf {R}^{n-1}\).
Remark 2.11
Theorem 2.10 provides sufficient conditions for \(\mathrm {E}X\) to belong to the convex hull of \(X(S)\). However, by inspection of the proof, we can see that axiom (A3) is also necessary, assuming (A1)-(A2) and conditions (C1)-(C5).
Now, as a corollary to Theorem 2.10, using Carathéodory’s theorem on representation of convex hull in finite dimension, we get the following result.
Theorem 2.12
Theorem 2.12 is the most general mean value theorem for the axiomatic integral. Due to Remark 2.7, the statement of this theorem applies with \(\mathrm {E}X_{i} = \int_{S} X_{i}(s)\,\mathrm {d}\mu(s)\), where μ is a countably additive probability measure on \((S,{\mathcal{F}})\), \({\mathcal{F}}\) is a sigma algebra, and \(X_{i}\) are \((S,{\mathcal{F}})\)-\((\mathbf {R}, {\mathcal{B}})\) measurable and integrable functions (\({\mathcal{B}}\) is the Borel sigma-field on \(\mathbf {R}\)).
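The Carathéodory step behind a representation with at most \(n+1\) points can be sketched for a finitely supported probability. The code below is an illustration under assumptions, not the proof used in the paper: the helper names `null_vector` and `caratheodory` are hypothetical, and the reduction is the standard one that repeatedly removes a point along an affine dependence while keeping the weighted mean fixed.

```python
from fractions import Fraction as F

def null_vector(rows, m):
    """Return a nonzero solution mu (length m) of rows * mu = 0, or None."""
    rows = [[F(x) for x in r] for r in rows]
    pivots = []                          # (row index, pivot column) pairs
    r = 0
    for c in range(m):
        pr = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pr is None:
            continue
        rows[r], rows[pr] = rows[pr], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(len(rows)):       # clear column c in all other rows
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
        if r == len(rows):
            break
    pivot_cols = {c for _, c in pivots}
    free = next((c for c in range(m) if c not in pivot_cols), None)
    if free is None:
        return None
    mu = [F(0)] * m
    mu[free] = F(1)
    for i, c in pivots:
        mu[c] = -rows[i][free]
    return mu

def caratheodory(points, weights):
    """Reduce a convex combination in R^n to at most n + 1 points."""
    pts = [tuple(F(x) for x in p) for p in points]
    lam = [F(w) for w in weights]
    n = len(pts[0])
    while len(pts) > n + 1:
        # Affine dependence: sum mu_i x_i = 0 and sum mu_i = 0 with mu != 0.
        rows = [[p[k] for p in pts] for k in range(n)] + [[F(1)] * len(pts)]
        mu = null_vector(rows, len(pts))
        t = min(l / m for l, m in zip(lam, mu) if m > 0)
        lam = [l - t * m for l, m in zip(lam, mu)]
        keep = [i for i, l in enumerate(lam) if l > 0]
        pts = [pts[i] for i in keep]
        lam = [lam[i] for i in keep]
    return pts, lam

# Uniform weights on five points in the plane; the mean is (2, 2).
pts, lam = caratheodory([(0, 0), (4, 0), (0, 4), (4, 4), (2, 2)], [F(1, 5)] * 5)
```

For the five points above the result uses at most three points with positive weights summing to one and the same mean \((2,2)\).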
In the next section we consider the case of continuous functions \(X_{i}\).
3 Mean value theorem for continuous multivariate mappings
Definition 3.1
A path from a point a to a point b in a topological space S is a continuous mapping \(f: [0,1]\rightarrow S\) such that \(f(0)=a\) and \(f(1)=b\). A space S is path connected if for any two points \(a,b \in S\) there exists a path that connects them.
Let us remark that any topological vector space is path connected. A path that connects points a and b is given by \(f(\lambda)=\lambda a + (1-\lambda)b\), \(\lambda\in[0,1]\). The same is true for a convex subset S of any topological vector space.
The following result is a generalization of Theorem B.
Theorem 3.2
Let \(X: t\mapsto X(t)\), \(t\in S\), be a continuous function defined on a path-connected topological space S with values in \(\mathbf {R}^{n}\), and let K be the convex hull of the set \(X(S)\). Then each \(v\in K\) can be represented as a convex combination of n or fewer points of the set \(X(S)\).
Proof
As a direct corollary to Theorems 2.10 and 3.2, we have the following mean value theorem for continuous multivariate mappings.
Theorem 3.3
Remark 3.4
Since Theorem 3.2 is independent of axioms and conditions of Section 2, the statement of Theorem 3.3 holds whenever Theorem 2.12 holds. In particular, it holds whenever \(\mathrm {E}X_{i} = \int_{S} X_{i}(s)\,\mathrm {d}\mu(s)\), where μ is a countably additive probability measure.
In fact, all that Theorem 3.3 says is that we can save one point in the representation (12) of Theorem 2.12. Although this may not look significant, in some applications it makes a difference. For example, Karamata’s representation for covariance in [11], based on Kowalewski’s original result for \(n=2\), strongly depends on two points, and nothing similar can be derived with three points.
4 Some particular cases and open problems
4.1 Riemann and Lebesgue integral on \(\mathbf {R}^{d}\)
In words, this result shows that for an arbitrary system of n integrals with continuous integrands, there exists an exact quadrature rule with n points in D with coefficients \(\lambda _{i}\) which are the same for all integrals. Note that \(x_{i}\) are d-dimensional points, so in fact here we have dn scalar parameters.
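A familiar one-dimensional instance (\(d=1\), \(n=2\)) may help. The sketch below assumes \(D=[0,1]\) with Lebesgue measure and the two integrands \(x\) and \(x^{2}\), both chosen here for illustration. The nodes \((1\mp 1/\sqrt{3})/2\) with equal coefficients 1/2 form the two-point Gauss-Legendre rule on \([0,1]\), and they integrate both functions exactly with the same coefficients. The theorem only guarantees that some such rule exists; this particular one was found by hand.

```python
import math

f = (lambda x: x, lambda x: x * x)   # the system of n = 2 integrands (assumed)
exact = (1 / 2, 1 / 3)               # their integrals over [0, 1]

# Two nodes in D = [0, 1] and coefficients shared by both integrals.
nodes = ((1 - 1 / math.sqrt(3)) / 2, (1 + 1 / math.sqrt(3)) / 2)
lam = (0.5, 0.5)

approx = tuple(sum(l * fi(x) for l, x in zip(lam, nodes)) for fi in f)
```

Up to rounding, `approx` coincides with `exact`, so two points suffice for this system of two integrals.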
4.2 Integrals with respect to countably additive probability measure
4.3 Open question
We showed in Section 2 that axiom (A3) does not hold for (integrals based on) SFAPs, so by Remark 2.11, the mean value theorem does not hold in this case. It would be of interest to describe classes of finitely additive probabilities for which axiom (A3) holds or does not hold in terms of some structural properties of measures.
Declarations
Acknowledgements
This work is supported by grant III 44006 from the Ministry of Education, Science and Technological Development of Republic of Serbia.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
1. Kowalewski, G: Ein Mittelwertsatz für ein System von n Integralen. Zeitschrf. für Math. und Phys. (Schlömilch Z.) 42, 153-157 (1895)
2. Kowalewski, G: Bemerkungen zu dem Mittelwertsatze für ein System von n Integralen. Zeitschrf. für Math. und Phys. (Schlömilch Z.) 43, 118-120 (1896)
3. Janković, S, Merkle, M: A mean value theorem for systems of integrals. J. Math. Anal. Appl. 342, 334-339 (2008)
4. Heydari, M, Avazzdeh, Z, Navabpour, H, Loghmani, GB: Numerical solution of Fredholm integral equations of the second kind by using integral mean value theorem II. High dimensional problems. Appl. Math. Model. 37, 432-442 (2013)
5. Gillman, L: An axiomatic approach to the integral. Am. Math. Mon. 100(1), 16-25 (1993)
6. Ostrowski, AM: On an integral inequality. Aequ. Math. 4, 358-373 (1970)
7. Schervish, MJ, Seidenfeld, T, Kadane, JB: On the equivalence of conglomerability and disintegrability for unbounded random variables. Technical Report 864, Carnegie Mellon University, Pittsburgh, PA (2008)
8. Schervish, MJ, Seidenfeld, T, Kadane, JB: Dominating countably many forecasts. Ann. Stat. 42, 728-756 (2014)
9. Yosida, K, Hewitt, E: Finitely additive measures. Trans. Am. Math. Soc. 72, 46-66 (1952)
10. Rockafellar, RT: Convex Analysis. Princeton University Press, Princeton (1997)
11. Karamata, J: Sur certain inégalités relatives aux quotients et à la difference de \(\int fg\) et \(\int f\int g\). Publ. Inst. Math. (Belgr.) 2, 131-145 (1948)
12. Frink, O Jr: Jordan measure and Riemann integration. Ann. Math. 34, 518-526 (1933)