Approximations of Jensen divergence for twice differentiable functions
© Kikianty et al.; licensee Springer 2013
Received: 16 November 2012
Accepted: 9 May 2013
Published: 28 May 2013
The Jensen divergence is used to measure the difference between two probability distributions. This divergence has been generalised to allow the comparison of more than two distributions. In this paper, we consider some bounds for the generalised Jensen divergence of twice differentiable functions with bounded second derivatives. These bounds provide approximations for the Jensen divergence of twice differentiable functions by the Jensen divergence of simpler functions, such as the power functions and the paired entropies associated to the Havrda-Charvát functions.
Keywords: divergence measure, Jensen divergence, inequality for real numbers
One of the more important applications of probability theory is finding an appropriate measure of distance (or difference) between two probability distributions. A number of these divergence measures have been widely studied and applied by mathematicians such as Burbea and Rao [2], Havrda and Charvát [3], Lin [4] and others.
Consider the set of probability distributions

$$\mathcal{P}^n := \Big\{ p = (p_1, \ldots, p_n) : p_i \geq 0 \text{ for all } i \in \{1, \ldots, n\} \text{ and } \sum_{i=1}^{n} p_i = 1 \Big\}.$$

Several measures have been proposed to quantify the difference (also known as the divergence) between two (or more) probability distributions. We refer to Grosse et al. [5], Kullback and Leibler [6] and Csiszár [7] for further references.

For a convex function $\Phi$, the Jensen divergence of $p, q \in \mathcal{P}^n$ is defined by

$$\mathcal{J}_{\Phi}(p, q) := \sum_{i=1}^{n} \left[ \frac{\Phi(p_i) + \Phi(q_i)}{2} - \Phi\!\left( \frac{p_i + q_i}{2} \right) \right]. \qquad (1)$$

An important example is generated by the function $\Phi(t) = t \ln t$, which can be defined on $[0, \infty)$ with the convention that $t \ln t = 0$ for $t = 0$. We note that the resulting divergence is also known as the Jensen-Shannon divergence [4, 5].
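As a quick numerical companion to (1), the following sketch evaluates the Jensen divergence for the Shannon function $\Phi(t) = t \ln t$. It is our illustration rather than code from the paper; the helper names jensen_divergence and phi_shannon are ours.

```python
import math

def jensen_divergence(phi, p, q):
    """Jensen divergence J_Phi(p, q) = sum_i [ (phi(p_i) + phi(q_i))/2
    - phi((p_i + q_i)/2) ]; termwise nonnegative whenever phi is convex."""
    return sum((phi(x) + phi(y)) / 2 - phi((x + y) / 2)
               for x, y in zip(p, q))

def phi_shannon(t):
    """phi(t) = t ln t, with the convention t ln t = 0 for t = 0."""
    return t * math.log(t) if t > 0 else 0.0

p = [0.1, 0.4, 0.5]
q = [0.3, 0.3, 0.4]
# For phi(t) = t ln t, J_phi(p, q) is the Jensen-Shannon divergence
# (up to the choice of logarithm base).
print(jensen_divergence(phi_shannon, p, q))
```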
These measures have been applied in a variety of fields, for example, in information theory [8, 9]. The Jensen divergence introduced in Burbea and Rao [2] has applications in bioinformatics [10, 11], where it is usually utilised to compare a sample of a healthy population (control) with a sample of a diseased population (case) in detecting gene expression for a certain disease. We refer the readers to Dragomir [1] for applications in other areas.
In a recent paper by Dragomir et al. [12], the authors found sharp upper and lower bounds for the Jensen divergence for various classes of functions Φ, including functions of bounded variation, absolutely continuous functions, Lipschitzian functions, convex functions and differentiable functions. We recall some of these results in Section 2, as they motivate the new results we obtain in this paper.
In this paper, we provide bounds for the Jensen divergence of twice differentiable functions Φ whose second derivatives satisfy certain boundedness conditions. These bounds provide approximations of the Jensen divergence (cf. (1)) by the divergence of simpler functions, such as the power functions (cf. Section 3) and the above-mentioned family of Jensen divergences (cf. Section 4). Finally, we apply these bounds to some elementary functions in Section 5.
2 Definitions, notation and previous results
In this section, we provide definitions and notation that will be used in the paper. We also provide some results regarding sharp bounds for the generalised Jensen divergence as stated in Dragomir et al. [12].
2.1 Definitions and notation
Throughout the paper, for any real number $p > 1$, we define $p^* := p/(p - 1)$ to be its Hölder conjugate, that is, $1/p + 1/p^* = 1$; for instance, the Hölder conjugate of $p = 3$ is $p^* = 3/2$.
Definition 1 (Bullen [13])

The generalised logarithmic mean of order $s$ of two positive numbers $a \neq b$ is defined by

$$L_s(a, b) := \left[ \frac{b^{s+1} - a^{s+1}}{(s + 1)(b - a)} \right]^{1/s}$$

for $s \neq -1, 0$ and, in the limiting cases, by $L_{-1}(a, b) := \frac{b - a}{\ln b - \ln a}$ and $L_0(a, b) := \frac{1}{e} \left( \frac{b^b}{a^a} \right)^{1/(b - a)}$. This mean generalises not only the logarithmic mean (when $s = -1$), which is particularly useful in the distribution of electrical charge of a conductor, but also the arithmetic mean (when $s = 1$) and the geometric mean (when $s = -2$); and for $a = b$, we denote $L_s(a, a) := a$.
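For concreteness, the mean in Definition 1 can be computed as follows. This is a minimal sketch under the conventions assumed above (the name gen_log_mean is ours), with the limiting cases handled explicitly.

```python
import math

def gen_log_mean(a, b, s):
    """Generalised logarithmic mean L_s(a, b) of a, b > 0.

    L_s(a,b) = [ (b^(s+1) - a^(s+1)) / ((s+1)(b-a)) ]^(1/s) for s != -1, 0,
    with the limiting cases L_{-1} (logarithmic mean) and L_0 (identric mean),
    and the convention L_s(a, a) = a.
    """
    if a == b:
        return a
    if s == -1:                      # logarithmic mean
        return (b - a) / (math.log(b) - math.log(a))
    if s == 0:                       # identric mean
        return (b**b / a**a) ** (1.0 / (b - a)) / math.e
    return ((b**(s + 1) - a**(s + 1)) / ((s + 1) * (b - a))) ** (1.0 / s)

a, b = 1.0, 4.0
print(gen_log_mean(a, b, 1))    # arithmetic mean: 2.5
print(gen_log_mean(a, b, -2))   # geometric mean:  2.0
print(gen_log_mean(a, b, -1))   # logarithmic mean
```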
We recall that a function $f : [a, b] \to \mathbb{C}$ is absolutely continuous on $[a, b]$ if and only if it is differentiable almost everywhere in $[a, b]$, the derivative $f'$ is Lebesgue integrable on this interval and $f(x) = f(a) + \int_{a}^{x} f'(t) \, dt$ for any $x \in [a, b]$.
2.2 Previous results
In a recent paper by Dragomir et al. [12], the authors provide sharp upper and lower bounds for the Jensen divergence for various classes of functions Φ. Some of these results are stated in the following.
Theorem 2 (Dragomir et al. [12])
for any $p, q \in \mathcal{P}^n$.
for any $p, q \in \mathcal{P}^n$.
The constant is best possible in both inequalities.
Further assumptions on Φ lead to the following results.
Theorem 3 (Dragomir et al. [12])
- (i) If the derivative $\Phi'$ is of bounded variation on $[a, b]$, then inequality (5) holds for any $p, q \in \mathcal{P}^n$.
- (ii) If the derivative $\Phi'$ is K-Lipschitzian on $[a, b]$ with the constant $K > 0$, then inequality (6) holds.
The constant is best possible in (6).
Motivated by these results, in the next sections we state bounds for $\mathcal{J}_{\Phi}(p, q)$ for twice differentiable functions Φ satisfying some boundedness conditions on the second derivative.
3 Approximating $\mathcal{J}_{\Phi}$ with the Jensen divergence of power functions
Lemma 4 (Dragomir et al. [12])
We refer to [12] for the proof of the above lemma.
where $L_s$ is the sth generalised logarithmic mean.
which proves (11). □
as desired. □
We omit the proofs for the next results as they follow similarly to those of Lemma 5 and Theorem 6.
where $L_s$ is the sth generalised logarithmic mean.
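Although the displayed inequalities of this section were lost in this rendering, the underlying mechanism can be checked numerically: if $m \le \Phi''(t) \le M$ on an interval containing all the entries, then $\Phi(t) - \frac{m}{2}t^2$ and $\frac{M}{2}t^2 - \Phi(t)$ are convex, so their Jensen divergences are nonnegative and, by the linearity of (1) in Φ, $\frac{m}{2}\mathcal{J}_{t^2}(p,q) \le \mathcal{J}_{\Phi}(p,q) \le \frac{M}{2}\mathcal{J}_{t^2}(p,q)$. The following sketch (ours, assuming this form of the bound rather than the paper's exact constants) verifies the sandwich for $\Phi(t) = t \ln t$.

```python
import math

def jensen_divergence(phi, p, q):
    return sum((phi(x) + phi(y)) / 2 - phi((x + y) / 2) for x, y in zip(p, q))

phi = lambda t: t * math.log(t)          # Phi(t) = t ln t, so Phi''(t) = 1/t
p = [0.1, 0.4, 0.5]
q = [0.3, 0.3, 0.4]

lo, hi = min(p + q), max(p + q)          # all entries (and midpoints) lie in [lo, hi]
m, M = 1.0 / hi, 1.0 / lo                # bounds for Phi''(t) = 1/t on [lo, hi]

j_phi = jensen_divergence(phi, p, q)
j_e2  = jensen_divergence(lambda t: t * t, p, q)   # equals sum (p_i - q_i)^2 / 4

# Sandwich from the convexity of Phi - (m/2)t^2 and (M/2)t^2 - Phi:
assert m / 2 * j_e2 <= j_phi <= M / 2 * j_e2
print(m / 2 * j_e2, j_phi, M / 2 * j_e2)
```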
4 Further approximations
and the first inequality in (18) is proved. The second inequality in (18) follows by performing a similar argument on a suitable convex auxiliary function; we omit the details.
The second inequality in (19) follows by considering the corresponding auxiliary function, and we omit the details. This completes the proof. □
and the first inequality in (21) is proved. Now, consider the corresponding auxiliary function on I; its second derivative is nonnegative for any $t \in I$, and so by (20) it is a convex function on I. By similar arguments, we deduce the second inequality in (21).
To prove the second part of the theorem, consider a suitable auxiliary function on I. We observe that it is twice differentiable, and since by (22) its second derivative is nonnegative for all $t \in I$, we can conclude that it is a convex function on I. The proof now follows along the lines outlined above, and the first part of (23) is proved. The second part of (23) also follows by employing the complementary auxiliary function; and this completes the proof. □
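The same auxiliary-function device can be checked numerically for the Havrda-Charvát comparison. Writing $\Phi_\alpha(t) := (t^\alpha - t)/(\alpha - 1)$ (one common normalisation; the paper's may differ), if $m \le \Phi''(t)/\Phi_\alpha''(t) \le M$ on the relevant interval, then $\Phi - m\Phi_\alpha$ and $M\Phi_\alpha - \Phi$ are convex, whence $m\,\mathcal{J}_{\Phi_\alpha}(p,q) \le \mathcal{J}_{\Phi}(p,q) \le M\,\mathcal{J}_{\Phi_\alpha}(p,q)$. A sketch of this check (our code, under these assumptions):

```python
import math

def jensen_divergence(phi, p, q):
    return sum((phi(x) + phi(y)) / 2 - phi((x + y) / 2) for x, y in zip(p, q))

alpha  = 2.0
phi_hc = lambda t: (t**alpha - t) / (alpha - 1)   # assumed Havrda-Charvat normalisation
phi    = lambda t: t * math.log(t)                # Phi''/Phi_alpha'' = t^(1-alpha)/alpha

p = [0.1, 0.4, 0.5]
q = [0.3, 0.3, 0.4]
lo, hi = min(p + q), max(p + q)

# bounds of the ratio Phi''(t)/Phi_alpha''(t) = t^(1-alpha)/alpha on [lo, hi]
m = min(lo**(1 - alpha), hi**(1 - alpha)) / alpha
M = max(lo**(1 - alpha), hi**(1 - alpha)) / alpha

j_phi = jensen_divergence(phi, p, q)
j_hc  = jensen_divergence(phi_hc, p, q)
assert m * j_hc <= j_phi <= M * j_hc
print(m * j_hc, j_phi, M * j_hc)
```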
5 Applications to some elementary functions
We consider the approximations mentioned in Section 4 for some elementary functions.
Discussion In this example, the best lower approximation (amongst the three) is given by , and the best upper approximation is given by , where and . However, it remains an open question whether this is true in general.
for and .
Discussion In this example, the best lower approximation (amongst the five) is given by , and the best upper approximation is given by , where , . However, it remains an open question whether this is true in general.
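The comparisons carried out in the discussions above can be reproduced numerically. The sketch below is ours and uses the assumed bounds from the previous sections rather than the paper's exact constants (which approximations form "the three" and "the five" remains open here); it tabulates the Havrda-Charvát sandwiches for a few values of α so that the tightest lower and upper approximations can be read off.

```python
import math

def jensen_divergence(phi, p, q):
    return sum((phi(x) + phi(y)) / 2 - phi((x + y) / 2) for x, y in zip(p, q))

p = [0.1, 0.4, 0.5]
q = [0.3, 0.3, 0.4]
lo, hi = min(p + q), max(p + q)
phi = lambda t: t * math.log(t)      # target divergence, Phi''(t) = 1/t
j_phi = jensen_divergence(phi, p, q)

print(f"J_Phi = {j_phi:.6f}")
for alpha in (1.5, 2.0, 3.0):
    phi_a = lambda t, a=alpha: (t**a - t) / (a - 1)
    # the ratio Phi''/Phi_alpha'' = t^(1-alpha)/alpha is monotone on [lo, hi]
    r_lo, r_hi = lo**(1 - alpha) / alpha, hi**(1 - alpha) / alpha
    m, M = min(r_lo, r_hi), max(r_lo, r_hi)
    j_a = jensen_divergence(phi_a, p, q)
    print(f"alpha={alpha}: {m * j_a:.6f} <= J_Phi <= {M * j_a:.6f}")
```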
1. Dragomir SS: Some reverses of the Jensen inequality with applications. RGMIA Research Report Collection (Online) 2011, 14: Article ID v14a72.
2. Burbea J, Rao CR: On the convexity of some divergence measures based on entropy functions. IEEE Trans. Inf. Theory 1982, 28(3):489-495. doi:10.1109/TIT.1982.1056497
3. Havrda J, Charvát F: Quantification method of classification processes: concept of structural α-entropy. Kybernetika 1967, 3:30-35.
4. Lin J: Divergence measures based on the Shannon entropy. IEEE Trans. Inf. Theory 1991, 37(1):145-151. doi:10.1109/18.61115
5. Grosse I, Bernaola-Galván P, Carpena P, Román-Roldán R, Oliver J, Stanley HE: Analysis of symbolic sequences using the Jensen-Shannon divergence. Phys. Rev. E, Stat. Nonlinear Soft Matter Phys. 2002, 65(4): Article ID 041905. doi:10.1103/PhysRevE.65.041905
6. Kullback S, Leibler RA: On information and sufficiency. Ann. Math. Stat. 1951, 22:79-86. doi:10.1214/aoms/1177729694
7. Csiszár I: Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hung. 1967, 2:299-318.
8. Shannon CE: A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27:379-423, 623-656.
9. Menendez ML, Pardo JA, Pardo L: Some statistical applications of generalized Jensen difference divergence measures for fuzzy information systems. Fuzzy Sets Syst. 1992, 52:169-180. doi:10.1016/0165-0114(92)90047-8
10. Arvey AJ, Azad RK, Raval A, Lawrence JG: Detection of genomic islands via segmental genome heterogeneity. Nucleic Acids Res. 2009, 37(16):5255-5266. doi:10.1093/nar/gkp576
11. Gómez RM, Rosso OA, Berretta R, Moscato P: Uncovering molecular biomarkers that correlate cognitive decline with the changes of hippocampus' gene expression profiles in Alzheimer's disease. PLoS ONE 2010, 5(4): Article ID e10153. doi:10.1371/journal.pone.0010153
12. Dragomir SS, Dragomir NM, Sherwell D: Sharp bounds for the Jensen divergence with applications. RGMIA Research Report Collection (Online) 2011, 14: Article ID v14a47.
13. Bullen PS: Handbook of Means and Their Inequalities. Mathematics and Its Applications 560. Kluwer Academic, Dordrecht; 2003.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.