Approximations of Jensen divergence for twice differentiable functions

Kikianty, Eder; Dragomir, Sever S; Dintoe, Isia T; Sherwell, David

doi:10.1186/1029-242X-2013-267

Research
Open access
Published: 28 May 2013

Journal of Inequalities and Applications volume 2013, Article number: 267 (2013)

Abstract

The Jensen divergence is used to measure the difference between two probability distributions. This divergence has been generalised to allow the comparison of more than two distributions. In this paper, we consider some bounds for the generalised Jensen divergence of twice differentiable functions with bounded second derivatives. These bounds provide approximations for the Jensen divergence of twice differentiable functions by the Jensen divergence of simpler functions such as the power functions and the paired entropies associated with the Havrda-Charvát functions.

MSC:26D15, 94A17.

1 Introduction

One of the more important applications of probability theory is finding an appropriate measure of distance (or difference) between two probability distributions [1]. A number of these divergence measures have been widely studied and applied by a number of mathematicians such as Burbea and Rao [2], Havrda and Charvát [3], Lin [4] and others.

In Burbea and Rao [2], a generalisation of Jensen divergence is considered to allow the comparison of more than two distributions. If Φ is a function defined on an interval I of the real line ℝ, the (generalised) Jensen divergence between two elements $x = (x_{1}, \dots, x_{n})$ and $y = (y_{1}, \dots, y_{n})$ in $I^{n}$ (where $n \geq 1$ ) is given by the following equation (cf. Burbea and Rao [2]):

J_{n, Φ} (x, y) : = \sum_{i = 1}^{n} [\frac{1}{2} [Φ (x_{i}) + Φ (y_{i})] - Φ (\frac{x_{i} + y_{i}}{2})]

(1)

for all $(x, y) \in I^{n} \times I^{n}$ . Several measures have been proposed to quantify the difference (also known as the divergence) between two (or more) probability distributions. We refer to Grosse et al. [5], Kullback and Leibler [6] and Csiszar [7] for further references.
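As a concrete illustration of definition (1), the following minimal Python sketch evaluates $J_{n, Φ} (x, y)$ for an arbitrary function Φ. The function name `jensen_divergence` and the sample vectors are illustrative assumptions and do not come from the original paper.

```python
from typing import Callable, Sequence


def jensen_divergence(phi: Callable[[float], float],
                      x: Sequence[float],
                      y: Sequence[float]) -> float:
    """Generalised Jensen divergence J_{n,Phi}(x, y) of equation (1)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(0.5 * (phi(xi) + phi(yi)) - phi(0.5 * (xi + yi))
               for xi, yi in zip(x, y))


# Illustrative (hypothetical) data: two probability vectors of length 3.
x = [0.2, 0.3, 0.5]
y = [0.5, 0.25, 0.25]

# For Phi(t) = t^2 every summand equals (x_i - y_i)^2 / 4.
j_square = jensen_divergence(lambda t: t * t, x, y)
closed_form = sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / 4
```

Affine functions give zero divergence under this definition, which is a convenient sanity check for any implementation.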

We denote by $S_{n}$ the set

S_{n} = \{(x_{1}, \dots, x_{n}) \in I^{n} : \sum_{i = 1}^{n} x_{i} = 1\}, \quad I = [0, 1] .

Utilising the family of functions

Φ_{α} (t) : = \begin{cases} {(α - 1)}^{- 1} (t^{α} - t), & α \neq 1, \\ t \log t, & α = 1, \end{cases} \qquad α \in R_{+},

used by Havrda and Charvát [3] to introduce their entropies of degree α, Burbea and Rao [2] introduced the following family of Jensen divergences:

J_{n, α} (x, y) : = \begin{cases} {(α - 1)}^{- 1} \sum_{i = 1}^{n} [\frac{1}{2} (x_{i}^{α} + y_{i}^{α}) - {(\frac{x_{i} + y_{i}}{2})}^{α}], & α \neq 1, \\ \frac{1}{2} \sum_{i = 1}^{n} [x_{i} \log x_{i} + y_{i} \log y_{i} - (x_{i} + y_{i}) \log (\frac{x_{i} + y_{i}}{2})], & α = 1, \end{cases}

which is defined on $S_{n} \times S_{n}$ for $α \in R_{+}$ , with the convention that $0 \log 0 = 0$ . We note that the divergence $J_{n, 1}$ is also known as the Jensen-Shannon divergence [8].
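The two-case definition of $J_{n, α}$ can be sketched in Python as follows. This is a minimal illustration restricted to $α > 0$; the helper names `xlogx` and `jensen_alpha` and the sample vectors are assumptions made for this example only.

```python
import math
from typing import Sequence


def xlogx(t: float) -> float:
    """t * log(t) with the convention 0 * log(0) = 0."""
    return 0.0 if t == 0 else t * math.log(t)


def jensen_alpha(alpha: float, x: Sequence[float], y: Sequence[float]) -> float:
    """J_{n,alpha}(x, y) for alpha > 0, following the two-case definition."""
    if alpha <= 0:
        raise ValueError("this sketch only covers alpha > 0")
    pairs = list(zip(x, y))
    if alpha == 1:
        # Jensen-Shannon case: (x + y) log((x + y) / 2) = 2 m log m, m the midpoint.
        return 0.5 * sum(xlogx(a) + xlogx(b) - 2.0 * xlogx(0.5 * (a + b))
                         for a, b in pairs)
    return sum(0.5 * (a ** alpha + b ** alpha) - (0.5 * (a + b)) ** alpha
               for a, b in pairs) / (alpha - 1.0)


# Illustrative (hypothetical) probability vectors; x contains a zero entry
# to exercise the 0 log 0 = 0 convention.
x = [0.0, 0.4, 0.6]
y = [0.5, 0.25, 0.25]
js = jensen_alpha(1.0, x, y)   # Jensen-Shannon divergence J_{n,1}
j2 = jensen_alpha(2.0, x, y)   # equals (1/4) * sum (y_i - x_i)^2
```

Since $Φ_{α} (t) \to t \log t$ as $α \to 1$ , values of `jensen_alpha` for α close to 1 should be close to the Jensen-Shannon value.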

These measures have been applied in a variety of fields, for example, in information theory [9]. The Jensen divergence introduced in Burbea and Rao [2] has its applications in bioinformatics [10, 11], where it is usually utilised to compare two samples of healthy population (control) and diseased population (case) in detecting gene expression for a certain disease. We refer the readers to Dragomir [1] for the applications in other areas.

In a recent paper by Dragomir et al. [12], the authors found sharp upper and lower bounds for the Jensen divergence for various classes of functions Φ, including functions of bounded variation, absolutely continuous functions, Lipschitzian continuous functions, convex functions and differentiable functions. We recall some of these results in Section 2, which motivates the new results we obtain in this paper.

In this paper, we provide bounds for the Jensen divergence of a twice differentiable function Φ whose second derivative $Φ^{″}$ satisfies some boundedness conditions. These bounds provide approximations of the Jensen divergence $J_{n, Φ}$ (cf. (1)) by the divergence of simpler functions such as the power functions (cf. Section 3) and the above-mentioned family of Jensen divergences $J_{n, α}$ (cf. Section 4). Finally, we apply these bounds to some elementary functions in Section 5.

2 Definitions, notation and previous results

In this section, we provide definitions and notation that will be used in the paper. We also provide some results regarding sharp bounds for the generalised Jensen divergence as stated in Dragomir et al. [12].

2.1 Definitions and notation

Throughout the paper, for any real number $r > 1$ , we define $r^{'}$ to be its Hölder conjugate, that is, $1 / r + 1 / r^{'} = 1$ .

Definition 1 (Bullen [13])

If s is an extended real number, the generalised logarithmic mean of order s of two positive numbers x and y is defined by

L^{[s]} (x, y) = \begin{cases} {[\frac{1}{s + 1} (\frac{y^{s + 1} - x^{s + 1}}{y - x})]}^{\frac{1}{s}} & \text{if } s \neq - 1, 0, \pm \infty, \\ \frac{y - x}{\log y - \log x} & \text{if } s = - 1, \\ \frac{1}{e} {(\frac{y^{y}}{x^{x}})}^{\frac{1}{y - x}} & \text{if } s = 0, \\ \max \{x, y\} & \text{if } s = + \infty, \\ \min \{x, y\} & \text{if } s = - \infty, \end{cases}

(2)

and $L^{[s]} (x, x) = x$ .

This mean is homogeneous and symmetric [13, p.385]. In particular, there is no loss of generality in assuming $0 < x < y$ . Note also that

L^{[s]} (x, y) = {(\int_{0}^{1} {[(1 - t) x + t y]}^{s} d t)}^{1 / s}

for $0 < x < y$ and $s \in [1, \infty)$ . This mean generalises not only the logarithmic mean (when $s = - 1$ ), which is particularly useful in describing the distribution of electrical charge on a conductor, but also the arithmetic mean (when $s = 1$ ) and the geometric mean (when $s = - 2$ ).
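The case distinction in (2) translates directly into code. The sketch below is an assumed, illustrative implementation (the name `generalised_log_mean` is hypothetical) that recovers the arithmetic, logarithmic, identric and geometric means as special cases.

```python
import math


def generalised_log_mean(s: float, x: float, y: float) -> float:
    """Generalised logarithmic mean L^{[s]}(x, y) of two positive numbers, cf. (2)."""
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")
    if x == y:
        return x                      # L^{[s]}(x, x) = x
    if s == math.inf:
        return max(x, y)
    if s == -math.inf:
        return min(x, y)
    if s == -1:
        # logarithmic mean
        return (y - x) / (math.log(y) - math.log(x))
    if s == 0:
        # identric mean (1/e) (y^y / x^x)^(1/(y-x)), evaluated via logarithms
        return math.exp((y * math.log(y) - x * math.log(x)) / (y - x) - 1.0)
    return ((y ** (s + 1) - x ** (s + 1)) / ((s + 1) * (y - x))) ** (1.0 / s)


# Special cases for the (hypothetical) pair x = 2, y = 8.
means = {s: generalised_log_mean(s, 2.0, 8.0) for s in (-2, -1, 0, 1)}
```

For this pair, the order $s = - 2$ gives the geometric mean 4 and $s = 1$ gives the arithmetic mean 5, with the logarithmic and identric means lying in between.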

We use the following notations for Lebesgue integrable functions: for any Lebesgue integrable function g on $[a, b]$ , we define, for $a \leq x \leq y \leq b$ ,

{∥ g ∥}_{[x, y], p} : = | \int_{x}^{y} | g (s) |^{p} d s |^{1 / p} if p \geq 1 and g \in L_{p} [a, b];

and for $g \in L_{\infty} [a, b]$ , we denote ${∥ g ∥}_{[x, y], \infty} : = {ess sup}_{s \in [x, y]} | g (s) |$ .

We recall that a function $f : [a, b] \to R$ is absolutely continuous on $[a, b]$ if and only if it is differentiable almost everywhere in $[a, b]$ , the derivative $f^{'}$ is Lebesgue integrable on this interval and $f (y) - f (x) = \int_{x}^{y} f^{'} (t) d t$ for any $x, y \in [a, b]$ .

2.2 Previous results

In a recent paper by Dragomir et al. [12], the authors provide sharp upper and lower bounds for the Jensen divergence for various classes of functions Φ. Some results are stated in the following.

Theorem 2 (Dragomir et al. [12])

Assume that $Φ : [a, b] \to R$ is absolutely continuous on $[a, b]$ . Then we have the bounds

\begin{aligned} | J_{n, Φ} (x, y) | & \leq \frac{1}{2} \times \begin{cases} \sum_{i = 1}^{n} | y_{i} - x_{i} | {∥ Φ^{'} ∥}_{[x_{i}, y_{i}], \infty} & \text{if } Φ^{'} \in L_{\infty} [a, b], \\ \sum_{i = 1}^{n} | y_{i} - x_{i} |^{\frac{p - 1}{p}} {∥ Φ^{'} ∥}_{[x_{i}, y_{i}], p} & \text{if } Φ^{'} \in L_{p} [a, b], p > 1, \\ \sum_{i = 1}^{n} {∥ Φ^{'} ∥}_{[x_{i}, y_{i}], 1} \end{cases} \\ & \leq \frac{1}{2} \times \begin{cases} {∥ Φ^{'} ∥}_{[a, b], \infty} \sum_{i = 1}^{n} | y_{i} - x_{i} | & \text{if } Φ^{'} \in L_{\infty} [a, b], \\ {∥ Φ^{'} ∥}_{[a, b], p} \sum_{i = 1}^{n} | y_{i} - x_{i} |^{\frac{p - 1}{p}} & \text{if } Φ^{'} \in L_{p} [a, b], p > 1, \\ n {∥ Φ^{'} ∥}_{[a, b], 1} \end{cases} \end{aligned}

(3)

for any $x = (x_{1}, \dots, x_{n}), y = (y_{1}, \dots, y_{n}) \in {[a, b]}^{n}$ .

Moreover, if the modulus of the derivative is convex, then we have the inequality

\begin{aligned} | J_{n, Φ} (x, y) | & \leq \frac{1}{4} \sum_{i = 1}^{n} | y_{i} - x_{i} | [| Φ^{'} (\frac{x_{i} + y_{i}}{2}) | + \frac{| Φ^{'} (x_{i}) | + | Φ^{'} (y_{i}) |}{2}] \\ & \leq \frac{1}{4} \sum_{i = 1}^{n} | y_{i} - x_{i} | [| Φ^{'} (x_{i}) | + | Φ^{'} (y_{i}) |] \\ & (\leq {∥ Φ^{'} ∥}_{[a, b], \infty} δ (x, y)) \end{aligned}

(4)

for any $x = (x_{1}, \dots, x_{n}), y = (y_{1}, \dots, y_{n}) \in {[a, b]}^{n}$ , where $δ (x, y) = \frac{1}{2} \sum_{i = 1}^{n} | y_{i} - x_{i} |$ .

The constant $1 / 4$ is best possible in both inequalities.

Some more assumptions for Φ lead to the following results.

Theorem 3 (Dragomir et al. [12])

Let $Φ : [a, b] \to R$ be a differentiable function on the interval $[a, b]$ of real numbers ℝ.

(i)
If the derivative $Φ^{'}$ is of bounded variation on $[a, b]$ , then
$\begin{aligned} | J_{n, Φ} (x, y) | & \leq \frac{1}{4} \sum_{i = 1}^{n} | y_{i} - x_{i} | | ⋁_{x_{i}}^{y_{i}} (Φ^{'}) | \\ & \leq \frac{1}{4} ⋁_{a}^{b} (Φ^{'}) \sum_{i = 1}^{n} | y_{i} - x_{i} | \\ & = \frac{1}{2} ⋁_{a}^{b} (Φ^{'}) δ (x, y) \end{aligned}$
(5)

for any $x = (x_{1}, \dots, x_{n}), y = (y_{1}, \dots, y_{n}) \in {[a, b]}^{n}$ .

The constant $1 / 4$ is best possible in both inequalities (5).

(ii)
If the derivative $Φ^{'}$ is K-Lipschitzian on $[a, b]$ with the constant $K > 0$ , then
$| J_{n, Φ} (x, y) | \leq \frac{1}{8} K \sum_{i = 1}^{n} {(y_{i} - x_{i})}^{2} = \frac{1}{2} K J_{n, 2} (x, y)$
(6)

for any $x = (x_{1}, \dots, x_{n}), y = (y_{1}, \dots, y_{n}) \in {[a, b]}^{n}$ , where

J_{n, 2} (x, y) = \frac{1}{4} \sum_{i = 1}^{n} {(y_{i} - x_{i})}^{2} .

The constant $1 / 8$ is best possible in (6).
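The identity for $J_{n, 2}$ used in (6) follows from the definition of $J_{n, α}$ with $α = 2$ ; the short computation below spells out this step.

```latex
\begin{aligned}
J_{n,2}(x, y)
  &= \sum_{i=1}^{n} \left[ \frac{x_i^2 + y_i^2}{2} - \left( \frac{x_i + y_i}{2} \right)^2 \right] \\
  &= \sum_{i=1}^{n} \frac{2 x_i^2 + 2 y_i^2 - x_i^2 - 2 x_i y_i - y_i^2}{4} \\
  &= \frac{1}{4} \sum_{i=1}^{n} (y_i - x_i)^2 ,
\end{aligned}
\qquad \text{so} \qquad
\frac{1}{8} K \sum_{i=1}^{n} (y_i - x_i)^2 = \frac{1}{2} K J_{n,2}(x, y) .
```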

Motivated by these results, we state bounds for $J_{n, Φ}$ for twice differentiable functions Φ with some boundedness conditions for the second derivative in the next sections.

3 Approximating with Jensen divergence for power functions

In this section we provide some bounds for the generalised Jensen divergence of a twice differentiable function $Φ : I \subset R \to R$ , whose second derivative $Φ^{″}$ is bounded above and below in the following sense:

γ \leq \frac{t^{2 - p}}{p (p - 1)} Φ^{″} (t) \leq Γ

(7)

for some $γ < Γ$ and $p \in (- \infty, 0) \cup (1, \infty)$ and all $t \in I$ ; and

δ \leq \frac{t^{2 - q}}{q (q - 1)} Φ^{″} (t) \leq Δ

(8)

for some $δ \leq Δ$ , some $q \in (0, 1)$ and all $t \in I$ . These conditions enable us to provide approximations of the Jensen divergence for Φ via the functions $f (t) = t^{p}$ for $p \neq 0, 1$ and $t \in R_{+}$ , i.e.

J_{n, {(\cdot)}^{p}} = \sum_{i = 1}^{n} [\frac{1}{2} (x_{i}^{p} + y_{i}^{p}) - {(\frac{x_{i} + y_{i}}{2})}^{p}] .

Lemma 4 (Dragomir et al. [12])

Let $Φ : [a, b] \to R$ be a differentiable function and let the derivative $Φ^{'}$ be absolutely continuous. Then

| J_{n, Φ} (x, y) | \leq \begin{cases} \frac{1}{8} {∥ Φ^{″} ∥}_{[a, b], \infty} \sum_{i = 1}^{n} {(y_{i} - x_{i})}^{2} & \text{if } Φ^{″} \in L_{\infty} [a, b], \\ \frac{{∥ Φ^{″} ∥}_{[a, b], r}}{{(r^{'} + 1)}^{1 / r^{'}} 2^{1 + 1 / r^{'}}} \sum_{i = 1}^{n} {| y_{i} - x_{i} |}^{1 + 1 / r^{'}} & \text{if } Φ^{″} \in L_{r} [a, b], r > 1 . \end{cases}

(9)

We refer to [12] for the proof of the above lemma.

Lemma 5 Let $Φ : [a, b] \to R$ be a twice differentiable function and $0 < a < b < \infty$ . If $Φ^{″}$ satisfies (7), then

{∥ {(Φ - \frac{γ + Γ}{2} {(\cdot)}^{p})}^{″} ∥}_{[a, b], \infty} \leq p (p - 1) \frac{Γ - γ}{2} max {a^{p - 2}, b^{p - 2}};

(10)

and

{∥ {(Φ - \frac{γ + Γ}{2} {(\cdot)}^{p})}^{″} ∥}_{[a, b], r} \leq p (p - 1) \frac{Γ - γ}{2} {(L^{[(p - 2) r]} (a, b))}^{p - 2}, r > 1,

(11)

where $L^{[s]}$ is the sth generalised logarithmic mean.

Proof Note that condition (7) is equivalent to

γ p (p - 1) t^{p - 2} \leq Φ^{″} (t) \leq Γ p (p - 1) t^{p - 2}

since $p (p - 1) > 0$ . This is also equivalent to

| Φ^{″} (t) - p (p - 1) \frac{γ + Γ}{2} t^{p - 2} | \leq p (p - 1) \frac{Γ - γ}{2} t^{p - 2} .

(12)

We take the supremum of both sides to obtain (10). For $r > 1$ , it follows from (12) that

\begin{aligned} {∥ Φ^{″} - p (p - 1) \frac{γ + Γ}{2} {(\cdot)}^{p - 2} ∥}_{[a, b], r} & = {(\int_{a}^{b} | Φ^{″} (t) - p (p - 1) \frac{γ + Γ}{2} t^{p - 2} |^{r} d t)}^{1 / r} \\ & \leq p (p - 1) \frac{Γ - γ}{2} {(\int_{a}^{b} t^{r (p - 2)} d t)}^{1 / r} \\ & = p (p - 1) \frac{Γ - γ}{2} {(L^{[r (p - 2)]} (a, b))}^{p - 2}, \end{aligned}

which proves (11). □

Theorem 6 Let $Φ : [a, b] \to R$ be a twice differentiable function and $0 < a < b < \infty$ . If $Φ^{″}$ satisfies (7), then

| J_{n, Φ} (x, y) - \frac{γ + Γ}{2} J_{n, {(\cdot)}^{p}} (x, y) | \leq \begin{cases} \frac{1}{16} p (p - 1) (Γ - γ) \max \{a^{p - 2}, b^{p - 2}\} \sum_{i = 1}^{n} {(y_{i} - x_{i})}^{2} & \text{if } Φ^{″} \in L_{\infty} [a, b], \\ \frac{p (p - 1) (Γ - γ)}{{(r^{'} + 1)}^{1 / r^{'}} 2^{2 + 1 / r^{'}}} {(L^{[(p - 2) r]} (a, b))}^{p - 2} \sum_{i = 1}^{n} {| y_{i} - x_{i} |}^{1 + 1 / r^{'}} & \text{if } Φ^{″} \in L_{r} [a, b], r > 1 . \end{cases}

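Proof By (7) and since $0 < a < b < \infty$ , the second derivative of $Ψ : = Φ - \frac{γ + Γ}{2} {(\cdot)}^{p}$ is bounded on $[a, b]$ , so $Ψ^{'}$ is Lipschitzian and hence absolutely continuous, and we may employ Lemma 4 for Ψ. Since $J_{n, Φ}$ is linear in Φ, we have $J_{n, Ψ} = J_{n, Φ} - \frac{γ + Γ}{2} J_{n, {(\cdot)}^{p}}$ . Combining this with Lemma 5, we have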

\begin{aligned} & | J_{n, Φ} (x, y) - \frac{γ + Γ}{2} J_{n, {(\cdot)}^{p}} (x, y) | \\ & \quad \leq \begin{cases} \frac{1}{8} {∥ {(Φ - \frac{γ + Γ}{2} {(\cdot)}^{p})}^{″} ∥}_{[a, b], \infty} \sum_{i = 1}^{n} {(y_{i} - x_{i})}^{2}, \\ \frac{1}{{(r^{'} + 1)}^{1 / r^{'}} 2^{1 + 1 / r^{'}}} {∥ {(Φ - \frac{γ + Γ}{2} {(\cdot)}^{p})}^{″} ∥}_{[a, b], r} \sum_{i = 1}^{n} {| y_{i} - x_{i} |}^{1 + 1 / r^{'}}, \end{cases} \\ & \quad \leq \begin{cases} \frac{1}{8} p (p - 1) \frac{Γ - γ}{2} \max \{a^{p - 2}, b^{p - 2}\} \sum_{i = 1}^{n} {(y_{i} - x_{i})}^{2}, \\ \frac{1}{{(r^{'} + 1)}^{1 / r^{'}} 2^{1 + 1 / r^{'}}} p (p - 1) \frac{Γ - γ}{2} {(L^{[(p - 2) r]} (a, b))}^{p - 2} \sum_{i = 1}^{n} {| y_{i} - x_{i} |}^{1 + 1 / r^{'}}, \end{cases} \end{aligned}

as desired. □
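The $L_{\infty}$ case of Theorem 6 can be checked numerically. The sketch below is an illustration under assumed data: it takes $Φ (t) = e^{t}$ on $[a, b] = [0.5, 2]$ with $p = 3$ , for which $\frac{t^{2 - p}}{p (p - 1)} Φ^{″} (t) = e^{t} / (6 t)$ attains its minimum $e / 6$ at $t = 1$ and its maximum $e^{2} / 12$ at $t = 2$ . The function, interval, vectors and helper name are choices made for this example and are not taken from the paper.

```python
import math


def jensen_divergence(phi, x, y):
    """Generalised Jensen divergence J_{n,Phi}(x, y) of equation (1)."""
    return sum(0.5 * (phi(xi) + phi(yi)) - phi(0.5 * (xi + yi))
               for xi, yi in zip(x, y))


# Assumed example: Phi(t) = exp(t) on [a, b] = [0.5, 2] with p = 3.
p = 3.0
a, b = 0.5, 2.0
gamma = math.e / 6.0            # min of e^t / (6 t) on [a, b], attained at t = 1
Gamma = math.exp(2.0) / 12.0    # max of e^t / (6 t) on [a, b], attained at t = 2

# Hypothetical sample points in [a, b]^n.
x = [0.5, 0.8, 1.1, 1.7]
y = [2.0, 1.0, 0.6, 1.9]

# Left-hand side: | J_{n,Phi} - (gamma + Gamma)/2 * J_{n,(.)^p} |
lhs = abs(jensen_divergence(math.exp, x, y)
          - 0.5 * (gamma + Gamma) * jensen_divergence(lambda t: t ** p, x, y))

# Right-hand side of the L_infinity bound in Theorem 6.
rhs = (p * (p - 1.0) * (Gamma - gamma)
       * max(a ** (p - 2.0), b ** (p - 2.0))
       * sum((yi - xi) ** 2 for xi, yi in zip(x, y)) / 16.0)
```

For these inputs `lhs` should not exceed `rhs`; how tight the bound is depends on how far $e^{t} / (6 t)$ varies over the interval.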

We omit the proofs for the next results as they follow similarly to those of Lemma 5 and Theorem 6.

Lemma 7 Let $Φ : [a, b] \to R$ be a twice differentiable function and $0 < a < b < \infty$ . If $Φ^{″}$ satisfies (8), then

{∥ {(Φ - \frac{δ + Δ}{2} {(\cdot)}^{q})}^{″} ∥}_{[a, b], \infty} \leq q (1 - q) \frac{Δ - δ}{2} max {a^{q - 2}, b^{q - 2}};

(13)

and

{∥ {(Φ - \frac{δ + Δ}{2} {(\cdot)}^{q})}^{″} ∥}_{[a, b], r} \leq q (1 - q) \frac{Δ - δ}{2} {(L^{[r (q - 2)]} (a, b))}^{q - 2}, r > 1,

(14)

where $L^{[s]}$ is the sth generalised logarithmic mean.

Theorem 8 Let $Φ : [a, b] \to R$ be a twice differentiable function and $0 < a < b < \infty$ . If $Φ^{″}$ satisfies (8), then

| J_{n, Φ} (x, y) - \frac{δ + Δ}{2} J_{n, {(\cdot)}^{q}} (x, y) | \leq \begin{cases} \frac{1}{16} q (1 - q) (Δ - δ) \max \{a^{q - 2}, b^{q - 2}\} \sum_{i = 1}^{n} {(y_{i} - x_{i})}^{2} & \text{if } Φ^{″} \in L_{\infty} [a, b], \\ \frac{q (1 - q) (Δ - δ)}{{(r^{'} + 1)}^{1 / r^{'}} 2^{2 + 1 / r^{'}}} {(L^{[(q - 2) r]} (a, b))}^{q - 2} \sum_{i = 1}^{n} {| y_{i} - x_{i} |}^{1 + 1 / r^{'}} & \text{if } Φ^{″} \in L_{r} [a, b], r > 1 . \end{cases}

4 Further approximations

In this section, we present approximations for $J_{n, Φ}$ by utilising the family of Jensen divergences

J_{n, α} : = {(α - 1)}^{- 1} \sum_{i = 1}^{n} [\frac{1}{2} (x_{i}^{α} + y_{i}^{α}) - {(\frac{x_{i} + y_{i}}{2})}^{α}], α \neq 1;

(15)

and

J_{n, 1} (x, y) : = \frac{1}{2} \sum_{i = 1}^{n} [x_{i} log x_{i} + y_{i} log y_{i} - (x_{i} + y_{i}) log (\frac{x_{i} + y_{i}}{2})] .

(16)

Although $J_{n, α}$ is defined for $α \in R_{+}$ in [2], we may allow α to be negative in (15), and for $α = 0$ , we define

J_{n, 0} (x, y) : = \sum_{i = 1}^{n} [log (\frac{x_{i} + y_{i}}{2}) - \frac{1}{2} (log x_{i} + log y_{i})] .

(17)

Theorem 9 Let $Φ : I \subset (0, \infty) \to R$ be a twice differentiable function on I. If $Φ^{″}$ satisfies (7), then

γ (p - 1) J_{n, p} (x, y) \leq J_{n, Φ} (x, y) \leq Γ (p - 1) J_{n, p} (x, y) for any x, y \in I^{n} .

(18)

Furthermore, if $Φ^{″}$ satisfies (8), then

δ (q - 1) J_{n, q} (x, y) \geq J_{n, Φ} (x, y) \geq Δ (q - 1) J_{n, q} (x, y) for any x, y \in I^{n} .

(19)

Proof We consider the auxiliary function $g_{γ, p} : I \to R$ defined by $g_{γ, p} (t) = Φ (t) - γ t^{p}$ , where $p \in (- \infty, 0) \cup (1, \infty)$ . We observe that $g_{γ, p}$ is twice differentiable on I and the second derivative is given by

g_{γ, p}^{''} (t) = p (p - 1) t^{p - 2} [\frac{t^{2 - p}}{p (p - 1)} Φ^{''} (t) - γ] for any t \in I .

Utilising condition (7) and since $p (p - 1) t^{p - 2} > 0$ for $t \in I$ , we deduce that $g_{γ, p}^{''} (t) \geq 0$ for any $t \in I$ which means that $g_{γ, p}$ is convex on I. Since for a convex function $g : I \to R$ we have that $J_{n, g} (x, y) \geq 0$ , then we can write that

\begin{aligned} 0 & \leq J_{n, g_{γ, p}} (x, y) = \sum_{i = 1}^{n} [\frac{g_{γ, p} (x_{i}) + g_{γ, p} (y_{i})}{2} - g_{γ, p} (\frac{x_{i} + y_{i}}{2})] \\ & = \sum_{i = 1}^{n} [\frac{Φ (x_{i}) + Φ (y_{i})}{2} - Φ (\frac{x_{i} + y_{i}}{2})] - γ \sum_{i = 1}^{n} [\frac{x_{i}^{p} + y_{i}^{p}}{2} - {(\frac{x_{i} + y_{i}}{2})}^{p}] \\ & = J_{n, Φ} (x, y) - γ (p - 1) J_{n, p} (x, y), \end{aligned}

and the first inequality in (18) is proved. To prove the second inequality in (18), we consider the auxiliary function $g_{Γ, p} : I \to R$ with $g_{Γ, p} (t) = Γ t^{p} - Φ (t)$ , for which we perform a similar argument; we omit the details.

Now, if $q \in (0, 1)$ and we consider the auxiliary function $ψ_{δ, q} : I \to R$ with $ψ_{δ, q} (t) = Φ (t) - δ t^{q}$ , then $ψ_{δ, q}$ is twice differentiable and

ψ_{δ, q}^{''} (t) = t^{q - 2} q (q - 1) [\frac{t^{2 - q} Φ^{''} (t)}{q (q - 1)} - δ] \leq 0 for any t \in I

since $q (q - 1) < 0$ for $q \in (0, 1)$ and the expression in brackets is nonnegative by (8). Therefore $ψ_{δ, q}$ is concave on I, which implies that $J_{n, ψ_{δ, q}} (x, y) \leq 0$ for any $x, y \in I^{n}$ and, as above, we obtain

J_{n, Φ} (x, y) \leq δ \sum_{i = 1}^{n} [\frac{x_{i}^{q} + y_{i}^{q}}{2} - {(\frac{x_{i} + y_{i}}{2})}^{q}] = δ (q - 1) J_{n, q} (x, y) .

The second inequality in (19) follows by considering the auxiliary function $ψ_{Δ, q} : I \to R$ with $ψ_{Δ, q} (t) = Δ t^{q} - Φ (t)$ , and we omit the details. This completes the proof. □
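Both chains of inequalities in Theorem 9 can be illustrated numerically. The sketch below again uses the assumed example $Φ (t) = e^{t}$ on $[0.5, 2]$ , with $p = 3$ for (18) and $q = 1 / 2$ for (19). The helper `power_sum` (a hypothetical name) returns $\sum_{i} [\frac{1}{2} (x_{i}^{r} + y_{i}^{r}) - {(\frac{x_{i} + y_{i}}{2})}^{r}]$ , which equals $(r - 1) J_{n, r} (x, y)$ .

```python
import math


def jensen_divergence(phi, x, y):
    """Generalised Jensen divergence J_{n,Phi}(x, y) of equation (1)."""
    return sum(0.5 * (phi(xi) + phi(yi)) - phi(0.5 * (xi + yi))
               for xi, yi in zip(x, y))


def power_sum(r, x, y):
    """sum_i [ (x_i^r + y_i^r)/2 - ((x_i + y_i)/2)^r ] = (r - 1) * J_{n,r}(x, y)."""
    return jensen_divergence(lambda t: t ** r, x, y)


# Assumed example: Phi(t) = exp(t) on [a, b] = [0.5, 2]; hypothetical sample points.
a, b = 0.5, 2.0
x = [0.5, 0.8, 1.1, 1.7]
y = [2.0, 1.0, 0.6, 1.9]
j_phi = jensen_divergence(math.exp, x, y)

# (18) with p = 3: gamma <= e^t / (6 t) <= Gamma on [a, b].
p = 3.0
gamma, Gamma = math.e / 6.0, math.exp(2.0) / 12.0
lower_18 = gamma * power_sum(p, x, y)   # gamma (p - 1) J_{n,p}
upper_18 = Gamma * power_sum(p, x, y)   # Gamma (p - 1) J_{n,p}

# (19) with q = 1/2: t^(2-q) Phi''(t) / (q (q - 1)) = -4 t^(3/2) e^t is decreasing,
# so delta is its value at b and Delta its value at a.
q = 0.5
delta = -4.0 * b ** 1.5 * math.exp(b)
Delta = -4.0 * a ** 1.5 * math.exp(a)
upper_19 = delta * power_sum(q, x, y)   # delta (q - 1) J_{n,q}
lower_19 = Delta * power_sum(q, x, y)   # Delta (q - 1) J_{n,q}
```

In this example the pair obtained from (18) should enclose `j_phi` more tightly than the pair from (19), because $e^{t} / (6 t)$ varies much less over the interval than $- 4 t^{3 / 2} e^{t}$ does.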

Theorem 10 Let $Φ : I \subset (0, \infty) \to R$ be a twice differentiable function on I. If there exist constants $ω < Ω$ such that

ω \leq t^{2} Φ^{''} (t) \leq Ω for any t \in I,

(20)

then we have the bounds

ω J_{n, 0} (x, y) \leq J_{n, Φ} (x, y) \leq Ω J_{n, 0} (x, y) for any x, y \in I^{n} .

(21)

If there exist constants $λ < Λ$ such that

λ \leq t Φ^{''} (t) \leq Λ for any t \in I,

(22)

then we have the bounds

λ J_{n, 1} (x, y) \leq J_{n, Φ} (x, y) \leq Λ J_{n, 1} (x, y) for any x, y \in I^{n} .

(23)

Proof Consider the auxiliary function $g_{ω, 0} : I \to R$ with $g_{ω, 0} (t) = Φ (t) + ω log t$ . We observe that $g_{ω, 0}$ is twice differentiable, and by (20) we have $g_{ω, 0}^{''} (t) = t^{- 2} (t^{2} Φ^{''} (t) - ω) \geq 0$ for any $t \in I$ , then we can conclude that $g_{ω, 0}$ is a convex function on I. Therefore we have $J_{n, g_{ω, 0}} (x, y) \geq 0$ for any $x, y \in I^{n}$ , which implies that

\begin{aligned} 0 & \leq J_{n, Φ} (x, y) + ω \sum_{i = 1}^{n} [\frac{\log x_{i} + \log y_{i}}{2} - \log (\frac{x_{i} + y_{i}}{2})] \\ & = J_{n, Φ} (x, y) - ω J_{n, 0} (x, y) \end{aligned}

and the first inequality in (21) is proved. Now, consider the auxiliary function $g_{Ω, 0} : I \to R$ with $g_{Ω, 0} (t) = - Ω log t - Φ (t)$ . Then $g_{Ω, 0}^{''} (t) = t^{- 2} (Ω - t^{2} Φ^{''} (t))$ for any $t \in I$ ; and by (20) it is a convex function on I. By similar arguments, we deduce the second inequality in (21).

To prove the second part of the theorem, consider the auxiliary function $g_{λ, 1} : I \to R$ , $g_{λ, 1} (t) = Φ (t) - λ t log t$ . We observe that $g_{λ, 1}$ is twice differentiable and $g_{λ, 1}^{''} (t) = Φ^{''} (t) - \frac{1}{t} λ$ for $t \in I$ . Since by (22) we have $g_{λ, 1}^{''} (t) = t^{- 1} (t Φ^{''} (t) - λ) \geq 0$ for all $t \in I$ , we conclude that $g_{λ, 1}$ is a convex function on I. The proof now follows along the lines outlined above and the first part of (23) is proved. The second part of (23) also follows by employing the auxiliary function $g_{Λ, 1} : I \to R$ , $g_{Λ, 1} (t) = Λ t log t - Φ (t)$ ; and this completes the proof. □
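A small numerical check of (21) and (23) is sketched below for the assumed example $Φ (t) = t^{2}$ on $I = [0.5, 2]$ , where $t^{2} Φ^{''} (t) = 2 t^{2} \in [0.5, 8]$ and $t Φ^{''} (t) = 2 t \in [1, 4]$ . The function, interval and sample vectors are illustrative choices, not data from the paper.

```python
import math


def jensen_divergence(phi, x, y):
    """Generalised Jensen divergence J_{n,Phi}(x, y) of equation (1)."""
    return sum(0.5 * (phi(xi) + phi(yi)) - phi(0.5 * (xi + yi))
               for xi, yi in zip(x, y))


# Hypothetical sample points in I^n with I = [0.5, 2].
x = [0.5, 0.8, 1.1, 1.7]
y = [2.0, 1.0, 0.6, 1.9]

j_phi = jensen_divergence(lambda t: t * t, x, y)
# J_{n,0} of (17) is the Jensen divergence of -log t;
# J_{n,1} of (16) is the Jensen divergence of t log t.
j0 = jensen_divergence(lambda t: -math.log(t), x, y)
j1 = jensen_divergence(lambda t: t * math.log(t), x, y)

omega, Omega = 0.5, 8.0   # bounds for t^2 Phi''(t) = 2 t^2 on [0.5, 2]
lam, Lam = 1.0, 4.0       # bounds for t Phi''(t) = 2 t on [0.5, 2]

bounds_21 = (omega * j0, Omega * j0)
bounds_23 = (lam * j1, Lam * j1)
```

Both pairs should enclose `j_phi`; here the pair from (23) is expected to be the narrower one, since $2 t$ varies by a factor of 4 over the interval while $2 t^{2}$ varies by a factor of 16.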

5 Applications to some elementary functions

We consider the approximations mentioned in Section 4 for some elementary functions.

We consider the function $Φ (t) = e^{- t}$ for $t \in [a, b] \subset [0, 1]$ and have the following bounds for all $x, y \in {[a, b]}^{n}$ :

\begin{aligned} a^{2} e^{- a} J_{n, 0} (x, y) \leq J_{n, Φ} (x, y) \leq b^{2} e^{- b} J_{n, 0} (x, y), \\ a e^{- a} J_{n, 1} (x, y) \leq J_{n, Φ} (x, y) \leq b e^{- b} J_{n, 1} (x, y), \\ \frac{1}{2} e^{- b} J_{n, 2} (x, y) \leq J_{n, Φ} (x, y) \leq \frac{1}{2} e^{- a} J_{n, 2} (x, y) . \end{aligned}

In what follows, we apply these bounds to the above function on the interval $[0.1, 1]$ , where $x = (0.2, 0.25, 0.3, \dots, 1)$ and $y = (1, \dots, 1)$ (cf. Figure 1).

Discussion In this example, the best lower approximation (amongst the three) is given by $\frac{1}{2} e^{- 1} J_{n, 2} (x, y)$ , and the best upper approximation is given by $e^{- 1} J_{n, 1} (x, y)$ , where $x = (0.2, 0.25, 0.3, \dots, 1)$ and $y = (1, \dots, 1)$ . However, it remains an open question whether this is true in general.
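The comparison described above can be recomputed with a few lines of Python. The sketch below is an illustrative reconstruction under the stated data ( $Φ (t) = e^{- t}$ , $[a, b] = [0.1, 1]$ , $x = (0.2, 0.25, \dots, 1)$ , $y = (1, \dots, 1)$ ); the helper and variable names are assumptions, and the code is not the authors' original computation.

```python
import math


def jensen_divergence(phi, x, y):
    """Generalised Jensen divergence J_{n,Phi}(x, y) of equation (1)."""
    return sum(0.5 * (phi(xi) + phi(yi)) - phi(0.5 * (xi + yi))
               for xi, yi in zip(x, y))


a, b = 0.1, 1.0
x = [(20 + 5 * k) / 100 for k in range(17)]   # 0.2, 0.25, ..., 1
y = [1.0] * len(x)

j_phi = jensen_divergence(lambda t: math.exp(-t), x, y)
j0 = jensen_divergence(lambda t: -math.log(t), x, y)      # J_{n,0}
j1 = jensen_divergence(lambda t: t * math.log(t), x, y)   # J_{n,1}
j2 = jensen_divergence(lambda t: t * t, x, y)             # J_{n,2}

# The three lower and three upper bounds stated for Phi(t) = exp(-t).
lower = {"J_0": a * a * math.exp(-a) * j0,
         "J_1": a * math.exp(-a) * j1,
         "J_2": 0.5 * math.exp(-b) * j2}
upper = {"J_0": b * b * math.exp(-b) * j0,
         "J_1": b * math.exp(-b) * j1,
         "J_2": 0.5 * math.exp(-a) * j2}

best_lower = max(lower, key=lower.get)   # largest lower bound
best_upper = min(upper, key=upper.get)   # smallest upper bound
```

For this data set, `best_lower` and `best_upper` identify the $J_{n, 2}$ lower bound and the $J_{n, 1}$ upper bound respectively, in agreement with the discussion above.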

We consider the Havrda-Charvát function

Φ_{α} (t) = \begin{cases} {(α - 1)}^{- 1} (t^{α} - t) & \text{if } α \neq 1, \\ t \log (t) & \text{if } α = 1 . \end{cases}

For $α = 1$ , we have the following bounds for all $x, y \in {[a, b]}^{n}$ :

\begin{aligned} & a J_{n, 0} (x, y) \leq J_{n, Φ} (x, y) \leq b J_{n, 0} (x, y) \quad \text{for } [a, b] \subset [0, \infty); \\ & a J_{n, 1} (x, y) \leq J_{n, Φ} (x, y) \leq b J_{n, 1} (x, y) \quad \text{for } 0 \leq a \leq 1 \leq b < \infty . \end{aligned}

We have the following bounds for all $x, y \in {[a, b]}^{n}$ :

\begin{aligned} & α a^{α} J_{n, 0} (x, y) \leq J_{n, Φ} (x, y) \leq α b^{α} J_{n, 0} (x, y) \quad \text{for } α \geq 0, α \neq 1 \text{ and } [a, b] \subset [0, \infty), \\ & α a^{α - 1} J_{n, 1} (x, y) \leq J_{n, Φ} (x, y) \leq α b^{α - 1} J_{n, 1} (x, y) \quad \text{for } α \geq 1 \text{ and } [a, b] \subset [0, \infty) . \end{aligned}

In Figure 2, we apply these bounds to the above function on the interval $[0.1, 1]$ , where $x = (0.2, 0.201, 0.202, \dots, 1)$ , $y = (1, \dots, 1)$ , $α = 3 / 2$ .

We also have, for all $x, y \in {[a, b]}^{n}$ ,

γ (p - 1) J_{n, p} (x, y) \leq J_{n, Φ} (x, y) \leq Γ (p - 1) J_{n, p} (x, y)

for $p \in (- \infty, 0) \cup (1, \infty)$ , where

Γ = α \frac{b^{α - p}}{p (p - 1)} and γ = α \frac{a^{α - p}}{p (p - 1)}

for $α \geq p$ and $[a, b] \subset [0, \infty)$ .

In Figure 3, we apply these bounds to the above function on the interval $[0.1, 1]$ , where $x = (0.2, 0.201, 0.202, \dots, 1)$ , $y = (1, \dots, 1)$ , $α = 3$ and $p = 3 / 2, 2$ .

Similarly, we have, for all $x, y \in {[a, b]}^{n}$ ,

δ (q - 1) J_{n, q} (x, y) \geq J_{n, Φ} (x, y) \geq Δ (q - 1) J_{n, q} (x, y)

for $q \in (0, 1)$ and $α > 1$ , where

Δ = α \frac{a^{α - q}}{q (q - 1)} and δ = α \frac{b^{α - q}}{q (q - 1)}

for $[a, b] \subset [0, \infty)$ . In Figure 4, we apply these bounds to the above function on the interval $[0.1, 1]$ , where $x = (0.2, 0.201, 0.202, \dots, 1)$ , $y = (1, \dots, 1)$ , $q = 1 / 2$ and $α = 3$ .

Discussion In this example, the best lower approximation (amongst the five) is given by $2 {(0.1)}^{3 / 2} J_{n, 3 / 2}$ , and the best upper approximation is given by $(3 / 2) J_{n, 1} (x, y)$ , where $x = (0.2, 0.201, 0.202, \dots, 1)$ , $y = (1, \dots, 1)$ . However, it remains an open question whether this is true in general.
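The bounds of Theorem 9 for the Havrda-Charvát function with $α = 3$ and $p = 3 / 2$ can be evaluated in the same way. The sketch below is an illustrative check under the stated data ( $[a, b] = [0.1, 1]$ , $x = (0.2, 0.201, \dots, 1)$ , $y = (1, \dots, 1)$ ), with assumed helper names. Here $γ (p - 1) = α a^{α - p} / p = 2 {(0.1)}^{3 / 2}$ and $Γ (p - 1) = α b^{α - p} / p = 2$ .

```python
def jensen_divergence(phi, x, y):
    """Generalised Jensen divergence J_{n,Phi}(x, y) of equation (1)."""
    return sum(0.5 * (phi(xi) + phi(yi)) - phi(0.5 * (xi + yi))
               for xi, yi in zip(x, y))


def jensen_alpha(alpha, x, y):
    """J_{n,alpha}(x, y) of (15), for alpha != 1."""
    return jensen_divergence(lambda t: t ** alpha, x, y) / (alpha - 1.0)


alpha, p = 3.0, 1.5
a, b = 0.1, 1.0
x = [(200 + k) / 1000 for k in range(801)]   # 0.2, 0.201, ..., 1
y = [1.0] * len(x)

# Havrda-Charvat function of degree alpha = 3: Phi(t) = (t^3 - t) / 2.
j_phi = jensen_divergence(lambda t: (t ** alpha - t) / (alpha - 1.0), x, y)

j_p = jensen_alpha(p, x, y)
lower = (alpha * a ** (alpha - p) / p) * j_p   # gamma (p - 1) J_{n,p} = 2 (0.1)^{3/2} J_{n,3/2}
upper = (alpha * b ** (alpha - p) / p) * j_p   # Gamma (p - 1) J_{n,p} = 2 J_{n,3/2}
```

Because the linear term of $Φ_{α}$ cancels in (1), `j_phi` coincides with $J_{n, 3} (x, y)$ up to rounding.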

References

[1] Dragomir SS: Some reverses of the Jensen inequality with applications. RGMIA Research Report Collection (Online) 2011, 14: Article ID v14a72.
[2] Burbea J, Rao CR: On the convexity of some divergence measures based on entropy functions. IEEE Trans. Inf. Theory 1982, 28(3):489–495. 10.1109/TIT.1982.1056497
[3] Havrda ME, Charvát F: Quantification method of classification processes: concept of structural α-entropy. Kybernetika 1967, 3: 30–35.
[4] Lin J: Divergence measures based on the Shannon entropy. IEEE Trans. Inf. Theory 1991, 37(1):145–151. 10.1109/18.61115
[5] Grosse I, Bernaola-Galvan P, Carpena P, Roman-Roldan R, Oliver J, Stanley HE: Analysis of symbolic sequences using the Jensen-Shannon divergence. Phys. Rev. E, Stat. Nonlinear Soft Matter Phys. 2002, 65(4): Article ID 041905. 10.1103/PhysRevE.65.041905
[6] Kullback S, Leibler RA: On information and sufficiency. Ann. Math. Stat. 1951, 22: 79–86. 10.1214/aoms/1177729694
[7] Csiszar I: Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hung. 1967, 2: 299–318.
[8] Shannon CE: A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27: 379–423, 623–656.
[9] Menendez ML, Pardo JA, Pardo L: Some statistical applications of generalized Jensen difference divergence measures for fuzzy information systems. Fuzzy Sets Syst. 1992, 52: 169–180. 10.1016/0165-0114(92)90047-8
[10] Arvey AJ, Azad RK, Raval A, Lawrence JG: Detection of genomic islands via segmental genome heterogeneity. Nucleic Acids Res. 2009, 37(16):5255–5266. 10.1093/nar/gkp576
[11] Gómez RM, Rosso OA, Berretta R, Moscato P: Uncovering molecular biomarkers that correlate cognitive decline with the changes of Hippocampus’ gene expression profiles in Alzheimer’s disease. PLoS ONE 2010, 5(4): Article ID e10153. 10.1371/journal.pone.0010153
[12] Dragomir SS, Dragomir NM, Sherwell D: Sharp bounds for the Jensen divergence with applications. RGMIA Research Report Collection (Online) 2011, 14: Article ID v14a47.
[13] Bullen PS: Handbook of Means and Their Inequalities. Mathematics and Its Applications 560. Kluwer Academic, Dordrecht; 2003.


Author information

Authors and Affiliations

School of Computational and Applied Mathematics, University of the Witwatersrand, Private Bag 3, Wits 2050, Johannesburg, South Africa
Eder Kikianty, Sever S Dragomir, Isia T Dintoe & David Sherwell
School of Engineering and Science, Victoria University, P.O. Box 14428, Melbourne, Victoria, 8001, Australia
Sever S Dragomir

Corresponding author

Correspondence to Eder Kikianty.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

EK, SSD, ITD and DS contributed equally in all stages of writing the paper. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Kikianty, E., Dragomir, S.S., Dintoe, I.T. et al. Approximations of Jensen divergence for twice differentiable functions. J Inequal Appl 2013, 267 (2013). https://doi.org/10.1186/1029-242X-2013-267

Received: 16 November 2012
Accepted: 09 May 2013
Published: 28 May 2013
DOI: https://doi.org/10.1186/1029-242X-2013-267
