RETRACTED ARTICLE: Generalization of the Levinson inequality with applications to information theory
Journal of Inequalities and Applications volume 2019, Article number: 212 (2019)
Abstract
In this paper, Levinson’s inequality for 3-convex functions is generalized by using two Green’s functions. New Čebyšev-, Grüss-, and Ostrowski-type bounds are found for the functionals involving data points of two types. Moreover, the main results are applied to information theory via the f-divergence, Rényi divergence, Rényi entropy, Shannon entropy, and the Zipf–Mandelbrot law.
1 Introduction and preliminaries
In [12], Levinson generalized Ky Fan’s inequality to 3-convex functions as follows:
Theorem A
Let \(f :I=(0, 2\alpha ) \rightarrow \mathbb{R}\) with \(f^{(3)}(t) \geq 0\). Let \(x_{k} \in (0, \alpha )\) and \(p_{k}>0\). Then
where
Working with divided differences, the assumption of differentiability on f can be weakened.
In [18], Popoviciu noted that (1) is valid on \((0, 2\alpha )\) for 3-convex functions, while in [2], Bullen gave a different proof of Popoviciu’s result and also the converse of (1).
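As a quick numerical illustration of Theorem A, the sketch below checks the classical form of Levinson's inequality, namely that the weighted Jensen difference of the points \(x_{k}\) does not exceed that of the reflected points \(2\alpha - x_{k}\) when \(f^{(3)} \geq 0\). The helper names `jensen_gap` and `levinson_gap`, the sample points, and the weights are assumptions chosen for illustration; they do not come from the paper.

```python
import math

def jensen_gap(f, points, weights):
    """Weighted Jensen difference: sum(p_k f(x_k))/P - f(sum(p_k x_k)/P)."""
    total = sum(weights)
    mean = sum(p * x for p, x in zip(weights, points)) / total
    return sum(p * f(x) for p, x in zip(weights, points)) / total - f(mean)

def levinson_gap(f, xs, weights, alpha):
    """Jensen difference at the reflected points 2*alpha - x_k minus the one at x_k.

    Levinson's inequality says this is nonnegative when f''' >= 0 on (0, 2*alpha)
    and every x_k lies in (0, alpha).
    """
    ys = [2 * alpha - x for x in xs]
    return jensen_gap(f, ys, weights) - jensen_gap(f, xs, weights)

# Hypothetical sample data: points in (0, alpha) with alpha = 1 and positive weights.
xs = [0.2, 0.5, 0.9]
weights = [1.0, 2.0, 3.0]

gap_exp = levinson_gap(math.exp, xs, weights, alpha=1.0)            # f''' = exp > 0
gap_square = levinson_gap(lambda t: t * t, xs, weights, alpha=1.0)  # f''' = 0
print(gap_exp, gap_square)
```

For a quadratic function the third derivative vanishes, so both Jensen differences equal the same weighted variance and the gap is zero up to rounding.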
Theorem B
(a) Let \(f:I=[a, b] \rightarrow \mathbb{R}\) be a 3-convex function and \(x_{n}, y_{n} \in [a, b]\) for \(n=1, 2, \ldots , k \) such that
and \(p_{n}>0\). Then
where
(b) If f is continuous, \(p_{\rho }>0\), and (4) holds for all \(x_{\rho }\), \(y_{\rho }\) satisfying (3), then f is 3-convex.
In [17], Pečarić weakened assumption (3) and proved that inequality (1) still holds, i.e., the following result holds:
Theorem C
Let \(f:I=[a, b] \rightarrow \mathbb{R}\) be a 3-convex function, \(p_{k}>0\), and let for \(k=1, \ldots , n\), \(x_{k}\), \(y_{k}\) be such that \(x_{k}+y_{k}=2\breve{c}\), \(x_{k}+x_{n-k+1} \leq 2\breve{c}\) and \(\frac{p_{k}x_{k}+p_{n-k+1}x_{n-k+1}}{p_{k}+p _{n-k+1}} \leq \breve{c}\). Then (4) holds.
In [15], Mercer made a notable contribution by replacing the condition of symmetric distribution of the points \(x_{i}\) and \(y_{i}\) with the weaker condition of symmetric variances of the points \(x_{i}\) and \(y_{i}\).
Theorem D
Let f be a 3-convex function on \([a, b]\) and let \(p_{k}\) be positive with \(\sum_{k=1}^{n}p_{k}=1\). Also let \(x_{k}\), \(y_{k}\) satisfy (3) and
Then (1) holds.
On the other hand, the error function \(e_{\mathcal{F}}(t)\) can be represented in terms of the Green’s function \(G_{\mathcal{F}, n}(t, s)\) of the boundary value problem
where
The following result holds in [1]:
Theorem E
Let \(f \in C^{n}[a, b]\), and let \(P_{F}\) be its ‘two-point right focal’ interpolating polynomial. Then, for \(a \leq a_{1} < a_{2} \leq b\) and \(0 \leq p \leq n-2\),
where \(G_{F, n}(t, s)\) is the Green’s function defined by (7).
Let \(f \in C^{n}[a, b]\), and let \(P_{F}\) be its ‘two-point right focal’ interpolating polynomial for \(a \leq a_{1} < a_{2} \leq b\). Then, for \(n=3\) and \(p=0\), (8) becomes
where
For \(n=3\) and \(p=1\), (8) becomes
where
The paper is organized as follows. In Sect. 2, Levinson’s inequality for 3-convex functions is generalized by using the two Green’s functions defined by (10) and (12). In Sect. 3, new Čebyšev-, Grüss-, and Ostrowski-type bounds are found for the functionals involving data points of two types. In Sect. 4, the main results are applied to information theory via the f-divergence, Rényi divergence, Rényi entropy, Shannon entropy, and the Zipf–Mandelbrot law.
2 Main results
First we give an identity involving the Jensen differences of two different sets of data points. Then we give an equivalent form of the identity by using the Green’s functions defined by (10) and (12).
Theorem 1
Let \(f\in C^{3}[\zeta _{1}, \zeta _{2}]\) be such that \(f: I= [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\), and let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\), \((q_{1}, \ldots , q_{m}) \in \mathbb{R}^{m}\) be such that \(\sum_{\rho =1}^{n}p_{\rho }=1\) and \(\sum_{\varrho =1}^{m}q_{\varrho }=1\). Also let \(x_{\rho }\), \(y_{\varrho }\), \(\sum_{\rho =1}^{n}p_{\rho }x_{\rho }\), \(\sum_{\varrho =1}^{m}q_{\varrho }y_{\varrho } \in I\). Then
where
and
for \(G_{k}(\cdot, s)\) (\(k=1, 2\)) defined in (10) and (12) respectively.
Proof
(i) For \(k=1\).
After rearranging, we have (13).
(ii) For \(k=2\)
Using (11) in (14) and following similar steps as in the proof of (i), we get (13). □
Corollary 1
Let \(f\in C^{3}[0, 2\alpha ]\) be such that \(f: I= [0, 2\alpha ] \rightarrow \mathbb{R}\), \(x_{1}, \ldots , x_{n} \in (0, \alpha )\), and let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\) be such that \(\sum_{\rho =1}^{n}p_{\rho }=1\). Also let \(x_{\rho }\), \(\sum_{\rho =1}^{n}p_{\rho }(2\alpha -x_{\rho })\), \(\sum_{\rho =1}^{n}p_{\rho }x_{\rho } \in I\). Then
where \(J(f(\cdot))\) and \(J(G(\cdot, s))\) are defined in (14) and (15) respectively.
Proof
Choosing \(I=[0, 2\alpha ]\), \(y_{\varrho }=(2\alpha -x_{\rho })\), \(x_{1}, \ldots , x_{n} \in (0, \alpha )\), \(p_{\rho }=q_{\varrho }\), and \(m=n\) in Theorem 1, after simplification we get (16). □
Theorem 2
Let \(f: I= [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) be a 3-convex function. Also let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\), \((q_{1}, \ldots , q_{m})\in \mathbb{R}^{m}\) be such that \(\sum_{\rho =1}^{n}p_{\rho }=1\) and \(\sum_{\varrho =1}^{m}q_{\varrho }=1\), and let \(x_{\rho }\), \(y_{\varrho }\), \(\sum_{\rho =1}^{n}p_{\rho }x_{\rho }\), \(\sum_{\varrho =1}^{m}q_{\varrho }y_{\varrho } \in I\).
If
then the following statements are equivalent:
For \(f \in C^{3}[\zeta _{1}, \zeta _{2}]\),
For all \(s \in I\),
where \(G_{k}(\cdot, s)\) are defined by (10) and (12) for \(k=1, 2\) respectively.
Moreover, the inequality in (18) is reversed if and only if the inequality in (19) is reversed.
Proof
(18) ⇒ (19): Let (18) be valid. Then, as the function \(G_{k}(\cdot, s)\) (\(s \in I\)) is also continuous and 3-convex, it follows that also for this function (18) holds, i.e., (19) is valid.
(19) ⇒ (18): If f is 3-convex, then without loss of generality we can suppose that there exists the third derivative of f. Let \(f \in C^{3}[\zeta _{1}, \zeta _{2}]\) be a 3-convex function and (19) hold. Then we can represent function f in the form (9). Now, by means of some simple calculations, we can write
By the 3-convexity of f, we have \(f^{(3)}(s) \geq 0\) for all \(s \in I\). Hence, if (19) is valid for every \(s \in I\), then (18) is valid for every 3-convex function \(f:I \rightarrow \mathbb{R}\) with \(f \in C^{3}[\zeta _{1}, \zeta _{2}]\). □
Remark 1
If the expression
and \(f^{(2)}(\zeta _{2})\) have different signs in (17), then inequalities (18) and (19) are reversed.
Next we have the results about generalization of Bullen-type inequality (for real weights) given in [2] (see also [16] and [11]).
Corollary 2
Let \(f: I= [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) be a 3-convex function with \(f \in C^{3}[\zeta _{1}, \zeta _{2}]\), and let \(x_{1}, \ldots , x_{n} \), \(y_{1}, \ldots , y_{m} \in I\) be such that
and
Also let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\), \((q_{1}, \ldots , q_{m})\in \mathbb{R}^{m}\) be such that \(\sum_{\rho =1}^{n}p_{\rho }=1\) and \(\sum_{\varrho =1}^{m}q_{\varrho }=1\), and let \(x_{\rho }\), \(y_{\varrho }\), \(\sum_{\rho =1}^{n}p_{\rho }x_{\rho }\), \(\sum_{\varrho =1}^{m}q_{\varrho }y_{\varrho } \in I\). If (17) holds, then (18) and (19) are equivalent.
Proof
By choosing \(x_{\rho }\) and \(y_{\varrho }\) such that conditions (20) and (21) hold in Theorem 2, we get the required result. □
Remark 2
If \(p_{\rho }=q_{\varrho }\) are positive and \(x_{\rho }\), \(y_{\varrho }\) satisfy (20) and (21), then inequality (18) reduces to Bullen’s inequality given in [16, p. 32, Theorem 2] for \(m=n\).
Next we have a generalized form (for real weights) of Bullen-type inequality given in [17] (see also [16]).
Corollary 3
Let \(f: I= [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) be a 3-convex function with \(f \in C^{3}[\zeta _{1}, \zeta _{2}]\), and let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\), \((q_{1}, \ldots , q_{m}) \in \mathbb{R}^{m}\) be such that \(\sum_{\rho =1}^{n}p_{\rho }=1\) and \(\sum_{\varrho =1}^{m}q_{\varrho }=1\). Also let \(x_{1}, \ldots , x_{n} \) and \(y_{1}, \ldots , y_{m} \in I\) be such that \(x_{\rho }+y_{\varrho }=2c\), and for \(\rho =1, \ldots , n\), \(x_{\rho }+x_{n-\rho +1} \leq 2c\) and \(\frac{p_{\rho }x_{\rho }+p_{n-\rho +1}x_{n-\rho +1}}{p_{\rho }+p_{n-\rho +1}} \leq c\). If (17) holds, then (18) and (19) are equivalent.
Proof
Using Theorem 2 with the conditions given in the statement, we get the required result. □
Remark 3
In Theorem 2, if \(m=n\), \(p_{\rho }=q_{\varrho }\) are positive, \(x_{\rho }+y_{\varrho }=2c\), \(x_{\rho }+x_{n-\rho +1} \leq 2c\), and \(\frac{p_{\rho }x_{\rho }+p_{n-\rho +1}x_{n-\rho +1}}{p_{\rho }+p_{n-\rho +1}} \leq c\), then (18) reduces to a generalized form of Bullen’s inequality given in [16, p. 32, Theorem 4].
In [15], Mercer made a notable contribution by replacing condition (21) of symmetric distribution of the points \(x_{\rho }\) and \(y_{\varrho }\) with symmetric variances of the points \(x_{\rho }\) and \(y_{\varrho }\) for \(\rho =1, \ldots , n\) and \(\varrho =1, \ldots , m\).
So in the next result we use Mercer’s condition (6), but for \(\rho =\varrho \) and \(m=n\).
Corollary 4
Let \(f: I= [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) be a 3-convex function with \(f \in C^{3}[\zeta _{1}, \zeta _{2}]\), and let \(p_{\rho }\), \(q_{\rho }\) be positive such that \(\sum_{\rho =1}^{n}p_{\rho }=1\) and \(\sum_{\rho =1}^{n}q_{\rho }=1\). Also let \(x_{\rho }\), \(y_{\rho }\) satisfy (20) and
If (17) holds, then (18) and (19) are equivalent.
Proof
For positive weights, using (6) and (20) in Theorem 2, we get the required result. □
Next we have the results that lean on the generalization of Levinson-type inequality given in [12] (see also [16]).
Corollary 5
Let \(f: I= [0, 2\alpha ] \rightarrow \mathbb{R}\) be a 3-convex function with \(f \in C^{3}[0, 2\alpha ]\), \(x_{1}, \ldots , x_{n} \in (0, \alpha )\), \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\), and \(\sum_{\rho =1}^{n}p_{\rho }=1\). Also let \(x_{\rho }\), \(\sum_{\rho =1}^{n}p_{\rho }(2\alpha -x_{\rho })\), \(\sum_{\rho =1}^{n}p_{\rho }x_{\rho } \in I\). Then the following are equivalent:
For all \(s \in I\),
where \(G_{k}(\cdot, s)\) is defined in (10) and (12) for \(k=1, 2\) respectively.
Proof
Taking \(I=[0, 2\alpha ]\), \(x_{1}, \ldots , x_{n} \in (0, \alpha )\), \(p_{\rho }=q_{\varrho }\), \(m=n\), and \(y_{\varrho }=2\alpha -x_{\rho }\) in Theorem 2 with \(0 \leq \zeta _{1} < \zeta _{2} \leq 2\alpha \), we get the required result. □
Remark 4
In Corollary 5, if \(p_{\rho }\) are positive, then inequality (23) reduces to Levinson’s inequality given in [16, p. 32, Theorem 1].
3 New bounds for Levinson-type functionals
Consider the Čebyšev functional for two Lebesgue integrable functions \(f_{1}, f_{2}: [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\)
where the integrals are assumed to exist.
Theorem F
([3])
Let \(f_{1} : [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) be a Lebesgue integrable function and \(f_{2} : [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) be an absolutely continuous function with \((\cdot -\zeta _{1})(\zeta _{2}-\cdot )[f'_{2}]^{2} \in L[\zeta _{1}, \zeta _{2}]\). Then
The constant \(\frac{1}{\sqrt{2}}\) is the best possible.
Theorem G
([3])
Let \(f_{1}: [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) be absolutely continuous with \(f^{\prime }_{1} \in L_{\infty }[\zeta _{1}, \zeta _{2}]\), and let \(f_{2}: [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) be monotonic nondecreasing on \([\zeta _{1}, \zeta _{2}]\). Then
The constant \(\frac{1}{2}\) is the best possible.
In the next result we construct the Čebyšev-type bound for our functional defined in (5).
Theorem 3
Let \(f\in C^{3}[\zeta _{1}, \zeta _{2}]\) be such that \(f: I= [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) and \(f^{(3)}(\cdot)\) is absolutely continuous with \((\cdot-\zeta _{1})(\zeta _{2} -\cdot)[f^{(4)}]^{2} \in L[\zeta _{1}, \zeta _{2}]\). Also let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\), \((q_{1}, \ldots , q_{m}) \in \mathbb{R}^{m}\) be such that \(\sum_{\rho =1}^{n}p_{\rho }=1\), \(\sum_{\varrho =1}^{m}q_{\varrho }=1\), \(x_{\rho }\), \(y_{\varrho }\), \(\sum_{\rho =1}^{n}p_{\rho }x_{\rho }\), \(\sum_{\varrho =1}^{m}q_{\varrho }y_{\varrho } \in I\). Then
where \(J(f(\cdot))\), \(J(G_{k}(\cdot, s))\) are defined in (14) and (15) respectively, and the remainder \(\mathcal{R}_{3}(\zeta _{1}, \zeta _{2}; f)\) satisfies the bound
for \(G_{k}(\cdot, s)\) (\(k=1, 2\)) defined in (10) and (12) respectively.
Proof
Setting \(f_{1} \mapsto J(G_{k}(\cdot, s))\) and \(f_{2} \mapsto f^{(3)}\) in Theorem F, we get
Multiplying both sides of the above inequality by \((\zeta _{2} - \zeta _{1})\) and using estimation (29), we get
Using identity (13), we get (28). □
In the next result the bounds of Grüss-type inequalities are estimated.
Theorem 4
Let \(f\in C^{3}[\zeta _{1}, \zeta _{2}]\) be such that \(f: I= [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\), \(f^{(3)}(\cdot)\) is absolutely continuous, and \(f^{(4)}(\cdot) \geq 0\) a.e. on \([\zeta _{1}, \zeta _{2}]\). Also let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\), \((q_{1}, \ldots , q_{m}) \in \mathbb{R}^{m}\) be such that \(\sum_{\rho =1}^{n}p_{\rho }=1\), \(\sum_{\varrho =1}^{m}q_{\varrho }=1\), \(x_{\rho }\), \(y_{\varrho }\), \(\sum_{\rho =1}^{n}p_{\rho }x_{\rho }\), \(\sum_{\varrho =1}^{m}q_{\varrho }y_{\varrho } \in I\). Then identity (28) holds, where the remainder satisfies the estimation
Proof
Setting \(f_{1} \mapsto J(G_{k}(\cdot, s))\) and \(f_{2} \mapsto f^{(3)}\) in Theorem G, we get
Since
using (13), (31), and (32), we have (28). □
Next we give Ostrowski-type bounds for the newly constructed functional defined in (5).
Theorem 5
Let \(f\in C^{3}[\zeta _{1}, \zeta _{2}]\) be such that \(f: I= [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) and \(f^{(2)}(\cdot)\) is absolutely continuous. Also let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{n}\), \((q_{1}, \ldots , q_{m}) \in \mathbb{R}^{m}\) be such that \(\sum_{\rho =1}^{n}p_{\rho }=1\), \(\sum_{\varrho =1}^{m}q_{\varrho }=1\), \(x_{\rho }\), \(y_{\varrho }\), \(\sum_{\rho =1}^{n}p_{\rho }x_{\rho }\), \(\sum_{\varrho =1}^{m}q_{\varrho }y_{\varrho } \in I\). Also let \((r, s)\) be a pair of conjugate exponents, that is, \(1 \leq r, s \leq \infty \), \(\frac{1}{r}+\frac{1}{s}=1\). If \(|f^{(3)}|^{r}: [\zeta _{1}, \zeta _{2}] \rightarrow \mathbb{R}\) is a Riemann integrable function, then
Proof
Rearrange identity (13) in the following way:
Applying the classical Hölder inequality to the right-hand side of (34) yields (33). □
4 Application to information theory
The notion of Shannon entropy plays a central role in information theory and is sometimes referred to as a measure of uncertainty. The entropy of a random variable is defined in terms of its probability distribution and can be shown to be a good measure of randomness or uncertainty. Shannon entropy makes it possible to estimate the average minimum number of bits needed to encode a string of symbols, based on the alphabet size and the frequency of the symbols.
Divergences between probability distributions have been introduced to measure the difference between them. Many kinds of divergences exist, for instance the f-divergence (in particular, the Kullback–Leibler divergence, Hellinger distance, and total variation distance), the Rényi divergence, the Jensen–Shannon divergence, and so forth (see [13, 21]). There are many papers dealing with inequalities and entropies, see, e.g., [8, 10, 20] and the references therein. Jensen’s inequality plays a crucial role in some of these inequalities. However, Jensen’s inequality deals with one type of data points, whereas Levinson’s inequality deals with two types of data points.
Zipf’s law is one of the fundamental laws in information science, and it has been used in linguistics. In 1932, George Zipf found that one can count how many times each word appears in a text. If the words are ranked (r) according to their frequency of occurrence \((f)\), then the product of these two numbers is a constant \((C): C = r \times f\). Apart from its use in information science and linguistics, Zipf’s law is used for city populations, solar flare intensity, website traffic, earthquake magnitudes, the sizes of moon craters, and so forth. In economics this distribution is known as the Pareto law, which analyzes the distribution of the wealthiest members of the community [6, p. 125]. These two laws are equivalent in the mathematical sense, yet they are applied in different contexts [7, p. 294].
4.1 Csiszár divergence
In [4, 5] Csiszár gave the following definition:
Definition 1
Let f be a convex function from \(\mathbb{R}^{+}\) to \(\mathbb{R}^{+}\). Let \(\tilde{\mathbf{r}}, \tilde{\mathbf{k}} \in \mathbb{R}_{+}^{n}\) be such that \(\sum_{s=1}^{n}r_{s}=1\) and \(\sum_{s=1}^{n}k_{s}=1\). Then an f-divergence functional is defined by
By defining the following:
he stated that nonnegative probability distributions can also be used.
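To make Definition 1 concrete, the sketch below evaluates the standard Csiszár functional \(\sum_{s} k_{s} f(r_{s}/k_{s})\) for strictly positive distributions. This form of the functional, the function name `f_divergence`, and the sample vectors are assumptions made for illustration; the conventions for zero probabilities are deliberately left out.

```python
import math

def f_divergence(f, r, k):
    """Csiszar-type functional sum_s k_s * f(r_s / k_s) for strictly positive r and k."""
    if len(r) != len(k):
        raise ValueError("r and k must have the same length")
    if any(v <= 0 for v in list(r) + list(k)):
        raise ValueError("this sketch assumes strictly positive entries")
    return sum(ks * f(rs / ks) for rs, ks in zip(r, k))

# Hypothetical sample distributions.
r = [0.5, 0.3, 0.2]
k = [0.4, 0.4, 0.2]

# f(t) = t log t gives the Kullback-Leibler divergence of r from k.
kl = f_divergence(lambda t: t * math.log(t), r, k)
# f(t) = |t - 1| gives the sum of absolute differences |r_s - k_s|.
abs_diff = f_divergence(lambda t: abs(t - 1.0), r, k)
print(kl, abs_diff)
```

Different choices of the convex function f recover the divergences listed at the start of this section, which is why a single inequality for the functional covers all of them.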
Using the definition of the f-divergence functional, Horváth et al. [9] gave the following functional:
Definition 2
Let I be an interval contained in \(\mathbb{R}\) and \(f: I \rightarrow \mathbb{R}\) be a function. Also let \(\tilde{\mathbf{r}}=(r_{1}, \ldots , r_{n})\in \mathbb{R}^{n}\) and \(\tilde{\mathbf{k}}=(k_{1}, \ldots , k_{n})\in (0, \infty )^{n}\) be such that
Then
We apply a generalized form of Bullen’s inequality (18) (for positive weights) to \(\hat{I}_{f}(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})\).
Let us denote the following set of assumptions by \(\mathcal{G}\):
Let \(f: I= [\alpha , \beta ] \rightarrow \mathbb{R}\) be a 3-convex function. Also let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}^{+}\), \((q_{1}, \ldots , q_{m})\in \mathbb{R}^{+}\) be such that \(\sum_{s=1} ^{n}p_{s}=1\) and \(\sum_{s=1}^{m}q_{s}=1\) and \(x_{s}\), \(y_{s}\), \(\sum_{s=1} ^{n}p_{s}x_{s}\), \(\sum_{s=1}^{m}q_{s}y_{s} \in I\).
Theorem 6
Assume \(\boldsymbol{\mathcal{G}}\).
Let \(\tilde{\mathbf{r}}= (r_{1}, \ldots , r_{n} )\), \(\tilde{\mathbf{k}}= (k_{1}, \ldots , k_{n} )\) be in \((0, \infty )^{n}\), and \(\tilde{\mathbf{w}}= (w_{1}, \ldots , w_{m} )\), \(\tilde{\mathbf{t}}= (t_{1}, \ldots , t_{m} )\) be in \((0, \infty )^{m}\) such that
and
Then
(ii) If \(x \mapsto xf(x)\) (\(x \in [a, b]\)) is 3-convex, then
where
and
Proof
(i) Taking \(p_{s} = \frac{k_{s}}{\sum_{s=1}^{n}k_{s}}\), \(x_{s} = \frac{r_{s}}{k_{s}}\), \(q_{u} = \frac{t_{u}}{\sum_{u=1}^{m}t_{u}}\), and \(y_{u} = \frac{w_{u}}{t_{u}}\) in inequality (18) (for positive weights), we have
Multiplying (37) by the sum \(\sum_{s=1}^{n}k_{s}\), we get
Now again multiplying (38) by the sum \(\sum_{u=1}^{m}t_{u}\), we get
If we divide the above inequality by the product \(\sum_{s=1}^{n}k_{s} \sum_{u=1}^{m}t_{u}\), we get (35).
(ii) Using \(f:=\mathrm{id}\, f\) (where “id” is the identity function) in (18) (for positive weights), we have
Using the same steps as in the proof of (i), we get (36). □
4.2 Shannon entropy
Definition 3
(see [9])
The Shannon entropy of a positive probability distribution \(\tilde{\mathbf{r}}=(r_{1}, \ldots , r_{n})\) is defined by
Corollary 6
Assume \(\boldsymbol{\mathcal{G}}\).
If \(\tilde{\mathbf{k}}=(k_{1}, \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), \(\tilde{\mathbf{t}}=(t_{1}, \ldots , t_{m}) \in \mathbb{R}_{+}^{m}\), and the base of log is greater than 1, then
where \(\boldsymbol{\mathcal{S}}\) is defined in (39), and
If the base of log is less than 1, then inequality (40) is reversed.
Proof
The function \(x \mapsto -x\log (x)\) is 3-convex when the base of log is greater than 1. So, using \(f:= -x\log (x)\) in Theorem 6(i), we get (40). □
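The 3-convexity claim used in this proof can be checked numerically through third-order divided differences, which are nonnegative for a 3-convex function. The sketch below does this for \(x \mapsto -x\log (x)\) with the natural logarithm; the helper name `divided_difference` and the sample nodes are assumptions chosen for illustration, not part of the paper.

```python
import math

def divided_difference(f, nodes):
    """Divided difference [x_0, ..., x_n]f for pairwise distinct nodes (recursive form)."""
    if len(nodes) == 1:
        return f(nodes[0])
    upper = divided_difference(f, nodes[1:])
    lower = divided_difference(f, nodes[:-1])
    return (upper - lower) / (nodes[-1] - nodes[0])

def neg_x_log_x(x):
    """x -> -x log(x) with the natural logarithm (base e > 1)."""
    return -x * math.log(x)

# Third-order divided differences on a few hypothetical four-point grids in (0, infinity).
samples = [
    divided_difference(neg_x_log_x, [a, a + 0.3, a + 0.7, a + 1.2])
    for a in (0.1, 0.5, 1.0, 2.0, 5.0)
]
print(samples)
```

The positive values agree with the direct computation \(\frac{d^{3}}{dx^{3}}(-x\log x) = \frac{1}{x^{2}} > 0\) on \((0, \infty )\).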
Remark 5
If k and t are positive probability distributions, then (40) becomes
Definition 4
(see [9])
For \(\tilde{\mathbf{r}}, \tilde{\mathbf{q}} \in \mathbb{R}_{+}^{n}\), the Kullback–Leibler divergence is defined by
Corollary 7
Assume \(\boldsymbol{\mathcal{G}}\).
Let \(\tilde{\mathbf{r}} = (r_{1} , \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1} , \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), and \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), \(\tilde{\mathbf{t}} = (t_{1} , \ldots , t_{m}) \in \mathbb{R}_{+}^{m}\) be such that \(\sum_{s=1}^{n}r_{s}\), \(\sum_{s=1}^{n}k_{s}\), \(\sum_{s=1}^{m}w_{s}\), and \(\sum_{s=1}^{m}t_{s}\) are equal to 1. Then
where the base of log is greater than 1.
If the base of log is less than 1, then the inequality signs in (43) are reversed.
Proof
In Theorem 6(ii), replacing f by \(-x\log (x)\), we have
Now taking \({\sum_{s=1}^{n}r_{s}}\), \({\sum_{s=1}^{n}k_{s}}\), \({\sum_{s=1}^{m}w_{s}}\), and \({\sum_{s=1}^{m}t_{s}}\) equal to 1 and rearranging, we get (43). □
4.3 Rényi divergence and entropy
The Rényi divergence and Rényi entropy are given in [19].
Definition 5
Let \(\tilde{\mathbf{r}}, \tilde{\mathbf{q}} \in \mathbb{R}_{+}^{n}\) be such that \(\sum_{i=1}^{n}r_{i}=1\) and \(\sum_{i=1}^{n}q_{i}=1\), and let \(\delta \geq 0\), \(\delta \neq 1\).
(a) The Rényi divergence of order δ is defined by
$$ \boldsymbol{\mathcal{D}}_{\delta }(\tilde{\mathbf{r}}, \tilde{\mathbf{q}}) := \frac{1}{\delta - 1} \log \Biggl(\sum_{i=1}^{n}q_{i} \biggl( \frac{r_{i}}{q_{i}} \biggr)^{\delta } \Biggr). \tag{45} $$
(b) The Rényi entropy of order δ of \(\tilde{\mathbf{r}}\) is defined by
$$ \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}}) := \frac{1}{1 - \delta } \log \Biggl( \sum_{i=1}^{n} r_{i}^{\delta } \Biggr). \tag{46} $$
These definitions also hold for nonnegative probability distributions. If \(\delta \rightarrow 1\) in (45), we have (42), and if \(\delta \rightarrow 1\) in (46), then we have (39).
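The definitions (45) and (46) and their limits as \(\delta \rightarrow 1\) can be explored with the short sketch below. It assumes the natural logarithm, strictly positive distributions, and the usual forms \(\sum_{i} r_{i}\log (r_{i}/q_{i})\) and \(-\sum_{i} r_{i}\log r_{i}\) for the Kullback–Leibler divergence and the Shannon entropy; the function names and the sample vectors are illustrative only.

```python
import math

def renyi_divergence(r, q, delta):
    """Renyi divergence of order delta as in (45), natural logarithm, delta != 1."""
    total = sum(qi * (ri / qi) ** delta for ri, qi in zip(r, q))
    return math.log(total) / (delta - 1.0)

def renyi_entropy(r, delta):
    """Renyi entropy of order delta as in (46), natural logarithm, delta != 1."""
    return math.log(sum(ri ** delta for ri in r)) / (1.0 - delta)

def kl_divergence(r, q):
    """Kullback-Leibler divergence sum_i r_i log(r_i / q_i) (assumed standard form)."""
    return sum(ri * math.log(ri / qi) for ri, qi in zip(r, q))

def shannon_entropy(r):
    """Shannon entropy -sum_i r_i log(r_i) (assumed standard form)."""
    return -sum(ri * math.log(ri) for ri in r)

# Hypothetical sample distributions.
r = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

near_one = 1.0 + 1e-6  # order close to 1, standing in for the limit delta -> 1
print(renyi_divergence(r, q, near_one), kl_divergence(r, q))
print(renyi_entropy(r, near_one), shannon_entropy(r))
```

Evaluating at an order close to 1 only approximates the limit; the two printed pairs should agree to several decimal places.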
Now we obtain inequalities for the Rényi divergence.
Theorem 7
Assume \(\boldsymbol{\mathcal{G}}\).
Let \(\tilde{\mathbf{r}} = (r_{1}, \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1}, \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), and \(\tilde{\mathbf{t}} = (t_{1}, \ldots , t_{m})\in \mathbb{R}_{+}^{m}\).
(i) If the base of log is greater than 1 and \(0 \leq \delta \leq \theta \) are such that \(\delta , \theta \neq 1\), then
$$ \boldsymbol{\mathcal{D}}_{\theta }(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})-\boldsymbol{\mathcal{D}}_{\delta }(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})\leq \boldsymbol{\mathcal{D}}_{\theta }(\tilde{\mathbf{w}}, \tilde{\mathbf{t}})-\boldsymbol{\mathcal{D}}_{\delta }(\tilde{\mathbf{w}}, \tilde{\mathbf{t}}). \tag{47} $$
If the base of log is less than 1, then inequality (47) holds in reverse.
(ii) If \(\theta >1\) and the base of log is greater than 1, then
$$ \boldsymbol{\mathcal{D}}_{\theta }(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})-\boldsymbol{\mathcal{D}}_{1}(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})\leq \boldsymbol{\mathcal{D}}_{\theta }(\tilde{\mathbf{w}}, \tilde{\mathbf{t}})-\boldsymbol{\mathcal{D}}_{1}(\tilde{\mathbf{w}}, \tilde{\mathbf{t}}). \tag{48} $$
(iii) If \(\delta \in [0, 1)\) and the base of log is greater than 1, then
$$ \boldsymbol{\mathcal{D}}_{1}(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})-\boldsymbol{\mathcal{D}}_{\delta }(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})\leq \boldsymbol{\mathcal{D}}_{1}(\tilde{\mathbf{w}}, \tilde{\mathbf{t}})-\boldsymbol{\mathcal{D}}_{\delta }(\tilde{\mathbf{w}}, \tilde{\mathbf{t}}). \tag{49} $$
Proof
Taking the mapping \(f: (0, \infty ) \rightarrow \mathbb{R}\) defined by \(f(t):= t^{\frac{\theta - 1}{\delta -1}}\) and using
and
in (18) (for positive weights) and after simplifications, we have
if either \(0 \leq \delta < 1 < \theta \) or \(1 < \delta \leq \theta \), and inequality (50) holds in reverse if \(0 \leq \delta \leq \theta < 1\). Raising both sides of (50) to the power \(\frac{1}{\theta - 1}\), we get
When the base of log is greater than 1, the log function is increasing; therefore, taking log in (51), we get (47). If the base of log is less than 1, the inequality in (47) is reversed. Letting \(\delta \rightarrow 1\) and \(\theta \rightarrow 1\), respectively, we obtain (48) and (49). □
Theorem 8
Assume \(\boldsymbol{\mathcal{G}}\).
Let \(\tilde{\mathbf{r}} = (r_{1}, \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1}, \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), and \(\tilde{\mathbf{t}} = (t_{1}, \ldots , t_{m}) \in \mathbb{R}_{+}^{m}\).
If either \(1 < \delta \) and the base of log is greater than 1, or \(\delta \in [0, 1)\) and the base of log is less than 1, then
If either \(1 < \delta \) and the base of log is less than 1, or \(\delta \in [0, 1)\) and the base of log is greater than 1, then the inequality in (52) is reversed.
Proof
We prove only the case when \(\delta \in [0, 1)\) and the base of log is greater than 1; the remaining cases are proved similarly.
The function \(x \mapsto xf(x)\) \((x > 0)\) is 3-convex when the base of log is less than 1. Also \(0>\frac{1}{1 - \delta }\) and choosing \(I = (0, \infty )\),
and
in (18) (for positive weights) and after simplifications, we have (52). □
Corollary 8
Assume \(\boldsymbol{\mathcal{G}}\).
Let \(\tilde{\mathbf{r}} = (r_{1}, \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1}, \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), and \(\tilde{\mathbf{t}} = (t_{1}, \ldots , t_{m}) \in \mathbb{R}_{+}^{m}\) be such that \({\sum_{s=1}^{n}r_{s}}\), \({\sum_{s=1}^{n}k_{s}}\), \({\sum_{u=1}^{m}w_{u}}\), and \({\sum_{u=1}^{m}t_{u}}\) are equal to 1.
(i) If the base of log is greater than 1 and \(0 \leq \delta \leq \theta \) are such that \(\delta , \theta \neq 1\), then
$$ \boldsymbol{\mathcal{H}}_{\theta }(\tilde{\mathbf{r}})- \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}})\geq \boldsymbol{\mathcal{H}}_{\theta }(\tilde{\mathbf{w}})-\boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{w}}). \tag{53} $$
The reverse inequality holds in (53) if the base of log is less than 1.
(ii) If \(1 < \theta \) and the base of log is greater than 1, then
$$ \boldsymbol{\mathcal{H}}_{\theta }(\tilde{\mathbf{r}})- \boldsymbol{\mathcal{S}} \geq \boldsymbol{\mathcal{H}}_{\theta }(\tilde{\mathbf{w}})-\tilde{\boldsymbol{\mathcal{S}}}. \tag{54} $$
The reverse inequality holds in (54) if the base of log is less than 1.
(iii) If \(0 \leq \delta < 1\) and the base of log is greater than 1, then
$$ \boldsymbol{\mathcal{S}}-\boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}})\geq \tilde{\boldsymbol{\mathcal{S}}}-\boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{w}}). \tag{55} $$
If the base of log is less than 1, the inequality in (55) is reversed.
Proof
(i) Suppose \(\tilde{\mathbf{k}}= \frac{\textbf{1}}{\textbf{n}}\) and \(\tilde{\mathbf{t}}= \frac{\textbf{1}}{\textbf{m}}\). Then from (45) we have
and
We have
and
We get (53) after using Theorem 7(i), (56) and (57).
Statements (ii) and (iii) are similarly proved. □
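The step from Theorem 7 to Corollary 8 rests on the fact that, against a uniform reference distribution, the Rényi divergence (45) and the Rényi entropy (46) differ only by a constant: \(\boldsymbol{\mathcal{D}}_{\delta }(\tilde{\mathbf{r}}, \frac{\mathbf{1}}{\mathbf{n}}) = \log n - \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}})\). The sketch below checks this relation numerically; it is an illustration under the assumption of natural logarithms, and the function names and the sample distribution are hypothetical.

```python
import math

def renyi_divergence(r, q, delta):
    """Renyi divergence of order delta as in (45), natural logarithm, delta != 1."""
    total = sum(qi * (ri / qi) ** delta for ri, qi in zip(r, q))
    return math.log(total) / (delta - 1.0)

def renyi_entropy(r, delta):
    """Renyi entropy of order delta as in (46), natural logarithm, delta != 1."""
    return math.log(sum(ri ** delta for ri in r)) / (1.0 - delta)

# Hypothetical distribution and the uniform reference distribution of the same length.
r = [0.5, 0.3, 0.2]
n = len(r)
uniform = [1.0 / n] * n

# Difference between D_delta(r, uniform) and log(n) - H_delta(r) for several orders.
gaps = []
for delta in (0.0, 0.5, 2.0, 3.0):
    lhs = renyi_divergence(r, uniform, delta)
    rhs = math.log(n) - renyi_entropy(r, delta)
    gaps.append(lhs - rhs)
print(gaps)
```

Because the constant \(\log n\) cancels in differences such as \(\boldsymbol{\mathcal{D}}_{\theta } - \boldsymbol{\mathcal{D}}_{\delta }\), the sign of the entropy differences is opposite to that of the divergence differences.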
Corollary 9
Assume \(\boldsymbol{\mathcal{G}}\).
Let \(\tilde{\mathbf{r}} = (r_{1}, \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1}, \ldots , k_{n})\), \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), and \(\tilde{\mathbf{t}} = (t_{1}, \ldots , t_{m})\) be positive probability distributions.
If either \(\delta \in [0, 1)\) and the base of log is greater than 1, or \(\delta >1\) and the base of log is less than 1, then
The inequality in (58) is reversed if either \(\delta \in [0, 1)\) and the base of log is less than 1, or \(\delta > 1\) and the base of log is greater than 1.
Proof
The proof is similar to that of Corollary 8. □
4.4 Zipf–Mandelbrot law
In [14] the authors contributed to the analysis of the Zipf–Mandelbrot law, which is defined as follows:
Definition 6
The Zipf–Mandelbrot law is a discrete probability distribution depending on three parameters: \(\mathcal{N} \in \{1, 2, \ldots \}\), \(\phi \in [0, \infty )\), and \(t > 0\), and is defined by
where
For all values of \(\mathcal{N}\), if the total mass of the law is taken, then for \(0 \leq \phi \), \(1< t\), \(s \in \mathcal{N}\), the density function of the Zipf–Mandelbrot law becomes
where
For \(\phi = 0\), the Zipf–Mandelbrot law becomes Zipf’s law.
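A minimal sketch of Definition 6, assuming the usual form of the Zipf–Mandelbrot probability mass function, \(\frac{1}{(s+\phi )^{t} H}\) for \(s = 1, \ldots , \mathcal{N}\) with \(H = \sum_{\nu =1}^{\mathcal{N}} \frac{1}{(\nu +\phi )^{t}}\). The function name `zipf_mandelbrot_pmf` and the parameter values are illustrative choices, not taken from the paper.

```python
def zipf_mandelbrot_pmf(n_ranks, phi, t):
    """Probabilities 1 / ((s + phi)^t * H) for ranks s = 1, ..., n_ranks.

    H is the normalizing sum over nu = 1, ..., n_ranks of 1 / (nu + phi)^t.
    """
    if n_ranks < 1 or phi < 0 or t <= 0:
        raise ValueError("need n_ranks >= 1, phi >= 0, and t > 0")
    weights = [1.0 / (s + phi) ** t for s in range(1, n_ranks + 1)]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]

# Hypothetical parameters: a Zipf-Mandelbrot law and the Zipf special case phi = 0.
zm = zipf_mandelbrot_pmf(10, phi=2.0, t=1.5)
zipf = zipf_mandelbrot_pmf(10, phi=0.0, t=1.0)

# For phi = 0 and t = 1, rank times probability is the same constant for every rank.
rank_times_prob = [s * p for s, p in enumerate(zipf, start=1)]
print(sum(zm), rank_times_prob)
```

The constant product in the last line mirrors the relation \(C = r \times f\) between rank and frequency described in the introduction to this section.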
Conclusion 1
Assume \(\boldsymbol{\mathcal{G}}\).
Let \(\tilde{\mathbf{r}}\) and \(\tilde{\mathbf{w}}\) be Zipf–Mandelbrot laws. By Corollary 8(iii), if \(\delta \in [0, 1)\) and the base of log is greater than 1, then
The inequality is reversed if base of log is less than 1.
Change history
15 May 2020
The Publisher has retracted this article [1]. Owing to an administrative error it was published without the amendments made by the authors at the proof stage. A second version of the article with the authors' amendments that should remain in the literature is [2]. The Publisher apologises to the authors and to readers for this error. All authors agree with this retraction.
References
Aras-Gazić, G., Čuljak, V., Pečarić, J., Vukelić, A.: Generalization of Jensen’s inequality by Lidstone’s polynomial and related results. Math. Inequal. Appl. 16(4), 1243–1267 (2013)
Bullen, P.S.: An inequality of N. Levinson. Publ. Elektroteh. Fak. Univ. Beogr., Ser. Mat. Fiz. 109–112 (1973)
Cerone, P., Dragomir, S.S.: Some new Ostrowski-type bounds for the Čebyšev functional and applications. J. Math. Inequal. 8(1), 159–170 (2014)
Csiszár, I.: Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hung. 2, 299–318 (1967)
Csiszár, I.: Information measures: a critical survey. In: Trans. 7th Prague Conf. on Info. Th., Statist. Decis. Funct., Random Process and 8th European Meeting of Statist., vol. B, pp. 73–86. Academia, Prague (1978)
Diodato, V.: Dictionary of Bibliometrics. Haworth Press, New York (1994)
Egghe, L., Rousseau, R.: Introduction to Informetrics, Quantitative Methods in Library, Documentation and Information Science. Elsevier, New York (1990)
Gibbs, A.L.: On choosing and bounding probability metrics. Int. Stat. Rev. 70, 419–435 (2002)
Horváth, L., Pečarić, Ð., Pečarić, J.: Estimations of f- and Rényi divergences by using a cyclic refinement of the Jensen’s inequality. Bull. Malays. Math. Soc. 42(3), 933–946 (2019)
Khan, K.A., Niaz, T., Pečarić, Ð., Pečarić, J.: Refinement of Jensen’s inequality and estimation of f and Rényi divergence via Montgomery identity. J. Inequal. Appl. 2018, 318 (2018)
Krnić, M., Lovričević, N., Pečarić, J.: Superadditivity of the Levinson functional and applications. Period. Math. Hung. 71(2), 166–178 (2015)
Levinson, N.: Generalization of an inequality of Ky Fan. J. Math. Anal. Appl. 6, 133–134 (1969)
Liese, F., Vajda, I.: Convex Statistical Distances. Teubner-Texte zur Mathematik, vol. 95. Teubner, Leipzig (1987)
Lovričević, N., Pečarić, Ð., Pečarić, J.: Zipf–Mandelbrot law, f-divergences and the Jensen-type interpolating inequalities. J. Inequal. Appl. 2018, 36 (2018)
Mercer, A.McD.: A variant of Jensen’s inequality. J. Inequal. Pure Appl. Math. 4(4), article 73 (2003)
Mitrinović, D.S., Pečarić, J., Fink, A.M.: Classical and New Inequalities in Analysis, vol. 61. Kluwer Academic, Dordrecht (1992)
Pečarić, J.: On an inequality on N. Levinson. Publ. Elektroteh. Fak. Univ. Beogr., Ser. Mat. Fiz. 71–74 (1980)
Popoviciu, T.: Sur une inégalité de N. Levinson. Mathematica 6, 301–306 (1969)
Rényi, A.: On measure of information and entropy. In: Proceedings of the Fourth Berkeley Symposium on Mathematics, Statistics and Probability, pp. 547–561 (1960)
Sason, I., Verdú, S.: f-Divergence inequalities. IEEE Trans. Inf. Theory 62, 5973–6006 (2016)
Vajda, I.: Theory of Statistical Inference and Information. Kluwer Academic, Dordrecht (1989)
Acknowledgements
The authors wish to thank the anonymous referees for their very careful reading of the manuscript and fruitful comments and suggestions. The research of the 4th author is supported by the Ministry of Education and Science of the Russian Federation (the Agreement number No. 02.a03.21.0008).
Funding
There is no funding for this work.
Contributions
All authors jointly worked on the results and they read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The Publisher has retracted this article. Owing to an administrative error it was published without the amendments made by the authors at the proof stage. A second version of the article with the authors' amendments that should remain in the literature has since been published: https://doi.org/10.1186/s13660-019-2186-4. The Publisher apologises to the authors and to readers for this error. All authors agree with this retraction.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Adeel, M., Khan, K.A., Pečarić, Ð. et al. RETRACTED ARTICLE: Generalization of the Levinson inequality with applications to information theory. J Inequal Appl 2019, 212 (2019). https://doi.org/10.1186/s13660-019-2166-8
DOI: https://doi.org/10.1186/s13660-019-2166-8