The concept of Shannon entropy is the central notion of information theory, sometimes referred to as a measure of uncertainty. The entropy of a random variable is defined in terms of its probability distribution and can be shown to be a good measure of randomness or uncertainty. Shannon entropy allows us to estimate the average minimum number of bits needed to encode a string of symbols, based on the alphabet size and the frequency of the symbols.
Divergences between probability distributions have been introduced to measure the difference between them. Many types of divergences exist, for instance the f-divergence (in particular, the Kullback–Leibler divergence, Hellinger distance, and total variation distance), the Rényi divergence, the Jensen–Shannon divergence, and so on (see [13, 21]). A large number of papers deal with inequalities and entropies; see, e.g., [8, 10, 20] and the references therein. Jensen’s inequality plays a crucial role in some of these inequalities. However, Jensen’s inequality deals with one type of data points, whereas Levinson’s inequality deals with two types of data points.
Zipf’s law is one of the fundamental laws of information science and is frequently used in linguistics. In 1932 George Zipf observed that one can count how often each word appears in a text: if the words are ranked \((r)\) according to their frequency of occurrence \((f)\), then the product of these two numbers is constant: \(C = r \times f\). Apart from its use in information science and linguistics, Zipf’s law appears in city populations, solar flare intensity, website traffic, earthquake magnitudes, the sizes of moon craters, and so forth. In economics this distribution is known as the Pareto law, which describes the distribution of wealth among the richest members of a community [6, p. 125]. The two laws are mathematically equivalent, but they are used in different contexts [7, p. 294].
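For illustration, a minimal Python sketch of the rank–frequency relation (the toy text is an illustrative assumption and is far too small to exhibit the law convincingly; a real corpus would be needed):

```python
from collections import Counter

# Count word frequencies in a text, rank them, and inspect the product
# rank * frequency, which Zipf's law predicts to be roughly constant.
text = "the cat sat on the mat and the dog sat on the log"
counts = Counter(text.lower().split())
ranked = sorted(counts.values(), reverse=True)

for rank, freq in enumerate(ranked, start=1):
    print(f"rank {rank}: frequency {freq}, C = r * f = {rank * freq}")
```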
Csiszár divergence
In [4, 5] Csiszár gave the following definition:
Definition 1
Let f be a convex function from \(\mathbb{R}^{+}\) to \(\mathbb{R}^{+}\). Let \(\tilde{\mathbf{r}}, \tilde{\mathbf{q}} \in \mathbb{R}_{+}^{n}\) be such that \(\sum_{s=1}^{n}r_{s}=1\) and \(\sum_{s=1}^{n}q_{s}=1\). Then the f-divergence functional is defined by
$$\begin{aligned} I_{f}(\tilde{\mathbf{r}}, \tilde{\mathbf{q}}) := \sum _{s=1}^{n}q_{s}f \biggl( \frac{r_{s}}{q_{s}} \biggr). \end{aligned}$$
By defining the following:
$$\begin{aligned} f(0) := \lim_{x \rightarrow 0^{+}}f(x); \qquad 0f \biggl( \frac{0}{0} \biggr):=0; \qquad 0f \biggl(\frac{a}{0} \biggr):= \lim_{x \rightarrow 0^{+}}xf \biggl(\frac{a}{x} \biggr), \quad a>0, \end{aligned}$$
he stated that nonnegative probability distributions can also be used.
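For illustration, a minimal Python sketch of the f-divergence functional with these limiting conventions (the helper name `csiszar_divergence`, the small value used to approximate the limit, and the sample data are illustrative assumptions):

```python
import math

def csiszar_divergence(r, q, f):
    """I_f(r, q) = sum_s q_s * f(r_s / q_s), with the conventions
    0 * f(0/0) := 0 and 0 * f(a/0) := lim_{x -> 0+} x * f(a/x)."""
    total = 0.0
    for rs, qs in zip(r, q):
        if qs > 0:
            total += qs * f(rs / qs)
        elif rs == 0:
            total += 0.0            # 0 * f(0/0) := 0
        else:
            x = 1e-12               # crude numerical approximation of the limit
            total += x * f(rs / x)
    return total

# f(x) = x * log(x) recovers the Kullback-Leibler divergence D(r, q).
f_kl = lambda x: x * math.log(x) if x > 0 else 0.0
print(csiszar_divergence([0.5, 0.3, 0.2], [0.4, 0.4, 0.2], f_kl))
```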
Using the definition of the f-divergence functional, Horváth et al. [9] gave the following functional:
Definition 2
Let I be an interval contained in \(\mathbb{R}\) and \(f: I \rightarrow \mathbb{R}\) be a function. Also let \(\tilde{\mathbf{r}}=(r_{1}, \ldots , r_{n})\in \mathbb{R}^{n}\) and \(\tilde{\mathbf{k}}=(k_{1}, \ldots , k_{n})\in (0, \infty )^{n}\) be such that
$$\begin{aligned} \frac{r_{s}}{k_{s}} \in I, \quad s= 1, \ldots , n. \end{aligned}$$
Then
$$\begin{aligned} \hat{I}_{f}(\tilde{\mathbf{r}}, \tilde{\mathbf{k}}) : = \sum _{s=1} ^{n}k_{s}f \biggl( \frac{r_{s}}{k_{s}} \biggr). \end{aligned}$$
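A minimal sketch of this functional, in which the weights \(k_{s}\) need only be positive rather than normalized (the helper name `I_hat` and the sample data are illustrative assumptions):

```python
def I_hat(r, k, f):
    """hat{I}_f(r, k) = sum_s k_s * f(r_s / k_s); here k only needs to be
    positive, not normalized, in contrast to Definition 1."""
    return sum(ks * f(rs / ks) for rs, ks in zip(r, k))

f = lambda x: x * x              # any function defined on the ratios r_s / k_s
print(I_hat([0.5, 0.3, 0.2], [2.0, 1.0, 1.0], f))
```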
We apply a generalized form of Bullen’s inequality (18) (for positive weights) to \(\hat{I}_{f}(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})\).
Let us denote the following set of assumptions by \(\mathcal{G}\):
Let \(f: I= [\alpha , \beta ] \rightarrow \mathbb{R}\) be a 3-convex function. Also let \((p_{1}, \ldots , p_{n}) \in \mathbb{R}_{+}^{n}\) and \((q_{1}, \ldots , q_{m})\in \mathbb{R}_{+}^{m}\) be such that \(\sum_{s=1}^{n}p_{s}=1\), \(\sum_{s=1}^{m}q_{s}=1\), and \(x_{s}\), \(y_{s}\), \(\sum_{s=1}^{n}p_{s}x_{s}\), \(\sum_{s=1}^{m}q_{s}y_{s} \in I\).
Theorem 6
Assume \(\boldsymbol{\mathcal{G}}\). Let \(\tilde{\mathbf{r}}= (r_{1}, \ldots , r_{n} )\), \(\tilde{\mathbf{k}}= (k_{1}, \ldots , k_{n} )\) be in \((0, \infty )^{n}\), and \(\tilde{\mathbf{w}}= (w_{1}, \ldots , w_{m} )\), \(\tilde{\mathbf{t}}= (t_{1}, \ldots , t_{m} )\) be in \((0, \infty )^{m}\) such that
$$\begin{aligned} \frac{r_{s}}{k_{s}} \in I, \quad s = 1, \ldots , n, \end{aligned}$$
and
$$\begin{aligned} \frac{w_{u}}{t_{u}} \in I, \quad u = 1, \ldots , m. \end{aligned}$$
Then
$$ \mathrm{(i)} \quad \frac{1}{\sum_{s=1}^{n}k_{s}}\hat{I}_{f}( \tilde{\mathbf{r}}, \tilde{\mathbf{k}})-f \Biggl(\sum _{s=1}^{n} \frac{r_{s}}{\sum_{s=1}^{n}k_{s}} \Biggr) \leq \frac{1}{\sum_{u=1}^{m}t_{u}}\hat{I}_{f}( \tilde{\mathbf{w}}, \tilde{ \mathbf{t}})- f \Biggl(\sum_{u=1}^{m} \frac{w_{u}}{\sum_{u=1}^{m}t_{u}} \Biggr). $$
(35)
(ii) If \(x \mapsto xf(x)\) (\(x \in I\)) is 3-convex, then
$$ \frac{1}{\sum_{s=1}^{n}k_{s}}\hat{I}_{idf} (\tilde{ \mathbf{r}}, \tilde{\mathbf{k}} )-f \Biggl(\sum_{s=1}^{n} \frac{r_{s}}{\sum_{s=1} ^{n}k_{s}} \Biggr)\leq \frac{1}{\sum_{u=1}^{m}t_{u}}\hat{I}_{idf}( \tilde{\mathbf{w}}, \tilde{\mathbf{t}})- f \Biggl(\sum _{u=1}^{m}\frac{w_{u}}{ \sum_{u=1}^{m}t_{u}} \Biggr), $$
(36)
where
$$ \hat{I}_{idf} (\tilde{\mathbf{r}}, \tilde{\mathbf{k}} )= \sum _{s=1}^{n}r_{s}f \biggl( \frac{r_{s}}{k_{s}} \biggr) $$
and
$$ \hat{I}_{idf} (\tilde{\mathbf{w}}, \tilde{\mathbf{t}} )= \sum _{u=1}^{m}w_{u}f \biggl( \frac{w_{u}}{t_{u}} \biggr). $$
Proof
(i) Taking \(p_{s} = \frac{k_{s}}{\sum_{s=1}^{n}k_{s}}\), \(x_{s} = \frac{r_{s}}{k_{s}}\), \(q_{u} = \frac{t_{u}}{\sum_{u=1}^{m}t_{u}}\), and \(y_{u} = \frac{w_{u}}{t_{u}}\) in inequality (18) (for positive weights), we have
$$ \sum_{s=1}^{n} \frac{k_{s}}{\sum_{s=1}^{n}k_{s}}f \biggl(\frac{r_{s}}{k_{s}} \biggr)-f \Biggl( \sum _{s=1}^{n}\frac{r_{s}}{\sum_{s=1}^{n}k_{s}} \Biggr) \leq \sum_{u=1}^{m}\frac{t _{u}}{\sum_{u=1}^{m}t_{u}}f \biggl( \frac{w_{u}}{t_{u}} \biggr)-f \Biggl(\sum_{u=1}^{m} \frac{w _{u}}{\sum_{u=1}^{m}t_{u}} \Biggr). $$
(37)
Multiplying (37) by the sum \(\sum_{s=1}^{n}k_{s}\), we get
$$\begin{aligned} \hat{I}_{f}(\tilde{\mathbf{r}}, \tilde{ \mathbf{k}})-f \Biggl(\sum_{s=1}^{n} \frac{r _{s}}{\sum_{s=1}^{n}k_{s}} \Biggr)\sum_{s=1}^{n}k_{s} \leq & \sum_{u=1}^{m} \frac{t _{u}}{\sum_{u=1}^{m}t_{u}}f \biggl(\frac{w_{u}}{t_{u}} \biggr)\sum _{s=1}^{n}k_{s} \\ &{}-f \Biggl(\sum_{u=1}^{m} \frac{w_{u}}{\sum_{u=1}^{m}t_{u}} \Biggr)\sum_{s=1}^{n}k _{s}. \end{aligned}$$
(38)
Now again multiplying (38) by the sum \(\sum_{u=1}^{m}t_{u}\), we get
$$\begin{aligned}& \sum_{u=1}^{m}t_{u} \hat{I}_{f}(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})-f \Biggl(\sum _{s=1}^{n}\frac{r_{s}}{\sum_{s=1}^{n}k_{s}} \Biggr) \sum_{s=1}^{n}k_{s}\sum _{u=1}^{m}t_{u} \\& \quad \leq \sum_{s=1}^{n}k_{s} \hat{I}_{f}(\tilde{\mathbf{w}}, \tilde{\mathbf{t}}) -f \Biggl(\sum _{u=1}^{m} \frac{w_{u}}{\sum_{u=1}^{m}t_{u}} \Biggr) \sum_{s=1}^{n}k_{s}\sum _{u=1}^{m}t _{u}. \end{aligned}$$
Dividing the above inequality by the product \(\sum_{s=1}^{n}k _{s} \sum_{u=1}^{m}t_{u}\), we get (35).
(ii) Using \(f:=\operatorname{id} f\) (where “id” is the identity function) in (18) (for positive weights), we have
$$\begin{aligned} \sum_{s=1}^{n}p_{s}x_{s}f(x_{s})- \sum_{s=1}^{n}p_{s}x_{s}f \Biggl(\sum_{s=1} ^{n}p_{s}x_{s} \Biggr) \leq \sum_{u=1}^{m}q_{u}y_{u}f(y_{u})- \sum_{u=1}^{m}q _{u}y_{u}f \Biggl(\sum_{u=1}^{m}q_{u}y_{u} \Biggr). \end{aligned}$$
Using the same steps as in the proof of (i), we get (36). □
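As a numerical illustration of the quantities compared in (35), the following sketch evaluates each side for sample data and a 3-convex function (the helper name `bullen_side` and the data are illustrative assumptions; whether the inequality holds for given data depends on the hypotheses of inequality (18), which are not restated here):

```python
def bullen_side(r, k, f):
    """The quantity (1 / sum(k)) * hat{I}_f(r, k) - f(sum(r) / sum(k))
    appearing on each side of inequality (35)."""
    K = sum(k)
    i_hat = sum(ks * f(rs / ks) for rs, ks in zip(r, k))
    return i_hat / K - f(sum(r) / K)

f = lambda x: x ** 3             # x^3 is 3-convex on (0, infinity)
lhs = bullen_side([0.2, 0.3, 0.5], [0.3, 0.3, 0.4], f)
rhs = bullen_side([0.1, 0.4, 0.5], [0.2, 0.5, 0.3], f)
# Theorem 6(i) asserts lhs <= rhs whenever the hypotheses of (18) hold.
print(lhs, rhs)
```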
Shannon entropy
Definition 3
(see [9])
The Shannon entropy of a positive probability distribution \(\tilde{\mathbf{r}}=(r_{1}, \ldots , r_{n})\) is defined by
$$\begin{aligned} \boldsymbol{\mathcal{S}} : = - \sum _{s=1}^{n}r_{s}\log (r_{s}). \end{aligned}$$
(39)
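A minimal sketch computing (39) (the helper name `shannon_entropy` and the sample distribution are illustrative assumptions):

```python
import math

def shannon_entropy(r, base=2):
    """S = -sum_s r_s * log(r_s) for a positive probability distribution r."""
    return -sum(rs * math.log(rs, base) for rs in r)

print(shannon_entropy([0.5, 0.25, 0.25]))   # 1.5 bits
```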
Corollary 6
Assume \(\boldsymbol{\mathcal{G}}\). If \(\tilde{\mathbf{k}}=(k_{1}, \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), \(\tilde{\mathbf{t}}=(t_{1}, \ldots , t_{m}) \in \mathbb{R}_{+}^{m}\), and base of log is greater than 1, then
$$\begin{aligned}& \frac{1}{\sum_{s=1}^{n}k_{s}} \Biggl[\boldsymbol{\mathcal{S}}+\sum _{s=1} ^{n}r_{s}\log (k_{s}) \Biggr]+ \Biggl[\sum_{s=1}^{n} \frac{r_{s}}{\sum_{s=1}^{n} k_{s}}\log \Biggl(\sum_{s=1}^{n} \frac{r_{s}}{\sum_{s=1} ^{n}k_{s}} \Biggr) \Biggr] \\& \quad \leq \frac{1}{\sum_{u=1}^{m}t_{u}} \Biggl[\tilde{\boldsymbol{\mathcal{S}}}+ \sum _{u=1}^{m}w_{u}\log (t_{u}) \Biggr]+ \Biggl[\sum_{u=1}^{m} \frac{w _{u}}{\sum_{u=1}^{m} t_{u}}\log \Biggl(\sum_{u=1}^{m} \frac{w_{u}}{ \sum_{u=1}^{m}t_{u}} \Biggr) \Biggr], \end{aligned}$$
(40)
where
\(\boldsymbol{\mathcal{S}}\) is defined in (39), and
$$\begin{aligned} \tilde{\boldsymbol{\mathcal{S}}} : = - \sum_{u=1}^{m}w_{u} \log (w_{u}). \end{aligned}$$
If base of log is less than 1, then inequality (40) is reversed.
Proof
The function \(x \mapsto -x\log (x)\) is 3-convex when base of log is greater than 1. So, using \(f(x):= -x\log (x)\) in Theorem 6(i), we get (40). □
Remark 5
If \(\tilde{\mathbf{k}}\) and \(\tilde{\mathbf{t}}\) are positive probability distributions, then (40) becomes
$$\begin{aligned} \Biggl[\boldsymbol{\mathcal{S}}+\sum _{s=1}^{n}r_{s}\log (k_{s}) \Biggr] + \Biggl[\sum_{s=1}^{n}r_{s} \log \Biggl(\sum_{s=1}^{n}r_{s} \Biggr) \Biggr] \leq& \Biggl[\tilde{\boldsymbol{\mathcal{S}}}+\sum _{s=1}^{m}w_{s}\log (t _{s}) \Biggr] \\ &{}+ \Biggl[\sum_{s=1}^{m}w_{s} \log \Biggl(\sum_{s=1}^{m}w_{s} \Biggr) \Biggr]. \end{aligned}$$
(41)
Definition 4
(see [9])
For \(\tilde{\mathbf{r}}, \tilde{\mathbf{q}} \in \mathbb{R}_{+}^{n}\), the Kullback–Leibler divergence is defined by
$$\begin{aligned} \boldsymbol{\mathcal{D}}(\tilde{\mathbf{r}}, \tilde{ \mathbf{q}}) : = \sum_{s=1}^{n}r_{s} \log \biggl( \frac{r_{s}}{q_{s}} \biggr). \end{aligned}$$
(42)
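A minimal sketch computing (42) (the helper name `kl_divergence` and the sample distributions are illustrative assumptions):

```python
import math

def kl_divergence(r, q, base=2):
    """D(r, q) = sum_s r_s * log(r_s / q_s)."""
    return sum(rs * math.log(rs / qs, base) for rs, qs in zip(r, q))

print(kl_divergence([0.5, 0.25, 0.25], [1/3, 1/3, 1/3]))   # nonnegative; 0 iff r = q
```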
Corollary 7
Assume \(\boldsymbol{\mathcal{G}}\). Let \(\tilde{\mathbf{r}} = (r_{1} , \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1} , \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), and \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), \(\tilde{\mathbf{t}} = (t_{1} , \ldots , t_{m}) \in \mathbb{R}_{+}^{m}\) be such that \(\sum_{s=1}^{n}r_{s}\), \(\sum_{s=1}^{n}k_{s}\), \(\sum_{s=1}^{m}w_{s}\), and \(\sum_{s=1}^{m}t_{s}\) are each equal to 1. Then
$$\begin{aligned} \sum_{s=1}^{n} \biggl( \frac{r_{s}}{k_{s}} \biggr) \boldsymbol{\mathcal{D}}(\tilde{\mathbf{r}}, \tilde{ \mathbf{k}})- \sum_{s=1}^{m} \biggl( \frac{w_{s}}{t_{s}} \biggr)\boldsymbol{\mathcal{D}}( \tilde{\mathbf{w}}, \tilde{ \mathbf{t}})\geq 0, \end{aligned}$$
(43)
where base of log is greater than 1.
If base of log is less than 1, then the signs of inequality in (43) are reversed.
Proof
In Theorem 6(ii), replacing f by \(-x\log (x)\), we have
$$\begin{aligned}& \frac{\sum_{s=1}^{n} (\frac{r_{s}}{k_{s}} )}{\sum_{s=1} ^{n}k_{s}}\boldsymbol{\mathcal{D}}(\tilde{ \mathbf{r}}, \tilde{\mathbf{k}}) - \sum_{s=1}^{n} \frac{r_{s}}{ \sum_{s=1}^{n}k _{s}} \log \Biggl(\sum_{s=1}^{n} \frac{r_{s}}{\sum_{s=1}^{n}k_{s}} \Biggr) \\& \quad \geq \frac{\sum_{s=1}^{m} (\frac{w_{s}}{t_{s}} )}{\sum_{s=1} ^{m}t_{s}}\boldsymbol{ \mathcal{D}}( \tilde{\mathbf{w}}, \tilde{\mathbf{t}}) - \sum_{s=1}^{m} \frac{w_{s}}{ \sum_{s=1}^{m}t_{s}}\log \Biggl(\sum_{s=1}^{m} \frac{w_{s}}{\sum_{s=1}^{m}t_{s}} \Biggr). \end{aligned}$$
(44)
Now taking \({\sum_{s=1}^{n}r_{s}}\), \({\sum_{s=1}^{n}k_{s}}\), \({\sum_{s=1}^{m}w_{s}}\), and \({\sum_{s=1}^{m}t_{s}}\) equal to 1 and rearranging, we get (43). □
Rényi divergence and entropy
The Rényi divergence and Rényi entropy are given in [19].
Definition 5
Let \(\tilde{\mathbf{r}}, \tilde{\mathbf{q}} \in \mathbb{R}_{+}^{n}\) be such that \(\sum_{i=1}^{n}r_{i}=1\) and \(\sum_{i=1}^{n}q_{i}=1\), and let \(\delta \geq 0\), \(\delta \neq 1\).
(a) The Rényi divergence of order δ is defined by
$$\begin{aligned} \boldsymbol{\mathcal{D}}_{\delta }(\tilde{\mathbf{r}}, \tilde{\mathbf{q}}) : = \frac{1}{\delta - 1} \log \Biggl(\sum _{i=1} ^{n}q_{i} \biggl( \frac{r_{i}}{q_{i}} \biggr)^{\delta } \Biggr). \end{aligned}$$
(45)
(b) The Rényi entropy of order δ of \(\tilde{\mathbf{r}}\) is defined by
$$\begin{aligned} \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}}): = \frac{1}{1 - \delta } \log \Biggl( \sum_{i=1}^{n} r_{i}^{\delta } \Biggr). \end{aligned}$$
(46)
These definitions also extend to nonnegative probability distributions. Letting \(\delta \rightarrow 1\) in (45) gives (42), and letting \(\delta \rightarrow 1\) in (46) gives (39).
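A minimal sketch of (45) and (46), which also illustrates the limiting behaviour as \(\delta \rightarrow 1\) (the helper names and the sample distributions are illustrative assumptions):

```python
import math

def renyi_divergence(r, q, delta, base=2):
    """D_delta(r, q) = (1 / (delta - 1)) * log(sum_i q_i * (r_i / q_i)**delta)."""
    s = sum(qi * (ri / qi) ** delta for ri, qi in zip(r, q))
    return math.log(s, base) / (delta - 1)

def renyi_entropy(r, delta, base=2):
    """H_delta(r) = (1 / (1 - delta)) * log(sum_i r_i**delta)."""
    return math.log(sum(ri ** delta for ri in r), base) / (1 - delta)

r, q = [0.5, 0.25, 0.25], [0.4, 0.4, 0.2]
# For delta close to 1 these approach the Kullback-Leibler divergence (42)
# and the Shannon entropy (39), respectively.
print(renyi_divergence(r, q, 0.999), renyi_entropy(r, 0.999))
```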
Now we obtain inequalities for the Rényi divergence.
Theorem 7
Assume \(\boldsymbol{\mathcal{G}}\). Let \(\tilde{\mathbf{r}} = (r_{1}, \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1}, \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), and \(\tilde{\mathbf{t}} = (t_{1}, \ldots , t_{m})\in \mathbb{R}_{+}^{m}\).
(i) If base of log is greater than 1 and \(0 \leq \delta \leq \theta \) are such that \(\delta , \theta \neq 1\), then
$$ \boldsymbol{\mathcal{D}}_{\theta }(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})-\boldsymbol{\mathcal{D}}_{\delta }( \tilde{\mathbf{r}}, \tilde{\mathbf{k}})\leq \boldsymbol{\mathcal{D}} _{\theta }(\tilde{ \mathbf{w}}, \tilde{\mathbf{t}})-\boldsymbol{\mathcal{D}}_{\delta }(\tilde{ \mathbf{w}}, \tilde{\mathbf{t}}). $$
(47)
If base of log is less than 1, then inequality (47) holds in reverse.
(ii) If \(\theta >1\) and base of log is greater than 1, then
$$ \boldsymbol{\mathcal{D}}_{\theta }(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})-\boldsymbol{\mathcal{D}}_{1}(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})\leq \boldsymbol{\mathcal{D}}_{\theta }( \tilde{ \mathbf{w}}, \tilde{\mathbf{t}})-\boldsymbol{\mathcal{D}}_{1}( \tilde{\mathbf{w}}, \tilde{\mathbf{t}}). $$
(48)
(iii) If \(\delta \in [0, 1)\) and base of log is greater than 1, then
$$ \boldsymbol{\mathcal{D}}_{1}(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})-\boldsymbol{\mathcal{D}}_{\delta }(\tilde{\mathbf{r}}, \tilde{\mathbf{k}})\leq \boldsymbol{\mathcal{D}}_{1}(\tilde{ \mathbf{w}}, \tilde{\mathbf{t}})-\boldsymbol{\mathcal{D}}_{\delta }(\tilde{ \mathbf{w}}, \tilde{\mathbf{t}}). $$
(49)
Proof
With the mapping \(f: (0, \infty ) \rightarrow \mathbb{R}\) defined by \(f(t):= t^{\frac{\theta - 1}{\delta -1}}\), and using
$$ p_{s} : = r_{s}, \qquad x_{s} : = \biggl( \frac{r_{s}}{k_{s}} \biggr)^{\delta - 1}, \quad s = 1, \ldots , n, $$
and
$$ q_{u} : = w_{u}, \qquad y_{u} : = \biggl( \frac{w_{u}}{t_{u}} \biggr)^{\delta - 1}, \quad u = 1, \ldots , m, $$
in (18) (for positive weights) and after simplifications, we have
$$ \sum_{s=1}^{n}k_{s} \biggl(\frac{r_{s}}{k_{s}} \biggr)^{\theta }- \Biggl(\sum _{s=1}^{n}k _{s} \biggl( \frac{r_{s}}{k_{s}} \biggr)^{\delta } \Biggr)^{\frac{\theta -1}{\delta -1}} \leq \sum _{u=1}^{m}t_{u} \biggl( \frac{w_{u}}{t_{u}} \biggr)^{\theta }- \Biggl(\sum _{u=1} ^{m}t_{u} \biggl( \frac{w_{u}}{t_{u}} \biggr)^{\delta } \Biggr)^{ \frac{\theta -1}{\delta -1}} $$
(50)
if either \(0 \leq \delta < 1 < \theta \) or \(1 < \delta \leq \theta \), and inequality (50) holds in reverse if \(0 \leq \delta \leq \theta < 1\). Raising both sides of (50) to the power \(\frac{1}{\theta - 1}\), we obtain
$$ \begin{aligned}[b] &\Biggl(\sum_{s=1}^{n}k_{s} \biggl(\frac{r_{s}}{k_{s}} \biggr)^{\theta } \Biggr)^{\frac{1}{\theta -1}}- \Biggl(\sum_{s=1}^{n}k_{s} \biggl(\frac{r_{s}}{k_{s}} \biggr)^{\delta } \Biggr)^{\frac{1}{ \delta -1}} \\ &\quad \leq \Biggl(\sum_{u=1}^{m}t_{u} \biggl(\frac{w_{u}}{t_{u}} \biggr)^{\theta } \Biggr)^{\frac{1}{ \theta -1}}- \Biggl(\sum_{u=1}^{m}t_{u} \biggl(\frac{w_{u}}{t_{u}} \biggr)^{\delta } \Biggr)^{\frac{1}{ \delta -1}}. \end{aligned} $$
(51)
Since the log function is increasing when its base is greater than 1, taking log in (51) gives (47). If base of log is less than 1, the inequality in (47) is reversed. Letting \(\delta \rightarrow 1\) or \(\theta \rightarrow 1\) and taking the limit, we obtain (48) and (49), respectively. □
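As a numerical check of the simplification used above, one can verify that raising the inner sum of (50) to the power \(\frac{1}{\theta - 1}\) and then taking log recovers the Rényi divergence (a sketch with illustrative sample data):

```python
import math

r, k = [0.5, 0.3, 0.2], [0.4, 0.4, 0.2]
theta, base = 2.5, 2

# The inner sum appearing in (50) and (51).
inner = sum(ks * (rs / ks) ** theta for rs, ks in zip(r, k))

# Taking log of the (1/(theta-1))-th power recovers D_theta(r, k),
# as used to pass from (51) to (47).
via_power = math.log(inner ** (1 / (theta - 1)), base)
direct = math.log(inner, base) / (theta - 1)
print(via_power, direct)     # the two values agree
```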
Theorem 8
Assume \(\boldsymbol{\mathcal{G}}\). Let \(\tilde{\mathbf{r}} = (r_{1}, \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1}, \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), and \(\tilde{\mathbf{t}} = (t_{1}, \ldots , t_{m}) \in \mathbb{R}_{+}^{m}\). If either \(\delta > 1\) and base of log is greater than 1, or \(\delta \in [0, 1)\) and base of log is less than 1, then
$$\begin{aligned}& \frac{1}{\sum_{s=1}^{n}k_{s} (\frac{r_{s}}{k_{s}} )^{ \delta }} \sum_{s=1}^{n}k_{s} \biggl(\frac{r_{s}}{k_{s}} \biggr)^{ \delta } \log \biggl( \frac{r_{s}}{k_{s}} \biggr)-\boldsymbol{\mathcal{D}}_{\delta }(\tilde{ \mathbf{r}}, \tilde{\mathbf{k}}) \\& \quad \leq \frac{1}{\sum_{s=1}^{n}k_{s} (\frac{r_{s}}{k_{s}} ) ^{\delta }} \sum_{s=1}^{m}t_{s} \biggl(\frac{w_{s}}{t_{s}} \biggr) ^{\delta } \log \biggl( \frac{w_{s}}{t_{s}} \biggr)-\frac{\sum_{s=1} ^{m}t_{s} (\frac{w_{s}}{t_{s}} )^{\delta }}{\sum_{s=1} ^{n}k_{s} (\frac{r_{s}}{k_{s}} )^{\delta }} \boldsymbol{ \mathcal{D}}_{\delta }(\tilde{\mathbf{w}}, \tilde{\mathbf{t}}). \end{aligned}$$
(52)
If either \(\delta \in [0, 1)\) and base of log is greater than 1, or \(\delta > 1\) and base of log is less than 1, the inequality in (52) is reversed.
Proof
We give the proof only for the case where \(\delta \in [0, 1)\) and base of log is greater than 1; the remaining cases are proved similarly. The function \(x \mapsto xf(x)\) \((x > 0)\) is 3-convex when base of log is less than 1. Also \(\frac{1}{1 - \delta } > 0\), and choosing \(I = (0, \infty )\),
$$ p_{s} : = r_{s}, \qquad x_{s} : = \biggl( \frac{r_{s}}{k_{s}} \biggr)^{\delta - 1}, \quad s = 1, \ldots , n, $$
and
$$ q_{u} : = w_{u}, \qquad y_{u} : = \biggl( \frac{w_{u}}{t_{u}} \biggr)^{\delta - 1}, \quad u = 1, \ldots , m, $$
in (18) (for positive weights) and after simplifications, we have (52). □
Corollary 8
Assume \(\boldsymbol{\mathcal{G}}\). Let \(\tilde{\mathbf{r}} = (r_{1}, \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1}, \ldots , k_{n}) \in \mathbb{R}_{+}^{n}\), \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), and \(\tilde{\mathbf{t}} = (t_{1}, \ldots , t_{m}) \in \mathbb{R}_{+}^{m}\) be such that \({\sum_{s=1}^{n}r_{s}}\), \({\sum_{s=1}^{n}k_{s}}\), \({\sum_{u=1}^{m}w_{u}}\), and \({\sum_{u=1}^{m}t_{u}}\) are each equal to 1.
(i) If base of log is greater than 1 and \(0 \leq \delta \leq \theta \) are such that \(\delta , \theta \neq 1\), then
$$ \boldsymbol{\mathcal{H}}_{\theta }(\tilde{\mathbf{r}})- \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}})\geq \boldsymbol{ \mathcal{H}}_{\theta }(\tilde{\mathbf{w}})-\boldsymbol{ \mathcal{H}}_{\delta }(\tilde{\mathbf{w}}). $$
(53)
The reverse inequality holds in (53) if base of log is less than 1.
(ii) If \(1 < \theta \) and base of log is greater than 1, then
$$ \boldsymbol{\mathcal{H}}_{\theta }(\tilde{\mathbf{r}})- \boldsymbol{\mathcal{S}} \geq \boldsymbol{\mathcal{H}}_{\theta }( \tilde{ \mathbf{w}})-\tilde{\boldsymbol{\mathcal{S}}}. $$
(54)
The reverse inequality holds in (54) if base of log is less than 1.
(iii) If \(0 \leq \delta < 1\) and base of log is greater than 1, then
$$ \boldsymbol{\mathcal{S}}-\boldsymbol{\mathcal{H}}_{\delta }( \tilde{\mathbf{r}})\geq \tilde{\boldsymbol{\mathcal{S}}}-\boldsymbol{ \mathcal{H}}_{\delta }(\tilde{\mathbf{w}}). $$
(55)
If base of log is less than 1, the inequality in (55) is reversed.
Proof
(i) Suppose \(\tilde{\mathbf{k}} = \frac{\textbf{1}}{\textbf{n}}\) and \(\tilde{\mathbf{t}} = \frac{\textbf{1}}{\textbf{m}}\) (the uniform distributions). Then from (45) we have
$$\begin{aligned} \boldsymbol{\mathcal{D}}_{\delta } (\tilde{\mathbf{r}}, \tilde{ \mathbf{k}}) = \frac{1}{\delta - 1} \log \Biggl(\sum _{s=1}^{n}n ^{\delta - 1}r_{s}^{\delta } \Biggr) = \log (n) + \frac{1}{\delta - 1}\log \Biggl(\sum _{s=1}^{n}r_{s}^{\delta } \Biggr) \end{aligned}$$
and
$$\begin{aligned} \boldsymbol{\mathcal{D}}_{\delta } (\tilde{\mathbf{w}}, \tilde{ \mathbf{t}}) = \frac{1}{\delta - 1} \log \Biggl(\sum _{u=1}^{m}m ^{\delta - 1}w_{u}^{\delta } \Biggr) = \log (m) + \frac{1}{\delta - 1}\log \Biggl(\sum _{u=1}^{m}w_{u}^{\delta } \Biggr). \end{aligned}$$
We have
$$\begin{aligned} \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}}) = \log (n) - \boldsymbol{\mathcal{D}}_{\delta } \biggl(\tilde{\mathbf{r}}, \frac{\textbf{1}}{ \textbf{n}} \biggr) \end{aligned}$$
(56)
and
$$\begin{aligned} \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{w}}) = \log (m) - \boldsymbol{\mathcal{D}}_{\delta } \biggl(\tilde{\mathbf{w}}, \frac{\textbf{1}}{ \textbf{m}} \biggr). \end{aligned}$$
(57)
We get (53) after using Theorem 7(i), (56) and (57).
Statements (ii) and (iii) are similarly proved. □
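Identity (56) can also be checked numerically; a minimal sketch with an illustrative distribution:

```python
import math

def renyi_entropy(r, delta, base=2):
    return math.log(sum(ri ** delta for ri in r), base) / (1 - delta)

def renyi_divergence(r, q, delta, base=2):
    return math.log(sum(qi * (ri / qi) ** delta for ri, qi in zip(r, q)), base) / (delta - 1)

r = [0.5, 0.25, 0.125, 0.125]
n, delta = len(r), 0.5
uniform = [1 / n] * n
# Identity (56): H_delta(r) = log(n) - D_delta(r, uniform)
print(renyi_entropy(r, delta), math.log(n, 2) - renyi_divergence(r, uniform, delta))
```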
Corollary 9
Assume \(\boldsymbol{\mathcal{G}}\). Let \(\tilde{\mathbf{r}} = (r_{1}, \ldots , r_{n})\), \(\tilde{\mathbf{k}} = (k_{1}, \ldots , k_{n})\), \(\tilde{\mathbf{w}} = (w_{1}, \ldots , w_{m})\), and \(\tilde{\mathbf{t}} = (t_{1}, \ldots , t_{m})\) be positive probability distributions. If either \(\delta \in [0, 1)\) and base of log is greater than 1, or \(\delta >1\) and base of log is less than 1, then
$$ -\frac{1}{\sum_{s=1}^{n}r_{s}^{\delta }} \sum_{s=1}^{n}r_{s}^{\delta } \log (r_{s})-\boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}}) \geq \frac{1}{\sum_{s=1}^{m}w_{s}^{\delta }}\sum_{s=1}^{m}w_{s}^{\delta } \log (w_{s}) -\frac{\sum_{s=1}^{m}w_{s}^{\delta }}{\sum_{s=1}^{n}r_{s}^{\delta }}\boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{w}}). $$
(58)
The inequality in (58) is reversed if either \(\delta \in [0, 1)\) and base of log is less than 1, or \(\delta > 1\) and base of log is greater than 1.
Proof
The proof is similar to that of Corollary 8. □
Zipf–Mandelbrot law
In [14] the authors contributed to the analysis of the Zipf–Mandelbrot law, which is defined as follows:
Definition 6
The Zipf–Mandelbrot law is a discrete probability distribution depending on three parameters: \(\mathcal{N} \in \{1, 2, \ldots \}\), \(\phi \in [0, \infty )\), and \(t > 0\), and is defined by
$$\begin{aligned} f(s; \mathcal{N}, \phi , t) : = \frac{1}{(s + \phi )^{t}\boldsymbol{\mathcal{H}}_{\mathcal{N}, \phi , t}}, \quad s = 1, \ldots , \mathcal{N}, \end{aligned}$$
where
$$\begin{aligned} \boldsymbol{\mathcal{H}}_{\mathcal{N}, \phi , t} = \sum_{\nu =1}^{\mathcal{N}} \frac{1}{(\nu + \phi )^{t}}. \end{aligned}$$
If the total mass of the law is taken over all of \(\mathbb{N}\), then for \(\phi \geq 0\), \(t> 1\), \(s \in \mathbb{N}\), the density function of the Zipf–Mandelbrot law becomes
$$\begin{aligned} f(s; \phi , t) = \frac{1}{(s + \phi )^{t}\boldsymbol{\mathcal{H}}_{ \phi , t}}, \end{aligned}$$
where
$$\begin{aligned} \boldsymbol{\mathcal{H}}_{\phi , t} = \sum_{\nu =1}^{\infty } \frac{1}{( \nu + \phi )^{t}}. \end{aligned}$$
For \(\phi = 0\), the Zipf–Mandelbrot law becomes Zipf’s law.
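A minimal sketch of the Zipf–Mandelbrot probability mass function (the helper name `zipf_mandelbrot_pmf` and the parameter values are illustrative assumptions):

```python
def zipf_mandelbrot_pmf(N, phi, t):
    """f(s; N, phi, t) = 1 / ((s + phi)**t * H_{N, phi, t}) for s = 1, ..., N."""
    H = sum(1.0 / (nu + phi) ** t for nu in range(1, N + 1))
    return [1.0 / ((s + phi) ** t * H) for s in range(1, N + 1)]

pmf = zipf_mandelbrot_pmf(N=10, phi=2.0, t=1.5)
print(sum(pmf))       # equals 1 up to rounding; phi = 0 recovers Zipf's law
```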
Conclusion 1
Assume \(\boldsymbol{\mathcal{G}}\).
Let \(\tilde{\mathbf{r}}\) and \(\tilde{\mathbf{w}}\) be Zipf–Mandelbrot laws. By Corollary 8(iii), if \(\delta \in [0, 1)\) and base of log is greater than 1, then
$$\begin{aligned} \boldsymbol{\mathcal{S}} - \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{r}}) =&-\sum_{s=1}^{n} \frac{1}{(s+k)^{v}{\boldsymbol{\mathcal{H}}_{\mathcal{N}, k, v}}}\log \biggl(\frac{1}{(s+k)^{v}{\boldsymbol{\mathcal{H}}_{\mathcal{N}, k, v}}} \biggr)- \frac{1}{1 - \delta } \log \Biggl(\frac{1}{\boldsymbol{\mathcal{H}}_{\mathcal{N}, k, v}^{\delta }}\sum _{s=1}^{n}\frac{1}{(s + k)^{\delta v}} \Biggr) \\ \geq & \tilde{\boldsymbol{\mathcal{S}}} - \boldsymbol{\mathcal{H}}_{\delta }(\tilde{\mathbf{w}}) \\ =& -\sum_{s=1}^{m} \frac{1}{(s+w)^{v} {\boldsymbol{\mathcal{H}}_{\mathcal{N}, w, v}}}\log \biggl(\frac{1}{(s+w)^{v} {\boldsymbol{\mathcal{H}}_{\mathcal{N}, w, v}}} \biggr)- \frac{1}{1 - \delta } \log \Biggl(\frac{1}{\boldsymbol{\mathcal{H}}_{\mathcal{N}, w, v}^{ \delta }}\sum _{s=1}^{m}\frac{1}{(s + w)^{\delta v}} \Biggr). \end{aligned}$$
The inequality is reversed if base of log is less than 1.