Generalization of the Levinson inequality with applications to information theory

In this paper, Levinson's inequality for 3-convex functions is generalized by using two Green functions. New Čebyšev-, Grüss- and Ostrowski-type bounds are found for the functionals involving data points of two types. Moreover, the main results are applied to information theory via the f-divergence, the Rényi divergence, the Rényi entropy, the Shannon entropy and the Zipf–Mandelbrot law.


Introduction and preliminaries
In [12], Levinson generalized Ky Fan's inequality to 3-convex functions as follows.
In [15], Mercer presented a notable result by replacing the condition of symmetric distribution of the points $x_i$ and $y_i$ with symmetric variances of the points $x_i$ and $y_i$. The second condition is weaker.
Theorem D Let $f$ be a 3-convex function on $[a, b]$ and let $p_k$ be positive such that $\sum_{k=1}^{n} p_k = 1$. Also let $x_k$, $y_k$ satisfy (3). Then (1) holds.
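As a numerical sanity check, the following minimal sketch tests the classical form of Levinson's inequality, assuming its standard statement: for $f$ with $f''' \ge 0$ on $(0, 2c)$ and points $0 < x_k \le c$, the weighted Jensen difference at the points $x_k$ does not exceed the one at the mirrored points $y_k = 2c - x_k$. The function name `jensen_gap` and the sample data are illustrative assumptions, not taken from the paper. The mirrored points also have the same weighted variance as the $x_k$, so the data are consistent with Mercer's weaker condition.

```python
import math

def jensen_gap(f, points, weights):
    """Weighted Jensen difference: sum(p_k * f(x_k)) / P - f(weighted mean)."""
    total = sum(weights)
    mean = sum(p * x for p, x in zip(weights, points)) / total
    return sum(p * f(x) for p, x in zip(weights, points)) / total - f(mean)

# Illustrative data (assumed): points in (0, c], mirrored about c.
c = 2.0
xs = [0.3, 0.9, 1.4, 1.8]
ps = [0.1, 0.4, 0.3, 0.2]
ys = [2 * c - x for x in xs]  # same weighted variance as xs, and max(xs) <= min(ys)

f = math.exp  # f''' = exp > 0, so f is 3-convex

gap_x = jensen_gap(f, xs, ps)
gap_y = jensen_gap(f, ys, ps)
print(gap_x, gap_y, gap_x <= gap_y)
```

Replacing `math.exp` by any other function with a nonnegative third derivative, for example `lambda t: t ** 3`, should preserve the ordering of the two gaps.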
On the other hand, the error function $e_F(t)$ can be represented in terms of the Green function $G_{F,n}(t, s)$ of the associated boundary value problem. The following result is given in [1].
and let $P_F$ be its 'two-point right focal' interpolating polynomial. Then, for $a \le a_1 < a_2 \le b$ and $0 \le p \le n - 2$, where $G_{F,n}(t, s)$ is the Green function defined by (7).
and let $P_F$ be its 'two-point right focal' interpolating polynomial for $a \le a_1 < a_2 \le b$. Then, for $n = 3$ and $p = 0$, (8) becomes (9), where $G_1(\cdot, s)$ is defined by (10). For $n = 3$ and $p = 1$, (8) becomes (11), where $G_2(\cdot, s)$ is defined by (12).

The presented work is organized as follows: in Sect. 2, Levinson's inequality for 3-convex functions is generalized by using the two Green functions defined by (10) and (12). In Sect. 3, new Čebyšev-, Grüss- and Ostrowski-type bounds are found for the functionals involving data points of two types. In Sect. 4, the main results are applied to information theory via the f-divergence, the Rényi divergence, the Rényi entropy, the Shannon entropy and the Zipf–Mandelbrot law.

Main results
First we give an identity involving the Jensen differences of two different types of data points. Then we give an equivalent form of the identity by using the Green functions defined by (10) and (12). Here $G_k(\cdot, s)$ $(k = 1, 2)$ are defined in (10) and (12), respectively.
Using (11) in (14) and following similar steps as in the proof of (i), we get (13).
then the following statements are equivalent: where $G_k(\cdot, s)$ are defined by (10) and (12). Then we can represent the function $f$ in the form (9). Now by means of some simple calculations we can write By the 3-convexity of $f$, we have $f^{(3)}(s) \ge 0$ for all $s \in I$. Hence, if (19) is valid for every $s \in I$, then it follows that (18) holds for every 3-convex function $f$. and $f^{(2)}(\zeta_2)$ have different signs in (17), then inequalities (18) and (19) are reversed.
Next we give a generalization of the Bullen-type inequality (for real weights) given in [2] (see also [11, 16]).
Proof By choosing $x_\rho$ and $y_\varrho$ such that conditions (20) and (21) hold in Theorem 2, we get the required result.
Next we give a generalized form (for real weights) of the Bullen-type inequality given in [17] (see also [16]).
Proof Using Theorem 2 with the conditions given in the statement, we get the required result.
In [15], Mercer made a notable contribution by replacing the condition (21) of symmetric distribution of the points $x_\rho$ and $y_\varrho$ with symmetric variances of the points $x_\rho$ and $y_\varrho$ for $\rho = 1, \ldots, n$ and $\varrho = 1, \ldots, m$.
So in the next result we use Mercer's condition (6), but for $\rho = \varrho$ and $m = n$.
Proof For positive weights, using (6) and (20) in Theorem 2, we get the required result.
Next we give results that rely on the generalization of the Levinson-type inequality given in [12] (see also [16]).

New bounds for Levinson's type functionals
Consider the Čebyšev functional for two Lebesgue integrable functions $f_1, f_2 : [\zeta_1, \zeta_2] \to \mathbb{R}$, where the integrals are assumed to exist.
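For reference, the Čebyšev functional is usually written in the following standard form. The symbol $\Theta$ and the interval endpoints $\zeta_1 < \zeta_2$ are notational assumptions here, chosen to match the endpoints used later in this section.

```latex
\Theta(f_1, f_2) =
  \frac{1}{\zeta_2 - \zeta_1} \int_{\zeta_1}^{\zeta_2} f_1(x) f_2(x)\,dx
  - \left( \frac{1}{\zeta_2 - \zeta_1} \int_{\zeta_1}^{\zeta_2} f_1(x)\,dx \right)
    \left( \frac{1}{\zeta_2 - \zeta_1} \int_{\zeta_1}^{\zeta_2} f_2(x)\,dx \right)
```

In words, it is the mean of the product minus the product of the means over $[\zeta_1, \zeta_2]$, so it vanishes whenever one of the two functions is constant.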
The constant $\frac{1}{\sqrt{2}}$ is the best possible.
is the best possible.
In the next result we construct the Čebyšev-type bound for our functional defined in (5).
Multiplying both sides of the above inequality by $(\zeta_2 - \zeta_1)$, using the estimate (29) and then applying the identity (13), we get (28).
In the next result, Grüss-type bounds are estimated.
Next we give Ostrowski-type bounds for the newly constructed functional defined in (5).
Proof Rearrange identity (13) in the form (34). Employing the classical Hölder inequality on the right-hand side of (34) yields (33).

Application to information theory
The Shannon entropy is the central concept of information theory and is often referred to as a measure of uncertainty. The entropy of a random variable is defined in terms of its probability distribution and can be shown to be a good measure of randomness or uncertainty. The Shannon entropy allows one to estimate the average minimum number of bits needed to encode a string of symbols, based on the alphabet size and the frequency of the symbols. Divergences between probability distributions have been introduced as measures of the difference between them. A variety of divergences exist, for instance the f-divergence (in particular, the Kullback–Leibler divergence, the Hellinger distance and the total variation distance), the Rényi divergence, the Jensen–Shannon divergence, and so forth (see [13, 21]). There are many papers dealing with inequalities and entropies; see, e.g., [8, 10, 20] and the references therein. Jensen's inequality plays a crucial role in some of these inequalities. However, Jensen's inequality deals with one type of data points, whereas Levinson's inequality deals with two types of data points.
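To make the notion of entropy concrete, here is a minimal sketch that computes the Shannon entropy $-\sum_i p_i \log p_i$ of a discrete distribution. The function name `shannon_entropy` and the two sample distributions are illustrative assumptions, not notation from the paper.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Shannon entropy -sum(p * log(p)); zero-probability terms contribute 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]  # maximal uncertainty over four symbols
skewed = [0.7, 0.1, 0.1, 0.1]       # one symbol dominates

print(shannon_entropy(uniform))
print(shannon_entropy(skewed))
```

With base 2 the result is measured in bits. The uniform distribution over four symbols needs two bits per symbol, while the skewed one needs fewer on average.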
Zipf's law is one of the fundamental laws in information science, and it has been used in linguistics. In 1932, Zipf observed that one can count how often each word appears in a text. If the words are ranked ($r$) according to their frequency of occurrence ($f$), then the product of these two numbers is a constant ($C$): $C = r \times f$. Apart from its use in information science and linguistics, Zipf's law is applied to city populations, solar flare intensity, website traffic, earthquake magnitudes, the sizes of moon craters, and so forth. In economics this distribution is known as the Pareto law, which analyzes the distribution of the wealthiest individuals in the community [6, p. 125]. These two laws are equivalent in the mathematical sense, yet they are applied in different contexts [7, p. 294].
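The rank-frequency relation $C = r \times f$ can be illustrated with a small sketch. The helper `rank_frequency` and the toy text are hypothetical; the text is constructed so that the word frequencies are 6, 3 and 2, which makes the product exactly constant. Real corpora follow the law only approximately.

```python
from collections import Counter

def rank_frequency(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    counts = Counter(text.lower().split())
    ranked = counts.most_common()
    return [(r, w, f, r * f) for r, (w, f) in enumerate(ranked, start=1)]

# Toy text (hypothetical) with word frequencies 6, 3 and 2.
text = " ".join(["the"] * 6 + ["of"] * 3 + ["and"] * 2)

for row in rank_frequency(text):
    print(row)
```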

Csiszár divergence
In [4, 5], Csiszár gave the following definition. (ii) where $\sum_{u=1}^{m} t_u$ and $y_u = \frac{w_u}{t_u}$ in Theorem 2, we get the required results.
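The Csiszár divergence can be sketched numerically, assuming its standard form $\sum_v k_v\, f(r_v / k_v)$ for positive tuples $r$ and $k$ and a convex function $f$. The names `csiszar_divergence` and `kl_generator` and the sample tuples are illustrative assumptions. With the generator $f(t) = t \log t$ the sum reduces to the Kullback–Leibler divergence, which the sketch compares against a direct computation.

```python
import math

def csiszar_divergence(f, r, k):
    """Csiszar f-divergence sum(k_v * f(r_v / k_v)) for positive tuples r and k."""
    if len(r) != len(k):
        raise ValueError("r and k must have the same length")
    if any(x <= 0 for x in r) or any(x <= 0 for x in k):
        raise ValueError("all entries must be positive")
    return sum(kv * f(rv / kv) for rv, kv in zip(r, k))

def kl_generator(t):
    """Convex generator f(t) = t * log(t), which yields the Kullback-Leibler divergence."""
    return t * math.log(t)

r = [0.5, 0.3, 0.2]
k = [0.4, 0.4, 0.2]

d = csiszar_divergence(kl_generator, r, k)
direct_kl = sum(rv * math.log(rv / kv) for rv, kv in zip(r, k))
print(d, direct_kl)
```

Other convex generators give other members of the family. For example, $f(t) = |t - 1|$ yields the sum of absolute differences $\sum_v |r_v - k_v|$, which is the total variation distance up to a normalising factor.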

Zipf-Mandelbrot law
In [14], the authors contributed to the analysis of the Zipf–Mandelbrot law, which is defined as follows.
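As a minimal sketch, the Zipf–Mandelbrot law can be evaluated in its standard form, in which the probability of rank $s \in \{1, \ldots, N\}$ is $1 / ((s + q)^t H_{N,q,t})$ with $H_{N,q,t} = \sum_{j=1}^{N} (j + q)^{-t}$, $q \ge 0$ and $t > 0$. The function name `zipf_mandelbrot_pmf` and the parameter names `n`, `q`, `t` are assumptions made for this illustration.

```python
def zipf_mandelbrot_pmf(n, q, t):
    """Zipf-Mandelbrot probabilities for ranks 1..n with parameters q >= 0 and t > 0."""
    if n < 1 or q < 0 or t <= 0:
        raise ValueError("require n >= 1, q >= 0 and t > 0")
    weights = [1.0 / (s + q) ** t for s in range(1, n + 1)]
    normaliser = sum(weights)  # generalised harmonic number H_{n,q,t}
    return [w / normaliser for w in weights]

pmf = zipf_mandelbrot_pmf(n=5, q=1.5, t=1.2)
print(pmf, sum(pmf))
```

Setting `q = 0` recovers the ordinary Zipf law, whose probabilities are proportional to $1 / s^t$.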

If $1 < \delta$ and the base of the log is greater than 1, then $\sum_{v=1}^{n} r_v \log(r_v) + H_\delta(r) \ge \sum_{u=1}^{m}$