# Refinements of the integral Jensen’s inequality generated by finite or infinite permutations

## Abstract

Many papers deal with applications of the so-called cyclic refinement of the discrete Jensen’s inequality. A significant generalization of the cyclic refinement, based on combinatorial considerations, has recently been discovered by the author. In the present paper we give the integral versions of these results. On the one hand, a new method to refine the integral Jensen’s inequality is developed. On the other hand, the result contains some recent refinements of the integral Jensen’s inequality as elementary cases. Finally, some applications to the Fejér inequality (especially the Hermite–Hadamard inequality), quasi-arithmetic means, and f-divergences are presented.

## Introduction

The significance of convex functions is largely due to Jensen’s inequality. A real function f defined on an interval $$C\subset \mathbb{R}$$ is called convex if it satisfies

$$f \bigl( \alpha t_{1}+ ( 1-\alpha ) t_{2} \bigr) \leq \alpha f ( t_{1} ) + ( 1-\alpha ) f ( t_{2} )$$

for all $$t_{1},t_{2}\in C$$ and all $$\alpha \in [ 0,1 ]$$.

Let the set I denote either $$\{ 1,\ldots,n \}$$ for some $$n\geq 1$$ or $$\mathbb{N}_{+}$$. We say that the numbers $$( p_{i} ) _{i\in I}$$ represent a discrete probability distribution if $$p_{i}\geq 0$$ ($$i\in I$$) and $$\sum_{i\in I}p_{i}=1$$. It is called positive if $$p_{i}>0$$ ($$i\in I$$). A permutation π of I refers to a bijection from I onto itself.

The following discrete and integral versions of Jensen’s inequality are well known.

### Theorem 1

(discrete Jensen’s inequalities, see  and )

(a) Let C be a convex subset of a real vector space V, and let $$f:C\rightarrow \mathbb{R}$$ be a convex function. If $$p_{1},\ldots,p_{n}$$ represent a discrete probability distribution and $$v_{1},\ldots,v_{n}\in C$$, then

$$f \Biggl( \sum_{i=1}^{n}p_{i}v_{i} \Biggr) \leq \sum_{i=1}^{n}p_{i}f ( v_{i} ) .$$
(1)

(b) Let C be a closed convex subset of a real Banach space V, and let $$f:C\rightarrow \mathbb{R}$$ be a convex function. If $$p_{1},p_{2},\ldots$$ represent a discrete probability distribution and $$v_{1},v_{2},\ldots \in C$$ such that the series $$\sum_{i=1}^{\infty }p_{i}v_{i}$$ and $$\sum_{i=1}^{\infty }p_{i}f ( v_{i} )$$ are absolutely convergent, then

$$f \Biggl( \sum_{i=1}^{\infty }p_{i}v_{i} \Biggr) \leq \sum_{i=1}^{\infty }p_{i}f ( v_{i} ) .$$
(2)

### Theorem 2

(integral Jensen’s inequality, see )

Let φ be an integrable function on a probability space $$( X,\mathcal{A},\mu )$$ taking values in an interval $$C\subset \mathbb{R}$$. Then $${\int _{X}} \varphi \,d\mu$$ lies in C. If f is a convex function on C such that $$f \circ \varphi$$ is μ-integrable, then

$$f \biggl( { \int _{X}} \varphi \,d\mu \biggr) \leq { \int _{X}} f\circ \varphi \,d\mu .$$
(3)

### Remark 3

It follows that Theorem 1 (b) can be generalized in the case $$V=\mathbb{R}$$: if $$C\subset \mathbb{R}$$ is an interval (not necessarily closed) and the other conditions of the statement are satisfied, then $$\sum_{i=1}^{\infty }p_{i}v_{i}$$ lies in C and (2) holds.

There are many papers dealing with refinements of discrete and integral Jensen’s inequalities (see the book  and the references therein).

In papers  and  there are special refinements of the discrete Jensen’s inequality of the form in Theorem 1 (a) (so-called cyclic refinements). These led the author to the following refinement of the discrete Jensen’s inequality which is a significant generalization of the previously mentioned results.

### Theorem 4

(see )

(a) Let $$k,n\geq 2$$ be integers, and let $$p_{1},\ldots,p_{n}$$ and $$\lambda _{1},\ldots,\lambda _{k}$$ represent positive probability distributions. For each $$j=1,\ldots,k$$, let $$\pi _{j}$$ be a permutation of the set $$\{ 1,\ldots,n \}$$. If C is a convex subset of a real vector space V, $$f:C\rightarrow \mathbb{R}$$ is a convex function, and $$v_{1},\ldots,v_{n}\in C$$, then

\begin{aligned} f \Biggl( \sum_{i=1}^{n}p_{i}v_{i} \Biggr) \leq {}&C_{per}=C_{per} ( f,\mathbf{v,p},\boldsymbol{\lambda},\boldsymbol{\pi} ) \\ :={}&\sum_{i=1}^{n} \Biggl( \sum _{j=1}^{k}\lambda _{j}p_{ \pi _{j} ( i ) } \Biggr) f \biggl( \frac{\sum_{j=1}^{k}\lambda _{j}p_{\pi _{j} ( i ) }v_{\pi _{j} ( i ) }}{\sum_{j=1}^{k}\lambda _{j}p_{\pi _{j} ( i ) }} \biggr) \leq \sum _{i=1}^{n}p_{i}f ( v_{i} ) . \end{aligned}

(b) Let the set J denote either $$\{ 1,\ldots,k \}$$ for some $$k\geq 2$$ or $$\mathbb{N}_{+}$$. Let $$p_{1},p_{2},\ldots$$ and $$( \lambda _{j} ) _{j\in J}$$ represent positive probability distributions. For each $$j\in J$$, let $$\pi _{j}$$ be a permutation of the set $$\mathbb{N}_{+}$$. If C is a closed convex subset of a real Banach space $$( V, \Vert \cdot \Vert )$$, $$f:C\rightarrow \mathbb{R}$$ is a convex function, and $$v_{1},v_{2},\ldots \in C$$ such that the series $$\sum_{i=1}^{\infty }p_{i}v_{i}$$ and $$\sum_{i=1}^{\infty }p_{i}f ( v_{i} )$$ are absolutely convergent, then

\begin{aligned} f \Biggl( \sum_{i=1}^{\infty }p_{i}v_{i} \Biggr) \leq {}&C_{per}=C_{per} ( f,\mathbf{v,p}, \boldsymbol{\lambda},\boldsymbol{\pi} ) \\ :={}&\sum_{i=1}^{\infty } \biggl( \sum _{j\in J}\lambda _{j}p_{ \pi _{j} ( i ) } \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}p_{\pi _{j} ( i ) }v_{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}p_{\pi _{j} ( i ) }} \biggr) \leq \sum_{i=1}^{\infty }p_{i}f ( v_{i} ) . \end{aligned}
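For the finite case in part (a), the two-sided bound can be checked numerically. The following Python sketch (with made-up sample data; the helper name `c_per` is ours, not the paper’s) evaluates the middle term $$C_{per}$$ for random permutations and verifies the chain of inequalities for the convex function $$f(t)=t^{2}$$:

```python
import random

def c_per(f, v, p, lam, perms):
    """Middle term C_per of Theorem 4 (a): for each i, a lambda- and
    p-weighted mean of the permuted v's is plugged into f and re-weighted."""
    total = 0.0
    for i in range(len(v)):
        w = sum(lam[j] * p[pi[i]] for j, pi in enumerate(perms))
        m = sum(lam[j] * p[pi[i]] * v[pi[i]] for j, pi in enumerate(perms)) / w
        total += w * f(m)
    return total

random.seed(0)
n, k = 6, 3
f = lambda t: t * t                                     # a convex function on R
v = [random.uniform(-1.0, 1.0) for _ in range(n)]
p = [random.uniform(0.1, 1.0) for _ in range(n)]
p = [x / sum(p) for x in p]                             # positive probability distribution
lam = [0.5, 0.3, 0.2]                                   # positive, sums to 1
perms = [random.sample(range(n), n) for _ in range(k)]  # permutations of {0,...,n-1}

lhs = f(sum(pw * vw for pw, vw in zip(p, v)))
mid = c_per(f, v, p, lam, perms)
rhs = sum(pw * f(vw) for pw, vw in zip(p, v))
assert lhs <= mid + 1e-12 <= rhs + 1e-12                # f(sum p_i v_i) <= C_per <= sum p_i f(v_i)
```

The identity permutation for every j collapses the middle term to the right-hand side, while a single block (k = 1, π₁ = id) recovers the plain discrete Jensen inequality.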

In the paper  we obtain refinements of the integral Jensen’s inequality by using cyclic refinements of the discrete Jensen’s inequality, but those results are not natural counterparts of the discrete ones.

In this paper we give the integral version of Theorem 4 when $$V=\mathbb{R}$$. On the one hand, a new method to refine the integral Jensen’s inequality is developed (totally different from earlier techniques, see e.g.  and ), and our result contains Theorem 4 when $$V=\mathbb{R}$$. On the other hand, we obtain from it some recent refinements of the integral Jensen’s inequality (see , , ) as elementary cases. Finally, some applications to the Fejér inequality (especially the Hermite–Hadamard inequality), quasi-arithmetic means, and f-divergences are presented.

## Preliminary result

We give an extension of Theorem 4 when $$V=\mathbb{R}$$.

### Proposition 5

Let the index set I denote either $$\{ 1,\ldots,n \}$$ for some $$n\geq 1$$ or $$\mathbb{N}_{+}$$. Let the index set J denote either $$\{ 1,\ldots,k \}$$ for some $$k\geq 1$$ or $$\mathbb{N}_{+}$$. For each $$j\in J$$, let $$\pi _{j}$$ be a permutation of the set I. Let $$( p_{i} ) _{i\in I}$$ and $$( \lambda _{j} ) _{j\in J}$$ represent positive probability distributions. If C is an interval in $$\mathbb{R}$$, $$f:C\rightarrow \mathbb{R}$$ is a convex function, and $$( v_{i} ) _{i\in I}$$ is a sequence from C such that the series $$\sum_{i\in I}p_{i}v_{i}$$ and $$\sum_{i\in I}p_{i}f ( v_{i} )$$ are absolutely convergent, then

\begin{aligned} f \biggl( \sum_{i\in I}p_{i}v_{i} \biggr) \leq {}&C_{per}=C_{per} ( f,\mathbf{v,p},\boldsymbol{\lambda},\boldsymbol{\pi} ) \\ :={}&\sum_{i\in I} \biggl( \sum _{j\in J}\lambda _{j}p_{ \pi _{j} ( i ) } \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}p_{\pi _{j} ( i ) }v_{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}p_{\pi _{j} ( i ) }} \biggr) \leq \sum_{i\in I}p_{i}f ( v_{i} ) . \end{aligned}

### Proof

By using Remark 3, we can copy the proof of Theorem 4 in . □

The positive part $$f^{+}$$ and the negative part $$f^{-}$$ of a real-valued function f are defined in the usual way.

We need another result about integrability.

### Lemma 6

Let φ be an integrable function on a probability space $$( X,\mathcal{A},\mu )$$ taking values in an interval $$C\subset \mathbb{R}$$. If f is a convex function on C such that $$f\circ \varphi$$ is μ-integrable, then there exists a convex function g on C such that $$\vert f \vert \leq g$$ and $$g\circ \varphi$$ is μ-integrable too.

### Proof

Along with the function f, the function $$f^{+}$$ is also convex. Since $$f\circ \varphi$$ is μ-integrable, $$f^{+}\circ \varphi$$ is also μ-integrable.

The convexity of f on C shows that there is an affine function $$l:C\rightarrow \mathbb{R}$$, $$l ( t ) =at+b$$ for which

$$f ( t ) \geq at+b,\quad t\in C.$$

Then

\begin{aligned} f^{-} ( t ) ={}&\max \bigl( -f ( t ) ,0 \bigr) \leq \max ( -at-b,0 ) =l^{-} ( t ) \\ \leq {}&\vert at+b \vert = \bigl\vert l ( t ) \bigr\vert ,\quad t\in C. \end{aligned}

Since φ is μ-integrable and l is affine, $$\vert l \vert \circ \varphi$$ is also μ-integrable.

Using that the function $$\vert l \vert$$ is convex, $$\vert f \vert =f^{+}+f^{-}$$, and the sum of two convex functions is also convex, it follows from the above that $$g:=f^{+}+ \vert l \vert$$ can be chosen.

The proof is complete. □
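The construction in the proof can be illustrated numerically. In the sketch below (sample data assumed), $$f(t)=t^{2}-1$$ on $$C=[0,2]$$ changes sign, l is a supporting line of f at $$t_{0}=1$$, and we check that $$g=f^{+}+ \vert l \vert$$ dominates $$\vert f \vert$$ on a grid:

```python
# Numerical sanity check of Lemma 6's proof for a sample convex f on C = [0, 2]:
# pick a supporting line l of f; then g := f^+ + |l| is convex and |f| <= g.
f = lambda t: t * t - 1.0                 # convex on [0, 2], changes sign at t = 1
l = lambda t: 2.0 * t - 2.0               # supporting line of f at t0 = 1 (slope 2)
f_plus = lambda t: max(f(t), 0.0)         # positive part of f
g = lambda t: f_plus(t) + abs(l(t))       # the dominating convex function

ts = [2.0 * i / 1000.0 for i in range(1001)]          # grid on [0, 2]
assert all(f(t) >= l(t) - 1e-12 for t in ts)          # l minorizes f: f - l = (t-1)^2 >= 0
assert all(abs(f(t)) <= g(t) + 1e-12 for t in ts)     # |f| = f^+ + f^- <= f^+ + |l| = g
```

The second assertion mirrors the chain $$f^{-}\leq l^{-}\leq \vert l \vert$$ used in the proof.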

We shall use the following Fubini theorem for double series.

### Theorem 7

(see )

Let $$a ( i,j ) \in \mathbb{R}$$ ($$( i,j ) \in \mathbb{N}_{+}\times \mathbb{N}_{+}$$). If either of the two sums

$$\sum_{i=1}^{\infty } \Biggl( \sum _{j=1}^{\infty } \bigl\vert a ( i,j ) \bigr\vert \Biggr) ,\quad\quad \sum_{j=1}^{\infty } \Biggl( \sum _{i=1}^{\infty } \bigl\vert a ( i,j ) \bigr\vert \Biggr)$$

is finite, then both of the series

$$\sum_{i=1}^{\infty } \Biggl( \sum _{j=1}^{\infty }a ( i,j ) \Biggr) ,\quad\quad \sum _{j=1}^{\infty } \Biggl( \sum_{i=1}^{\infty }a ( i,j ) \Biggr)$$

are absolutely convergent and both have the same sum.

## Main results

We need the following hypotheses.

($$\mathrm{H}_{1}$$):

Let $$( X,\mathcal{A},\mu )$$ be a probability space.

($$\mathrm{H}_{2}$$):

Let the index set I denote either $$\{ 1,\ldots,n \}$$ for some $$n\geq 1$$ or $$\mathbb{N}_{+}$$. Let the index set J denote either $$\{ 1,\ldots,k \}$$ for some $$k\geq 1$$ or $$\mathbb{N}_{+}$$.

($$\mathrm{H}_{3}$$):

Let $$( \lambda _{j} ) _{j\in J}$$ represent a positive probability distribution. For each $$j\in J$$, let $$\pi _{j}$$ be a permutation of the set I.

($$\mathrm{H}_{4}$$):

Suppose that we are given a sequence $$\mathfrak{M}_{I}= ( \mu _{i} ) _{i\in I}$$ of measures on $$\mathcal{A}$$ with $$\mu _{i} ( X ) >0$$ for all $$i\in I$$ and $$\sum_{i\in I}\mu _{i}=\mu$$.

($$\mathrm{H}_{5}$$):

Suppose that we are given a sequence $$\mathfrak{S}_{I}= ( A_{i} ) _{i\in I}$$ of pairwise disjoint sets $$A_{i}\in \mathcal{A}$$ with $$\mu ( A_{i} ) >0$$ for all $$i\in I$$ and $$\bigcup_{i\in I}A_{i}=X$$.

### Theorem 8

Assume ($$\mathrm{H}_{1}$$)–($$\mathrm{H}_{3}$$) and ($$\mathrm{H}_{4}$$). Let $$C\subset \mathbb{R}$$ be an interval, and let $$f:C\rightarrow \mathbb{R}$$ be a convex function. Let φ be a μ-integrable function on X taking values in C such that $$f\circ \varphi$$ is also μ-integrable on X. Then

\begin{aligned} f \biggl( \int _{X}\varphi \,d\mu \biggr) \leq {}&C_{mes}=C_{mes} ( f,\varphi ,\boldsymbol{\lambda},\boldsymbol{\pi},\mathfrak{M}_{I} ) \\ :={}&\sum_{i\in I} \biggl( \sum _{j\in J}\lambda _{j} \mu _{\pi _{j} ( i ) } ( X ) \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{X}\varphi \,d\mu _{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}\mu _{\pi _{j} ( i ) } ( X ) } \biggr) \leq \int _{X}f\circ \varphi \,d \mu . \end{aligned}

### Proof

This can be obtained by an application of Proposition 5 to the parameters

$$p_{i}:=\mu _{i} ( X ) \quad \text{and}\quad v_{i}:= \frac{1}{\mu _{i} ( X ) } \int _{X}\varphi \,d\mu _{i},\quad i\in I.$$

Indeed, $$( p_{i} ) _{i\in I}$$ represents a positive probability distribution, and by the integral Jensen’s inequality, $$v_{i}\in C$$ ($$i\in I$$).

Next we show that the series $$\sum_{i\in I}p_{i}v_{i}$$ and $$\sum_{i\in I}p_{i}f ( v_{i} )$$ are absolutely convergent.

Since φ is a μ-integrable function on X and $$\sum_{i\in I}\mu _{i}=\mu$$,

$$\sum_{i\in I}p_{i} \vert v_{i} \vert =\sum_{i\in I} \biggl\vert \int _{X}\varphi \,d\mu _{i} \biggr\vert \leq \sum _{i\in I} \int _{X} \vert \varphi \vert \,d\mu _{i}= \int _{X} \vert \varphi \vert \,d\mu < \infty .$$

By Lemma 6, there exists a convex function g on C such that

$$\vert f \vert \leq g\quad \text{and}\quad g\circ \varphi \text{ is }\mu \text{-integrable.}$$
(4)

Another application of the integral Jensen’s inequality and $$\sum_{i\in I}\mu _{i}=\mu$$ now show that

\begin{aligned} \sum_{i\in I}p_{i} \bigl\vert f ( v_{i} ) \bigr\vert ={}&\sum_{i\in I}\mu _{i} ( X ) \biggl\vert f \biggl( \frac{1}{\mu _{i} ( X ) } \int _{X}\varphi \,d\mu _{i} \biggr) \biggr\vert \\ \leq {}&\sum_{i\in I}\mu _{i} ( X ) g \biggl( \frac{1}{\mu _{i} ( X ) } \int _{X}\varphi \,d\mu _{i} \biggr) \leq \sum _{i\in I} \int _{X}g\circ \varphi \,d \mu _{i}= \int _{X}g\circ \varphi \,d\mu < \infty . \end{aligned}

We can see that the conditions of Proposition 5 hold, and therefore, by applying it, we obtain

\begin{aligned} f \biggl( \int _{X}\varphi \,d\mu \biggr)&=f \biggl( \sum _{i\in I}p_{i}v_{i} \biggr) \leq \sum _{i\in I} \biggl( \sum_{j\in J} \lambda _{j}p_{\pi _{j} ( i ) } \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}p_{\pi _{j} ( i ) }v_{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}p_{\pi _{j} ( i ) }} \biggr) \\ &=\sum_{i\in I} \biggl( \sum _{j\in J}\lambda _{j}\mu _{ \pi _{j} ( i ) } ( X ) \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{X}\varphi \,d\mu _{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}\mu _{\pi _{j} ( i ) } ( X ) } \biggr) \leq \sum_{i\in I}p_{i}f ( v_{i} ) \\ &=\sum_{i\in I}\mu _{i} ( X ) f \biggl( \frac{1}{\mu _{i} ( X ) } \int _{X}\varphi \,d\mu _{i} \biggr) . \end{aligned}
(5)

As a final step, we can apply the integral Jensen’s inequality in (5).

The proof is complete. □
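Theorem 8 can be tested on a finite probability space, where each measure is a vector of point masses and the splitting $$\sum_{i\in I}\mu _{i}=\mu$$ is done pointwise. A minimal Python sketch with assumed sample data (all variable names are ours):

```python
import random
random.seed(1)

# Finite probability space X = {0, ..., m-1}; mu is given by point masses.
m, n, k = 5, 3, 2
f = lambda t: abs(t)                                 # convex on R
phi = [random.uniform(-2.0, 2.0) for _ in range(m)]
mu = [1.0 / m] * m

# Split mu into n sub-measures mu_i with sum_i mu_i = mu (hypothesis (H4)):
# at each point, distribute the mass by random convex weights.
split = []
for x in range(m):
    w = [random.random() for _ in range(n)]
    s = sum(w)
    split.append([wi / s for wi in w])
mu_i = [[mu[x] * split[x][i] for x in range(m)] for i in range(n)]

lam = [0.6, 0.4]
perms = [random.sample(range(n), n) for _ in range(k)]

def mass(i):        # mu_i(X)
    return sum(mu_i[i])

def integral(i):    # int_X phi d(mu_i)
    return sum(phi[x] * mu_i[i][x] for x in range(m))

lhs = f(sum(phi[x] * mu[x] for x in range(m)))       # f(int_X phi dmu)
c_mes = 0.0
for i in range(n):
    w = sum(lam[j] * mass(pi[i]) for j, pi in enumerate(perms))
    num = sum(lam[j] * integral(pi[i]) for j, pi in enumerate(perms))
    c_mes += w * f(num / w)
rhs = sum(f(phi[x]) * mu[x] for x in range(m))       # int_X f o phi dmu
assert lhs <= c_mes + 1e-12 <= rhs + 1e-12
```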

A useful consequence of the previous theorem is the next result.

### Corollary 9

Assume ($$\mathrm{H}_{1}$$)–($$\mathrm{H}_{3}$$) and ($$\mathrm{H}_{5}$$). Let $$C\subset \mathbb{R}$$ be an interval, and let $$f:C\rightarrow \mathbb{R}$$ be a convex function. Let φ be a μ-integrable function on X taking values in C such that $$f\circ \varphi$$ is also μ-integrable on X. Then

\begin{aligned} f \biggl( \int _{X}\varphi \,d\mu \biggr) \leq{}& C_{set}=C_{set} ( f,\varphi ,\boldsymbol{\lambda},\boldsymbol{\pi},\mathfrak{S}_{I} ) \\ :={}&\sum_{i\in I} \biggl( \sum _{j\in J}\lambda _{j} \mu ( A_{\pi _{j} ( i ) } ) \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{A_{\pi _{j} ( i ) }}\varphi \,d\mu }{\sum_{j\in J}\lambda _{j}\mu ( A_{\pi _{j} ( i ) } ) } \biggr) \leq \int _{X}f\circ \varphi \,d\mu . \end{aligned}
(6)

### Proof

Let the measure $$\mu _{i}$$ ($$i\in I$$) be defined on $$\mathcal{A}$$ by

$$\mu _{i} ( A ) :=\mu ( A\cap A_{i} ) ,\quad A \in \mathcal{A},$$

and then apply Theorem 8.

The proof is complete. □

## Discussion

First we study the relationship between Theorem 8 and Proposition 5.

Assume ($$\mathrm{H}_{2}$$) and ($$\mathrm{H}_{3}$$), and let $$( p_{i} ) _{i\in I}$$ represent a positive probability distribution. Define the measure μ on the power set $$P ( I )$$ of I by

$$\mu :=\sum_{i\in I}p_{i}\varepsilon _{i},$$

where $$\varepsilon _{i}$$ ($$i\in I$$) is the unit mass at i on $$P ( I )$$, and use the measure space $$( I,P ( I ) ,\mu )$$ in ($$\mathrm{H}_{1}$$). Let $$C\subset \mathbb{R}$$ be an interval, $$f:C\rightarrow \mathbb{R}$$ be a convex function, and $$( v_{i} ) _{i\in I}$$ be a sequence from C such that the series $$\sum_{i\in I}p_{i}v_{i}$$ and $$\sum_{i\in I}p_{i}f ( v_{i} )$$ are absolutely convergent. Define the function φ on I by

$$\varphi ( i ) :=v_{i}.$$

Finally, choose $$A_{i}:= \{ i \}$$ ($$i\in I$$).

It is easy to check that under these conditions Corollary 9 is equivalent to Proposition 5.

Now we compare our main result with some recent refinements of the integral Jensen’s inequality.

Let $$( X,\mathcal{A},\nu )$$ be a measure space with $$\nu ( X ) \in \mathopen] 0,\infty \mathclose]$$. For a ν-integrable weight w that is positive ν-a.e., consider the Lebesgue space

$$L_{w} ( X,\nu ) := \biggl\{ \varphi :X\rightarrow \mathbb{R}, \varphi \text{ is }\nu \text{-measurable and } \int _{X} \vert \varphi \vert w\,d\nu < \infty \biggr\} .$$

For a ν-integrable weight w that is positive ν-a.e. and a given $$n\geq 2$$, we consider the set $$\mathfrak{B}_{k} ( w )$$ of all n-tuples of ν-integrable, ν-a.e. positive weights $$\overline{w}= ( w_{1},\ldots,w_{n} )$$ with the property that $$\sum_{j=1}^{n}w_{j}=w$$.

The next result can be found in .

### Theorem 10

(see , Theorem 2.1)

Let $$f: [ m,M ] \rightarrow \mathbb{R}$$ be a convex function, $$\varphi :X\rightarrow [ m,M ]$$ be a ν-measurable function such that φ, $$f\circ \varphi \in L_{w} ( X,\nu )$$. Then, for any $$\overline{w}\in \mathfrak{B}_{k} ( w )$$, we have

$$f \biggl( \frac{\int _{X}\varphi w\,d\nu }{\int _{X}w\,d\nu } \biggr) \leq \frac{1}{\int _{X}w\,d\nu }\sum _{i=1}^{n}f \biggl( \frac{\int _{X}\varphi w_{i}\,d\nu }{\int _{X}w_{i}\,d\nu } \biggr) \int _{X}w_{i}\,d\nu \leq \frac{\int _{X} ( f\circ \varphi ) w\,d\nu }{\int _{X}w\,d\nu },$$

where $$n\geq 2$$.

### Remark 11

Let the measures $$\mu _{i}$$ ($$i\in I$$) and μ be defined on $$\mathcal{A}$$ by

$$\mu _{i} ( A ) :=\frac{1}{\int _{X}w\,d\nu } \int _{A}w_{i}\,d\nu ,\quad A\in \mathcal{A},$$

and by

$$\mu ( A ) :=\frac{1}{\int _{X}w\,d\nu } \int _{A}w\,d\nu ,\quad A\in \mathcal{A}.$$
(7)

Then $$\mu _{i} ( X ) >0$$ ($$i\in I$$) and $$\mu =\sum_{i=1}^{n}\mu _{i}$$. By choosing $$k=1$$ (thus $$\lambda _{1}=1$$), we can see that Theorem 10 is a simple consequence of Theorem 8.

We say that the family of measurable sets $$F_{n} ( X ) = \{ A_{i} \} _{i=1,\ldots,n}$$ is an n-division for X if $$X=\bigcup_{i=1}^{n}A_{i}$$, $$A_{i}\cap A_{j}=\emptyset$$ for any $$i,j\in \{ 1,\ldots,n \}$$ with $$i\neq j$$ and $$\nu ( A_{i} ) >0$$ for any $$i\in \{ 1,\ldots,n \}$$. For given $$n\geq 2$$, we denote by $$\mathfrak{D}_{n} ( X )$$ the set of all n-divisions of X.

The following result appears in .

### Theorem 12

(see , Theorem 2.1)

Let $$f: [ m,M ] \rightarrow \mathbb{R}$$ be a convex function, $$\varphi :X\rightarrow [ m,M ]$$ be a ν-measurable function such that φ, $$f\circ \varphi \in L_{w} ( X,\nu )$$. Then, for any $$F_{n} ( X ) \in \mathfrak{D}_{n} ( X )$$, we have

$$f \biggl( \frac{\int _{X}\varphi w\,d\nu }{\int _{X}w\,d\nu } \biggr) \leq \frac{1}{\int _{X}w\,d\nu }\sum _{i=1}^{n}f \biggl( \frac{\int _{A_{i}}\varphi w\,d\nu }{\int _{A_{i}}w\,d\nu } \biggr) \int _{A_{i}}w\,d\nu \leq \frac{\int _{X} ( f\circ \varphi ) w\,d\nu }{\int _{X}w\,d\nu },$$

where $$n\geq 2$$.

### Remark 13

(a) Define the measure μ on $$\mathcal{A}$$ by (7). By choosing $$k=1$$ (thus $$\lambda _{1}=1$$), we can see that Theorem 12 is a simple consequence of Corollary 9. Moreover, it follows that Theorem 12 is contained in Theorem 10.

(b) The main result Theorem 2.1 in  is the special case of Corollary 9 when $$n=2$$ and $$k=1$$.

## Applications

Let $$[ a,b ] \subset \mathbb{R}$$ ($$a< b$$). The σ-algebra of Lebesgue-measurable subsets of $$\mathbb{R}$$ is denoted by $$\mathcal{L}$$, and λ denotes the Lebesgue measure on $$\mathcal{L}$$. Assume that $$f: [ a,b ] \rightarrow \mathbb{R}$$ is a convex function and $$g: [ a,b ] \rightarrow \mathopen[ 0,\infty \mathclose[$$ is a Lebesgue-integrable function which is symmetric to $$\frac{a+b}{2}$$. The classical Fejér inequality (see ) says

$$f \biggl( \frac{a+b}{2} \biggr) \int _{a}^{b}g\leq { \int _{a}^{b}} fg\leq \frac{f ( a ) +f ( b ) }{2} \int _{a}^{b}g.$$
(8)

This is a weighted generalization of the Hermite–Hadamard inequality (see ) which has the form

$$f \biggl( \frac{a+b}{2} \biggr) \leq \frac{1}{b-a}{ \int _{a}^{b}} f\leq \frac{f ( a ) +f ( b ) }{2}.$$
(9)
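Inequality (9) is easy to verify numerically for a concrete convex function; the sketch below (sample data of our choosing) uses $$f=\exp$$ on $$[0,1]$$ and approximates the integral mean by the midpoint rule:

```python
import math

# Hermite-Hadamard check: f((a+b)/2) <= (1/(b-a)) int_a^b f <= (f(a)+f(b))/2.
f = math.exp                              # convex on all of R
a, b = 0.0, 1.0
N = 20000
# midpoint-rule approximation of (1/(b-a)) int_a^b f
mean = sum(f(a + (i + 0.5) * (b - a) / N) for i in range(N)) / N

assert f((a + b) / 2) <= mean <= (f(a) + f(b)) / 2
```

Here the three quantities are approximately 1.6487, 1.7183, and 1.8591, in agreement with (9).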

By applying our main result, we can obtain a refinement of the left-hand side of the Fejér inequality. There are refinements of the Fejér inequality (see e.g. ), but the next result provides a totally different refinement.

### Proposition 14

Assume ($$\mathrm{H}_{2}$$) and ($$\mathrm{H}_{3}$$). Let $$[ a,b ] \subset \mathbb{R}$$ ($$a< b$$), and consider the measure space $$( [ a,b ] ,\mathcal{L},\lambda )$$ in ($$\mathrm{H}_{1}$$). Let $$f: [ a,b ] \rightarrow \mathbb{R}$$ be a convex function and $$g: [ a,b ] \rightarrow \mathopen[ 0,\infty \mathclose[$$ be a Lebesgue-integrable function which is symmetric to $$\frac{a+b}{2}$$.

(a) If $$\mu :=g\lambda$$, where $$g\lambda$$ is the measure on $$\mathcal{L}$$ having density g with respect to λ, and $$\mu _{i}$$ ($$i\in I$$) are measures on $$\mathcal{L}$$ such that $$\mu _{i} ( [ a,b ] ) >0$$ for all $$i\in I$$ and $$\sum_{i\in I}\mu _{i}=\mu$$, then

\begin{aligned} f \biggl( \frac{a+b}{2} \biggr) \int _{a}^{b}g&\leq \sum _{i\in I} \biggl( \sum_{j\in J}\lambda _{j}\mu _{\pi _{j} ( i ) } \bigl( [ a,b ] \bigr) \biggr) \\ &\quad{} \cdot f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{ [ a,b ] }t\,d\mu _{\pi _{j} ( i ) } ( t ) }{\sum_{j\in J}\lambda _{j}\mu _{\pi _{j} ( i ) } ( [ a,b ] ) } \biggr) \leq { \int _{a}^{b}} fg. \end{aligned}

(b) If $$g_{i}: [ a,b ] \rightarrow \mathopen[ 0,\infty \mathclose[$$ ($$i\in I$$) is a Lebesgue-integrable function which is symmetric to $$\frac{a+b}{2}$$, and $$\sum_{i\in I}g_{i}=g$$, then

$$f \biggl( \frac{a+b}{2} \biggr) \int _{a}^{b}g\leq f \biggl( \frac{a+b}{2} \biggr) \sum_{i\in I} \biggl( \sum _{j\in J} \lambda _{j}\int _{a}^{b}g_{\pi _{j} ( i ) } \biggr) \leq { \int _{a}^{b}} fg.$$

### Proof

(a) The result follows immediately from Theorem 8 by choosing $$\varphi : [ a,b ] \rightarrow \mathbb{R}$$, $$\varphi ( t ) =t$$.

(b) It comes from (a) by some easy calculations.

The proof is complete. □

### Remark 15

It is worth mentioning that in part (a) the measures $$\mu _{i}$$ need not be symmetric to $$\frac{a+b}{2}$$.

By applying the previous result, we obtain refinements of the Hermite–Hadamard inequality.

### Proposition 16

Assume ($$\mathrm{H}_{2}$$) and ($$\mathrm{H}_{3}$$). Let $$C:= [ a,b ] \subset \mathbb{R}$$ ($$a< b$$), and consider the measure space $$( [ a,b ] ,\mathcal{L},\lambda )$$ in ($$\mathrm{H}_{1}$$). Let $$f: [ a,b ] \rightarrow \mathbb{R}$$ be a convex function.

(a) If $$\mu _{i}$$ ($$i\in I$$) is a measure on $$\mathcal{L}$$ such that $$\mu _{i} ( [ a,b ] ) >0$$ for all $$i\in I$$ and $$\sum_{i\in I}\mu _{i}=\lambda$$, then

\begin{aligned} f \biggl( \frac{a+b}{2} \biggr) &\leq \frac{1}{b-a}\sum _{i\in I} \biggl( \sum_{j\in J}\lambda _{j}\mu _{\pi _{j} ( i ) } \bigl( [ a,b ] \bigr) \biggr) \\ &\quad{} \cdot f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{ [ a,b ] }t\,d\mu _{\pi _{j} ( i ) } ( t ) }{\sum_{j\in J}\lambda _{j}\mu _{\pi _{j} ( i ) } ( [ a,b ] ) } \biggr) \leq \frac{1}{b-a}{ \int _{a}^{b}} f. \end{aligned}

(b) Let $$( A_{i} ) _{i\in I}$$ be a sequence of pairwise disjoint sets $$A_{i}\in \mathcal{L}$$ with $$\lambda ( A_{i} ) >0$$ for all $$i\in I$$ and $$\bigcup_{i\in I}A_{i}= [ a,b ]$$. Then

\begin{aligned} f \biggl( \frac{a+b}{2} \biggr) &\leq \frac{1}{b-a}\sum _{i\in I} \biggl( \sum_{j\in J}\lambda _{j}\lambda ( A_{\pi _{j} ( i ) } ) \biggr) \\ &\quad{} \cdot f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{A_{\pi _{j} ( i ) }}t\,dt}{\sum_{j\in J}\lambda _{j}\lambda ( A_{\pi _{j} ( i ) } ) } \biggr) \leq \frac{1}{b-a}{ \int _{a}^{b}} f. \end{aligned}

(c) If $$a=x_{0}< x_{1}<\cdots <x_{n}=b$$ is a partition of $$[ a,b ]$$, then

\begin{aligned} f \biggl( \frac{a+b}{2} \biggr) &\leq \frac{1}{b-a}\sum _{i=1}^{n} \biggl( \sum_{j\in J} \lambda _{j} ( x_{\pi _{j} ( i ) }-x_{\pi _{j} ( i ) -1} ) \biggr) \\ &\quad{} \cdot f \biggl( \frac{1}{2} \frac{\sum_{j\in J}\lambda _{j} ( x_{\pi _{j} ( i ) }^{2}-x_{\pi _{j} ( i ) -1}^{2} ) }{\sum_{j\in J}\lambda _{j} ( x_{\pi _{j} ( i ) }-x_{\pi _{j} ( i ) -1} ) } \biggr) \leq \frac{1}{b-a}{ \int _{a}^{b}} f. \end{aligned}

### Proof

(a) This is a special case of Proposition 14 (a).

(b) Let the measure $$\mu _{i}$$ ($$i\in I$$) be defined on $$\mathcal{L}$$ by

$$\mu _{i} ( A ) :=\lambda ( A\cap A_{i} ) , \quad A\in \mathcal{L},$$

and then apply (a).

(c) Let $$A_{i}:=\mathopen [ x_{i-1},x_{i}\mathclose[$$ ($$i=1,\ldots,n-1$$) and $$A_{n}:= [ x_{n-1},x_{n} ]$$ in (b).

The proof is complete. □
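Part (c) is fully explicit and can be checked directly. The following sketch (assumed sample partition and random permutations of $$\{ 1,\ldots,n \}$$) evaluates the middle term for $$f(t)=t^{4}$$ on $$[0,1]$$, where the exact integral mean is $$1/5$$:

```python
import random
random.seed(2)

f = lambda t: t ** 4                         # convex on [a, b]
a, b = 0.0, 1.0
x = [0.0, 0.2, 0.5, 0.7, 1.0]                # partition a = x_0 < ... < x_n = b, n = 4
n, k = len(x) - 1, 2
lam = [0.7, 0.3]
perms = [random.sample(range(1, n + 1), n) for _ in range(k)]  # permutations of {1,...,n}

mid = 0.0
for i in range(n):
    # sum_j lambda_j (x_{pi_j(i)} - x_{pi_j(i)-1}) and the corresponding
    # numerator with squared endpoints, as in Proposition 16 (c)
    w = sum(lam[j] * (x[pi[i]] - x[pi[i] - 1]) for j, pi in enumerate(perms))
    num = sum(lam[j] * (x[pi[i]] ** 2 - x[pi[i] - 1] ** 2) for j, pi in enumerate(perms))
    mid += w * f(0.5 * num / w)
mid /= b - a

integral_mean = 0.2                          # (1/(b-a)) int_0^1 t^4 dt = 1/5
assert f((a + b) / 2) <= mid + 1e-12 <= integral_mean + 1e-12
```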

The second application concerns quasi-arithmetic means.

Let $$C\subset \mathbb{R}$$ be an interval, and let $$q:C\rightarrow \mathbb{R}$$ be a continuous and strictly monotone function. If $$( X,\mathcal{A},\mu )$$ is a probability space, and $$\varphi :X\rightarrow C$$ is a function such that $$q\circ \varphi$$ is μ-integrable on X, then

$$M_{q} ( \varphi ,\mu ) :=q^{-1} \biggl( \int _{X}q \circ \varphi \,d\mu \biggr)$$

is called the quasi-arithmetic mean (integral q-mean) of φ.

Now we introduce some new quasi-arithmetic means related to the formula $$C_{mes}$$.

### Definition 17

Assume ($$\mathrm{H}_{1}$$)–($$\mathrm{H}_{4}$$). Let $$C\subset \mathbb{R}$$ be an interval, let $$q,r:C\rightarrow \mathbb{R}$$ be continuous and strictly monotone functions, and let $$\varphi :X\rightarrow C$$ be a μ-integrable function for which $$q\circ \varphi$$ and $$r\circ \varphi$$ are also μ-integrable. Then we define the following quasi-arithmetic mean of φ with respect to $$C_{mes}$$:

\begin{aligned}& M_{q,r} ( \varphi ,\mu ,\boldsymbol{\lambda},\boldsymbol{\pi},\mathfrak{M}_{I} ) \\& \quad :=q^{-1} \biggl( \sum_{i\in I} \biggl( \sum _{j\in J} \lambda _{j}\mu _{\pi _{j} ( i ) } ( X ) \biggr) q\circ r^{-1} \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{X}r\circ \varphi \,d\mu _{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}\mu _{\pi _{j} ( i ) } ( X ) } \biggr) \biggr) . \end{aligned}

In the next result the introduced means are compared.

### Proposition 18

Assume that the conditions in Definition 17 hold. If either $$q\circ r^{-1}$$ is convex and q is strictly increasing, or $$q\circ r^{-1}$$ is concave and q is strictly decreasing, then

$$M_{r} ( \varphi ,\mu ) \leq M_{q,r} ( \varphi ,\mu , \boldsymbol{\lambda},\boldsymbol{\pi},\mathfrak{M}_{I} ) \leq M_{q} ( \varphi ,\mu ) ,$$

while if either $$r\circ q^{-1}$$ is convex and r is strictly decreasing, or $$r\circ q^{-1}$$ is concave and r is strictly increasing, then

$$M_{q} ( \varphi ,\mu ) \leq M_{r,q} ( \varphi ,\mu , \boldsymbol{\lambda},\boldsymbol{\pi},\mathfrak{M}_{I} ) \leq M_{r} ( \varphi ,\mu ) .$$

### Proof

We consider only the case when $$q\circ r^{-1}$$ is convex and q is strictly increasing. By applying Theorem 8 with $$f:=q\circ r^{-1}$$ and with $$r\circ \varphi$$ instead of φ, we obtain

\begin{aligned}& q\circ r^{-1} \biggl( \int _{X}r\circ \varphi \,d\mu \biggr) \\& \quad \leq \sum_{i\in I} \biggl( \sum _{j\in J}\lambda _{j} \mu _{\pi _{j} ( i ) } ( X ) \biggr) q\circ r^{-1} \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{X}r\circ \varphi \,d\mu _{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}\mu _{\pi _{j} ( i ) } ( X ) } \biggr) \leq \int _{X}q\circ \varphi \,d\mu , \end{aligned}

and this implies the result since $$q^{-1}$$ is strictly increasing.

The proof is complete. □
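For a concrete illustration of Proposition 18, take $$q(t)=t^{2}$$ and $$r(t)=t$$ on $$\mathopen] 0,\infty \mathclose[$$, so that $$q\circ r^{-1}$$ is convex and q is strictly increasing; then $$M_{r}$$ is the arithmetic mean, $$M_{q}$$ the quadratic mean, and $$M_{q,r}$$ must lie between them. A Python sketch on a finite probability space with assumed sample data:

```python
import math, random
random.seed(3)

m, n, k = 6, 3, 2
phi = [random.uniform(0.5, 3.0) for _ in range(m)]   # values in ]0, inf[
mu = [1.0 / m] * m

# Sub-measures mu_i with sum_i mu_i = mu: split each point mass at random cuts.
mu_i = [[0.0] * m for _ in range(n)]
for x in range(m):
    cuts = [0.0] + sorted(random.random() for _ in range(n - 1)) + [1.0]
    for i in range(n):
        mu_i[i][x] = mu[x] * (cuts[i + 1] - cuts[i])

lam = [0.5, 0.5]
perms = [random.sample(range(n), n) for _ in range(k)]

q, q_inv = (lambda t: t * t), math.sqrt
M_r = sum(p * v for p, v in zip(mu, phi))                  # arithmetic mean (r = id)
M_q = q_inv(sum(p * q(v) for p, v in zip(mu, phi)))        # quadratic mean

acc = 0.0
for i in range(n):
    w = sum(lam[j] * sum(mu_i[pi[i]]) for j, pi in enumerate(perms))
    num = sum(lam[j] * sum(pv * pm for pv, pm in zip(phi, mu_i[pi[i]]))
              for j, pi in enumerate(perms))
    acc += w * q(num / w)                                  # q o r^{-1} = q here
M_qr = q_inv(acc)                                          # M_{q,r} of Definition 17

assert M_r <= M_qr + 1e-12 <= M_q + 1e-12
```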

Finally, some applications to information theory are presented.

Throughout the rest of the paper probability measures P and Q are defined on a fixed measurable space $$( X,\mathcal{A} )$$. It is also assumed that P and Q are absolutely continuous with respect to a σ-finite measure ν on $$\mathcal{A}$$. The densities (or Radon–Nikodym derivatives) of P and Q with respect to ν are denoted by p and q, respectively. These densities are ν-almost everywhere uniquely determined.

Introduce the set of functions

$$F:= \bigl\{ f: \mathopen] 0,\infty \mathclose [ \rightarrow \mathbb{R}\mid f \text{ is convex} \bigr\} ,$$

and define, for every $$f\in F$$, the function

$$f^{\ast }: \mathopen] 0,\infty \mathclose [ \rightarrow \mathbb{R},\quad\quad f^{ \ast } ( t ) :=tf \biggl( \frac{1}{t} \biggr) .$$

If $$f\in F$$, then either f is monotonic or there exists a point $$t_{0}\in \mathopen] 0,\infty \mathclose[$$ such that f is decreasing on $$\mathopen] 0,t_{0}\mathclose[$$ and increasing on $$\mathopen[ t_{0},\infty \mathclose[$$. This implies that the limit

$$\lim_{t\rightarrow 0+}f ( t )$$

exists in $$\mathopen] -\infty ,\infty \mathclose]$$, and

$$f ( 0 ) :=\lim_{t\rightarrow 0+}f ( t )$$

extends f into a convex function on $$\mathopen[ 0,\infty \mathclose[$$.

It is well known that, for every $$f\in F$$, the function $$f^{\ast }$$ also belongs to F, and therefore

$$f^{\ast } ( 0 ) :=\lim_{t\rightarrow 0+}f^{\ast } ( t ) = \lim_{u\rightarrow \infty } \frac{f ( u ) }{u}.$$

The important notion of f-divergence was introduced in , , and independently in .

### Definition 19

(a) For every $$f\in F$$, we define the f-divergence of P and Q by

$$D_{f} ( P,Q ) :={ \int _{X}} q ( \omega ) f \biggl( \frac{p ( \omega ) }{q ( \omega ) } \biggr) \,d \nu ( \omega ) ,$$

where the following conventions are used:

$$0f \biggl( \frac{x}{0} \biggr) :=xf^{\ast } ( 0 ) \quad \text{if }x>0, \quad\quad 0f \biggl( \frac{0}{0} \biggr) =0f^{\ast } ( 0 ) :=0.$$
(10)

(b) Let $$f\in F$$ be a positive convex function, and let $$\mathbf{p}:= ( p_{1},\ldots,p_{n} )$$ and $$\mathbf{q}:= ( q_{1},\ldots,q_{n} )$$ represent positive discrete probability distributions. The f-divergence functional of p and q is

$$I_{f}(\mathbf{p},\mathbf{q}):={ \sum _{i=1}^{n}} q_{i}f \biggl( \frac{p_{i}}{q_{i}} \biggr) .$$

It is possible to use nonnegative discrete probability distributions in the f-divergence functional by defining

$$f ( 0 ) :=\lim_{t\rightarrow 0+}f ( t ) ;\quad\quad 0f \biggl( \frac{0}{0} \biggr) :=0;\quad\quad 0f \biggl( \frac{a}{0} \biggr) :=\lim _{t\rightarrow 0+}tf \biggl( \frac{a}{t} \biggr) ,\quad a>0.$$

### Remark 20

(a) For every $$f\in F$$, the perspective $$\hat{f}:\mathopen] 0,\infty \mathclose[ \times \mathopen] 0,\infty \mathclose[ \rightarrow \mathbb{R}$$ of f is defined by

$$\hat{f} ( x,y ) :=yf \biggl( \frac{x}{y} \biggr) .$$

Then f̂ is also a convex function (see ). Vajda  proved that (10) is the unique rule leading to a convex and lower semicontinuous extension of f̂ to the set

$$\bigl\{ ( x,y ) \in \mathbb{R}^{2}\mid x,y\geq 0 \bigr\} .$$

(b) Since $$f^{\ast } ( 0 ) \in \mathopen ] -\infty ,\infty \mathclose ]$$, Lemma 2.8 in  shows that $$D_{f} ( P,Q )$$ exists in $$\mathopen ] -\infty ,\infty \mathclose ]$$ and

$$D_{f} ( P,Q ) ={ \int _{ ( q>0 ) }} f \biggl( \frac{p ( \omega ) }{q ( \omega ) } \biggr) \,dQ ( \omega ) +f^{\ast } ( 0 ) P ( q=0 ) .$$
(11)

It follows that if P is absolutely continuous with respect to Q, then

$$D_{f} ( P,Q ) ={ \int _{ ( q>0 ) }} f \biggl( \frac{p ( \omega ) }{q ( \omega ) } \biggr) \,dQ ( \omega ) .$$

The basic inequality (see )

$$D_{f} ( P,Q ) \geq f ( 1 )$$
(12)

is one of the key properties of f-divergences. In the next result a refinement of this inequality is obtained.

### Proposition 21

Assume ($$\mathrm{H}_{2}$$) and ($$\mathrm{H}_{3}$$). Suppose that we are given a sequence $$\mathfrak{M}_{I}= ( Q_{i} ) _{i\in I}$$ of measures on $$\mathcal{A}$$ with $$Q_{i} ( X ) >0$$ for all $$i\in I$$ and $$\sum_{i\in I}Q_{i}=Q$$. Then

\begin{aligned} D_{f} ( P,Q ) \geq {}&\sum_{i\in I} \biggl( \sum_{j\in J}\lambda _{j}Q_{\pi _{j} ( i ) } ( q>0 ) \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{ ( q>0 ) }\frac{p}{q}\,dQ_{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}Q_{\pi _{j} ( i ) } ( q>0 ) } \biggr) \\ &{}+f^{\ast } ( 0 ) P ( q=0 ) \geq f \biggl( { \int _{ ( q>0 ) }} p\,d\nu \biggr) +f^{ \ast } ( 0 ) P ( q=0 ) \geq f ( 1 ) . \end{aligned}
(13)

### Proof

Since Q is absolutely continuous with respect to ν, it follows from $$\sum_{i\in I}Q_{i}=Q$$ that $$Q_{i}$$ is also absolutely continuous with respect to ν for all $$i\in I$$. By denoting the density (or Radon–Nikodym derivative) of $$Q_{i}$$ with respect to ν by $$q_{i}$$ ($$i\in I$$), we also obtain that $$\sum_{i\in I}q_{i}=q$$ ν-a.e.

If $$D_{f} ( P,Q ) =\infty$$, then (13) is obvious.

If $$D_{f} ( P,Q ) \in \mathbb{R}$$, then the integral

$${ \int _{ ( q>0 ) }} f \biggl( \frac{p ( \omega ) }{q ( \omega ) } \biggr) \,dQ ( \omega )$$
(14)

is finite, and therefore either $$Q ( p=0 ) =0$$ or $$Q ( p=0 ) >0$$ and $$f ( 0 )$$ is finite. It can be seen that the integral Jensen’s inequality can be applied to this integral, and we have from Theorem 8 that

\begin{aligned} D_{f} ( P,Q ) \geq {}&\sum_{i\in I} \biggl( \sum_{j\in J}\lambda _{j}Q_{\pi _{j} ( i ) } ( q>0 ) \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{ ( q>0 ) }\frac{p}{q}\,dQ_{\pi _{j} ( i ) }}{\sum_{j\in J}\lambda _{j}Q_{\pi _{j} ( i ) } ( q>0 ) } \biggr) \\ &{}+f^{\ast } ( 0 ) P ( q=0 ) \geq f \biggl( { \int _{ ( q>0 ) }} p\,d\nu \biggr) +f^{ \ast } ( 0 ) P ( q=0 ) . \end{aligned}
(15)

It follows from the proof of Theorem 2.12 (a) in  that

$$f \biggl( { \int _{ ( q>0 ) }} p\,d\nu \biggr) +f^{\ast } ( 0 ) P ( q=0 ) \geq f ( 1 ) .$$

The proof is complete. □

### Remark 22

(a) By using density functions, (13) can be rewritten in the following form:

\begin{aligned} D_{f} ( P,Q ) &\geq \sum_{i\in I} \biggl( \sum_{j\in J}\lambda _{j} \int _{ ( q>0 ) }q_{ \pi _{j} ( i ) }\,d\nu \biggr) f \biggl( \frac{\sum_{j\in J}\lambda _{j}\int _{ ( q>0 ) }\frac{p}{q}q_{\pi _{j} ( i ) }\,d\nu }{\sum_{j\in J}\lambda _{j}\int _{ ( q>0 ) }q_{\pi _{j} ( i ) }\,d\nu } \biggr) \\ &\quad{} +f^{\ast } ( 0 ) P ( q=0 ) \geq f \biggl( { \int _{ ( q>0 ) }} p\,d\nu \biggr) +f^{ \ast } ( 0 ) P ( q=0 ) \geq f ( 1 ) . \end{aligned}

(b) Define

$$q_{i}:=\sum_{j\in J}\lambda _{j}Q_{\pi _{j} ( i ) } ( q>0 ) ,\quad i\in I,\quad\quad \widehat{q}:=Q ( q=0 ) ,$$

and

$$p_{i}:=\sum_{j\in J}\lambda _{j} \int _{ ( q>0 ) }\frac{p}{q}\,dQ_{\pi _{j} ( i ) },\quad i\in I,\quad\quad \widehat{p}:=P ( q=0 ) .$$

By applying ($$\mathrm{H}_{3}$$), ($$\mathrm{H}_{4}$$), Theorem 7, and $$\sum_{i\in I}Q_{i}=Q$$, we have

\begin{aligned} \sum_{i\in I}q_{i}&=\sum _{i\in I} \biggl( \sum_{j\in J}\lambda _{j}Q_{\pi _{j} ( i ) } ( q>0 ) \biggr) =\sum _{j\in J}\lambda _{j} \biggl( \sum _{i\in I}Q_{\pi _{j} ( i ) } ( q>0 ) \biggr) \\ &=\sum_{j\in J}\lambda _{j} \biggl( \sum _{i\in I}Q_{i} ( q>0 ) \biggr) =Q ( q>0 ) , \end{aligned}

and hence $$( q_{i} ) _{i\in I}$$ and $$\widehat{q}$$ represent a discrete probability distribution $$\mathbf{q}$$.

Similarly, it follows from ($$\mathrm{H}_{3}$$), ($$\mathrm{H}_{4}$$), Theorem 7, and $$\sum_{i\in I}Q_{i}=Q$$, that

\begin{aligned} \sum_{i\in I}p_{i}&=\sum _{i\in I} \biggl( \sum_{j\in J}\lambda _{j} \int _{ ( q>0 ) } \frac{p}{q}\,dQ_{\pi _{j} ( i ) } \biggr) =\sum _{j \in J}\lambda _{j} \biggl( \sum _{i\in I} \int _{ ( q>0 ) }\frac{p}{q}\,dQ_{\pi _{j} ( i ) } \biggr) \\ &=\sum_{j\in J}\lambda _{j} \biggl( \sum _{i\in I} \int _{ ( q>0 ) }\frac{p}{q}\,dQ_{i} \biggr) = \int _{ ( q>0 ) }\frac{p}{q}\,dQ=P ( q>0 ) , \end{aligned}

and therefore $$( p_{i} ) _{i\in I}$$ and $$\widehat{p}$$ also represent a discrete probability distribution $$\mathbf{p}$$. Since

$$0f \biggl( \frac{\widehat{p}}{0} \biggr) := \textstyle\begin{cases} 0=f^{\ast } ( 0 ) P ( q=0 ) ,& \widehat{p}=0, \\ \lim_{t\rightarrow 0+}tf ( \frac{\widehat{p}}{t} ) =f^{\ast } ( 0 ) P ( q=0 ) ,& \widehat{p}>0, \end{cases}$$

we can see that the second term in (13) can be considered as the f-divergence functional of $$\mathbf{p}$$ and $$\mathbf{q}$$ (if $$I=\mathbb{N}_{+}$$, then we can say that $$I_{f} ( \mathbf{p},\mathbf{q} )$$ is a generalized f-divergence functional). It is interesting that inequality (12) for the “continuous” f-divergence can be refined by a “discrete” f-divergence.
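To make the chain (13) concrete, here is a numerical sketch in the finite discrete setting: ν is the counting measure, $$q>0$$ everywhere (so the $$f^{\ast } ( 0 ) P ( q=0 )$$ terms vanish), Q is split into two pieces, and J consists of two permutations of $$I= \{ 0,1 \}$$ with weights summing to 1. All names (`q_pieces`, `perms`, `lams`) are ours, chosen for illustration only.

```python
import math

# Convex generator f(t) = t*log(t), so f(1) = 0.
def f(t):
    return t * math.log(t)

# Ground set X = {0,1,2,3} with counting measure nu; densities p and q.
p = [0.1, 0.4, 0.3, 0.2]
q = [0.25, 0.25, 0.25, 0.25]

# Left-hand side of (13): D_f(P,Q) = sum_k q_k * f(p_k / q_k).
D_f = sum(qk * f(pk / qk) for pk, qk in zip(p, q))

# Split Q into two pieces Q_0, Q_1 whose densities sum to q (I = {0, 1}).
q_pieces = [[0.15, 0.05, 0.20, 0.10],
            [0.10, 0.20, 0.05, 0.15]]

# Two permutations of I with weights lambda_j summing to 1.
perms = [(0, 1), (1, 0)]
lams = [0.6, 0.4]

# Middle expression of (13), specialized to q > 0 everywhere:
# Q_k(q > 0) is the total mass of Q_k, and
# int (p/q) dQ_k = sum_omega (p/q)(omega) * q_k(omega).
middle = 0.0
for i in range(2):
    mass = sum(lam * sum(q_pieces[perm[i]])
               for lam, perm in zip(lams, perms))
    integral = sum(lam * sum((pk / qk) * qpk
                             for pk, qk, qpk in zip(p, q, q_pieces[perm[i]]))
                   for lam, perm in zip(lams, perms))
    middle += mass * f(integral / mass)

# The refinement chain (13): D_f(P,Q) >= middle >= f(1) = 0.
assert D_f >= middle >= f(sum(p))
```

The intermediate sum interpolates strictly between $$D_{f} ( P,Q )$$ and $$f ( 1 )$$ here, illustrating that (13) is a genuine refinement of (12).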


## References

1. Ali, S.M., Silvey, S.D.: A general class of coefficients of divergence of one distribution from another. J. R. Stat. Soc. Ser. B 28, 131–140 (1966)

2. Brnetić, I., Khan, K.A., Pečarić, J.: Refinement of Jensen’s inequality with applications to cyclic mixed symmetric means and Cauchy means. J. Math. Inequal. 9(4), 1309–1321 (2015)

3. Csiszár, I.: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publ. Math. Inst. Hung. Acad. Sci. Ser. A 8, 84–108 (1963)

4. Csiszár, I.: Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hung. 2, 299–318 (1967)

5. Dragomir, S.S.: Refining Jensen’s integral inequality for partitions of weights. Libertas Math. 37, 25–44 (2017)

6. Dragomir, S.S.: Refining Jensen’s integral inequality for divisions of measurable space. J. Math. Ext. 12(2), 87–106 (2018)

7. Dragomir, S.S., Khan, M.A., Abathun, A.: Refinement of the Jensen integral inequality. Open Math. 14, 221–228 (2016)

8. Fejér, L.: Über die Fourierreihen, II. Math. Naturwiss. Anz Ungar. Akad. Wiss. 24, 369–390 (1906) (in Hungarian)

9. Hadamard, J.: Étude sur les propriétés des fonctions entiéres en particulier d’une fonction considérée par Riemann. J. Math. Pures Appl. 58, 171–215 (1893)

10. Horváth, L.: A refinement of the integral form of Jensen’s inequality. J. Inequal. Appl. 2012, Article ID 178 (2012)

11. Horváth, L.: New refinements of the discrete Jensen’s inequality generated by finite or infinite permutations. Aequ. Math. https://doi.org/10.1007/s00010-019-00696-z

12. Horváth, L., Khan, K.A., Pečarić, J.: Combinatorial Improvements of Jensen’s Inequality. Element, Zagreb (2014)

13. Horváth, L., Khan, K.A., Pečarić, J.: Cyclic refinements of the discrete and integral form of Jensen’s inequality with applications. Analysis 36(4), 253–262 (2016)

14. Horváth, L., Pečarić, D., Pečarić, J.: A refinement and an exact equality condition for the basic inequality of f-divergences. Filomat 32(12), 4263–4273 (2018)

15. Liese, F., Vajda, I.: On divergences and informations in statistics and information theory. IEEE Trans. Inf. Theory 52, 4394–4412 (2006)

16. Niculescu, C., Persson, L.E.: Convex Functions and Their Applications. A Contemporary Approach. Springer, Berlin (2006)

17. Perlman, M.D.: Jensen’s inequality for a convex vector-valued function on an infinite-dimensional space. J. Multivar. Anal. 4, 52–65 (1974)

18. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

19. Rooin, J.: A refinement of Jensen’s inequality. J. Inequal. Pure Appl. Math. 6(2), Article ID 38 (2005)

20. Rudin, W.: Real and Complex Analysis, 3rd edn. McGraw-Hill, New York (1987)

21. Tseng, K.-L., Hwang, S.-R., Dragomir, S.S.: Refinements of Fejér’s inequality for convex functions. Period. Math. Hung. 65, 17–28 (2012)

22. Vajda, I.: Theory of Statistical Inference and Information. Kluwer Academic, Boston (1989)


## Funding

The research of the author was supported by Hungarian Scientific Research Fund Grant No. K101217 and Széchenyi 2020 under EFOP-3.6.1-16-2016-00015.

## Author information


### Contributions

The author obtained the presented results and read and approved the final manuscript.

### Corresponding author

Correspondence to László Horváth.

## Ethics declarations

### Competing interests

The author declares that they have no competing interests.
