Zipf–Mandelbrot law, f-divergences and the Jensen-type interpolating inequalities
- Neda Lovričević^{1},
- Đilda Pečarić^{2} and
- Josip Pečarić^{3, 4}
https://doi.org/10.1186/s13660-018-1625-y
© The Author(s) 2018
Received: 19 November 2017
Accepted: 29 January 2018
Published: 8 February 2018
Abstract
Motivated by the method of interpolating inequalities that makes use of improved Jensen-type inequalities, in this paper we combine this approach with the well-known Zipf–Mandelbrot law applied to various types of f-divergences and distances, such as the Kullback–Leibler divergence, the Hellinger distance, the Bhattacharyya distance (via its coefficient), the \(\chi^{2}\)-divergence, the total variation distance and the triangular discrimination. Addressing these applications, we first deduce general results of this type for the Csiszár divergence functional, from which the listed divergences originate. When presenting the analyzed inequalities for the Zipf–Mandelbrot law, we emphasize its special form, the Zipf law, and its specific role in linguistics. We illustrate this aspect through the Zipfian word distributions associated with the English and Russian languages, using the obtained bounds for the Kullback–Leibler divergence.
1 Introduction
Let us start with the notion of f-divergences, which measure the distance between two probability distributions by forming a weighted average, determined by a specific function, of the odds ratio given by the two distributions. Among the f-divergences introduced in the search for an adequate distance between two probability distributions, let us point out the Csiszár f-divergence [1, 2], special cases of which include the Kullback–Leibler divergence (see [3, 4]), the Hellinger distance, the Bhattacharyya distance, the total variation distance and the triangular discrimination (see [5, 6]). The notion of ‘distance’ is somewhat stronger than that of ‘divergence’, since it requires symmetry and the triangle inequality. Given the great number of fields in which probability theory plays a role, it is no wonder that divergences between probability distributions find specific applications in a wide variety of them.
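For orientation, the Csiszár divergence functional has the form \(\sum_{i=1}^{n}q_{i}f (\frac{p_{i}}{q_{i}} )\) for a kernel f. The following minimal numerical sketch (our own illustration; the function names are not from the paper) evaluates this form, with the kernel \(t\mapsto t\log t\) recovering the Kullback–Leibler divergence:

```python
import math

def csiszar_divergence(p, q, f):
    """Csiszar f-divergence: sum_i q_i * f(p_i / q_i).

    Assumes q_i > 0 for every i; p and q need not sum to 1 here,
    in line with the more general tuples considered in Theorem 1.
    """
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

# The kernel t -> t*log(t) recovers the Kullback-Leibler divergence.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
kl = csiszar_divergence(p, q, lambda t: t * math.log(t))
print(kl)  # equals sum_i p_i * log(p_i / q_i)
```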
Relation (2) and its numerous consequences have recently proved to be a fruitful field for many significant results. We single out those which treat this relation from the standpoint of superadditivity and monotonicity of the Jensen-type functionals, in [8, 9] or [10], obtained via [11] and suitably summarized in the monograph [12]. In what follows we make use of relation (2) to present certain bounds for a selected spectrum of f-divergences that originate from the Csiszár divergence functional.
All of the results thus obtained for f-divergences are then examined in the context of the Zipf–Mandelbrot law and subsequently specialized to the Zipf law.
For finite N and \(t=0\) the Zipf–Mandelbrot law reduces to the Zipf law. (In particular, for infinite N and \(t=0\) we obtain the Zeta distribution.)
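Although equation (3) is not reproduced here, the probability mass function of the Zipf–Mandelbrot law can be read off from the proofs in Section 4: \(f(i;N,s,t)=\frac{1}{(i+t)^{s}H_{N,s,t}}\) with \(H_{N,s,t}=\sum_{k=1}^{N}\frac{1}{(k+t)^{s}}\). A minimal sketch (our own illustration):

```python
def harmonic(N, s, t=0.0):
    """Generalized harmonic number H_{N,s,t} = sum_{k=1}^{N} 1/(k+t)^s."""
    return sum(1.0 / (k + t) ** s for k in range(1, N + 1))

def zipf_mandelbrot_pmf(N, s, t=0.0):
    """Zipf-Mandelbrot probabilities f(i; N, s, t), i = 1, ..., N.

    For t = 0 this reduces to the Zipf law.
    """
    H = harmonic(N, s, t)
    return [1.0 / ((i + t) ** s * H) for i in range(1, N + 1)]

q = zipf_mandelbrot_pmf(N=10, s=1.0, t=2.7)
print(sum(q))  # 1.0 up to rounding: a genuine probability distribution
```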
The rest of the paper is organized as follows. In Section 2 we define the Csiszár functional and various f-divergences, for which we give in Section 3 results based on relation (2). These are further examined in Section 4 in the light of the Zipf–Mandelbrot law and the Zipf law. For the latter we give in Section 5 a specific application in linguistics, concerning the Kullback–Leibler divergence.
2 Preliminaries
The general form of the Csiszár divergence functional (6) can be specialized to a series of well-known entropies, divergences and distances by particular choices of the kernel f. In the sequel we present some of the most frequently used among them.
Entropies quantify the diversity, uncertainty and randomness of a system. The concept of the Rényi entropy was introduced in [22] and has been of great importance in statistics, ecology, theoretical computer science, etc.
Remark 1
Although it is common to take the logarithm to base 2, this will not be essential in the sequel. Moreover, we analyze the results involving the logarithm to different positive bases, namely bases greater than 1 as well as bases less than 1.
Since some of the various divergences possess symmetry and satisfy the triangle inequality, they also define certain distances between two probability distributions.
The Hellinger distance is a metric and is often used in its squared form, i.e. as \(h^{2}(\mathbf{p},\mathbf{q}):=\frac {1}{2}\sum_{i=1}^{n} (\sqrt{p_{i}}-\sqrt{q_{i}} )^{2} \).
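As a quick numerical illustration of this definition (ours, not from the paper), the squared Hellinger distance can be computed directly, or equivalently through the Csiszár form with the kernel \(t\mapsto\frac{1}{2}(\sqrt{t}-1)^{2}\) used in Corollary 2 below:

```python
import math

def hellinger_squared(p, q):
    """Squared Hellinger distance: (1/2) * sum_i (sqrt(p_i) - sqrt(q_i))^2."""
    return 0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(hellinger_squared(p, q))
# The same value via the Csiszar form with kernel f(t) = (sqrt(t) - 1)^2 / 2:
print(sum(qi * 0.5 * (math.sqrt(pi / qi) - 1) ** 2 for pi, qi in zip(p, q)))
```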
More detailed analyses of the mentioned divergences, as well as a wider spectrum of them, can be found, e.g., in [5, 6, 24].
3 Basic relations for f-divergences
In order to deduce from relation (2) the corresponding relations for the f-divergences described in the introductory part, we start with a general result giving bounds for the Csiszár functional (6), considered under more general conditions and denoted by \(\tilde{D}_{f}(\mathbf{p},\mathbf{q})\).
Theorem 1
Let \(I\subseteq\mathbb{R}\) be an interval. Suppose \(\mathbf {p}=(p_{1},\ldots,p_{n})\) is an n-tuple of real numbers with \(P_{n}=\sum_{i=1}^{n}p_{i}\) and \(\mathbf{q}=(q_{1},\ldots,q_{n})\) is an n-tuple of nonnegative real numbers with \(Q_{n}=\sum_{i=1}^{n}q_{i}\), such that \(\frac{p_{i}}{q_{i}}\in I\), \(i=1,\ldots,n\).
If f is a concave function, then the inequality signs are reversed.
If \(t\mapsto tf(t)\) is a concave function, then the inequality signs are reversed.
Proof
For a convex function f, replacing \(p_{i}\) by \(q_{i}\) and \(x_{i}\) by \(\frac{p_{i}}{q_{i}}\) in relation (2) yields (17).
Applying the same substitutions to the convex function \(t\mapsto tf(t)\) yields (18). □
The following corollary precedes the related result for the Kullback–Leibler divergence (10). Recall that (10) can be interpreted as a special case of the functional (6).
Corollary 1
If the logarithm base is less than 1, then the inequality signs are reversed.
Proof
It follows from Theorem 1 as a special case of inequalities (18), for the function \(t\mapsto t\log t\), which is convex when the logarithm base is greater than 1 (and concave when the base is less than 1). □
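For the reader's convenience, recall the standard computation showing how this kernel produces the Kullback–Leibler divergence inside the Csiszár functional:

\[
\sum_{i=1}^{n} q_{i}\cdot\frac{p_{i}}{q_{i}}\log\frac{p_{i}}{q_{i}}
=\sum_{i=1}^{n} p_{i}\log\frac{p_{i}}{q_{i}}
=\mathrm{KL}(\mathbf{p},\mathbf{q}).
\]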
By additionally specifying the n-tuples p and q as in the sequel, we obtain bounds for the Kullback–Leibler divergence.
Remark 2
If the logarithm base is less than 1, then the inequality signs are reversed.
In other words, we obtained the corresponding bounds for the Kullback–Leibler divergence (10).
Remark 3
The Kullback–Leibler divergence is sometimes used in its reversed form \(\mathrm{KL}(\mathbf{q},\mathbf{p})\). A similar type of bounds can be obtained for the reversed Kullback–Leibler divergence by means of the kernel function \(f(t)=-\log t\), which is convex for logarithm bases greater than 1 and concave for bases less than 1, following the procedure described in Corollary 1 and Remark 2.
It is natural to treat in a similar fashion the other divergences (distances) described in Section 2: the Hellinger distance, the Bhattacharyya coefficient, the chi-square divergence, the total variation distance and the triangular discrimination.
Corollary 2
Proof
It follows from Theorem 1 as a special case of inequalities (17), for the convex function \(t\mapsto\frac{1}{2} (\sqrt{t}-1 )^{2}\). □
Remark 4
In other words, we obtained the corresponding bounds for the (squared) Hellinger distance \(h^{2}(\mathbf {p},\mathbf{q})\).
Corollary 3
Proof
It follows from Theorem 1 as a special case of inequalities (17), for the convex function \(f(t)=-\sqrt{t}\). □
Remark 5
In other words, we obtained the corresponding bounds for the Bhattacharyya coefficient \(B(\mathbf{p},\mathbf {q})\).
Corollary 4
Proof
It follows from Theorem 1 as a special case of inequalities (17), for the convex function \(f(t)=(t-1)^{2}\). □
Remark 6
In other words, we obtained the corresponding bounds for the chi-square divergence \(\chi^{2}(\mathbf {p},\mathbf{q})\).
Corollary 5
Proof
It follows from Theorem 1 as a special case of inequalities (17), for the convex function \(f(t)= \vert t-1 \vert \). □
Remark 7
In other words, we obtained the corresponding bounds for the total variation distance \(V(\mathbf{p},\mathbf {q})\).
Corollary 6
Proof
It follows from Theorem 1 as a special case of inequalities (17), for the convex function \(f(t)=\frac {(t-1)^{2}}{t+1}\). □
Remark 8
In other words, we obtained the corresponding bounds for the triangular discrimination \(\Delta(\mathbf {p},\mathbf{q})\).
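Collecting the kernels used in Corollaries 1–6, the following sketch (our own illustration) evaluates each divergence through the common Csiszár form \(\sum_{i}q_{i}f(p_{i}/q_{i})\); note that the kernel \(-\sqrt{t}\) yields the negated Bhattacharyya coefficient, and that normalization constants may differ from the conventions of (10)–(16), which are not reproduced here:

```python
import math

def csiszar(p, q, f):
    """Csiszar form sum_i q_i * f(p_i / q_i) for a kernel f."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

kernels = {
    "Kullback-Leibler":          lambda t: t * math.log(t),
    "squared Hellinger":         lambda t: 0.5 * (math.sqrt(t) - 1) ** 2,
    "Bhattacharyya (negated)":   lambda t: -math.sqrt(t),
    "chi-square":                lambda t: (t - 1) ** 2,
    "total variation":           lambda t: abs(t - 1),
    "triangular discrimination": lambda t: (t - 1) ** 2 / (t + 1),
}

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
for name, f in kernels.items():
    print(f"{name}: {csiszar(p, q, f):+.6f}")
```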
4 On f-divergences for the Zipf–Mandelbrot law
In this section we derive the results of the previous section for the Zipf–Mandelbrot law (3). Namely, taking for \(q_{i}\) the probability mass function \(f(i;N,s,t)\) defined by (3), we can view the obtained results in the light of the Zipf–Mandelbrot law.
Corollary 7
Let \(\mathbf{p}=(p_{1},\ldots,p_{N})\) be an N-tuple of real numbers with \(P_{N}=\sum_{i=1}^{N}p_{i}\). Suppose \(I\subseteq\mathbb{R}\) is an interval, \(N \in\mathbb{N}\) and \(s_{2}> 0\), \(t_{2}\geq0\) are such that \(p_{i}(i+t_{2})^{s_{2}}H_{N,s_{2},t_{2}}\in I\), \(i=1,\ldots ,N\).
Proof
It relies on the proof of Theorem 1 with the substitutions described there, where we insert for \(q_{i}\) the expression \(\frac{1}{(i+t_{2})^{s_{2}}H_{N,s_{2},t_{2}}}\), \(i=1,\ldots,N\), which defines the Zipf–Mandelbrot law (3), with \(Q_{N}=1\). Since the minimal value of \(q_{i}\) is \(\min\{q_{i}\}= \frac{1}{(N+t_{2})^{s_{2}}H_{N,s_{2},t_{2}}}\) and its maximal value is \(\max\{q_{i}\}= \frac{1}{(1+t_{2})^{s_{2}}H_{N,s_{2},t_{2}}}\), inequalities (35) and (36) follow for the convex functions f and \(t\mapsto tf(t)\), respectively. The inequality signs are reversed in the case of concavity, as a consequence of the Jensen inequality implicitly involved. □
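The extreme values used in this proof are easily checked numerically: since \((i+t)^{s}\) increases in i, the smallest probability occurs at rank \(i=N\) and the largest at rank \(i=1\). A minimal sketch (our own illustration):

```python
def zm_extremes(N, s, t=0.0):
    """Extreme Zipf-Mandelbrot probabilities entering the proof of Corollary 7."""
    H = sum(1.0 / (k + t) ** s for k in range(1, N + 1))
    q_min = 1.0 / ((N + t) ** s * H)  # attained at rank i = N
    q_max = 1.0 / ((1 + t) ** s * H)  # attained at rank i = 1
    return q_min, q_max

print(zm_extremes(N=100, s=1.1, t=2.7))
```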
If both p and q are defined via the Zipf–Mandelbrot law, then the following corollary applies.
Corollary 8
Let \(I\subseteq\mathbb{R}\) be an interval and suppose \(N \in\mathbb {N}\), \(s_{1}, s_{2}> 0\), \(t_{1},t_{2}\geq0\) are such that \(\frac { (i+t_{2} )^{s_{2}}H_{N,s_{2},t_{2}}}{ (i+t_{1} )^{s_{1}}H_{N,s_{1},t_{1}}}\in I\), \(i=1,\ldots,N\).
If f is a concave function, then the inequality signs are reversed.
Proof
Since the corollary is a special case of the previous one, its proof is obtained by inserting the Zipf–Mandelbrot law (3) for \(p_{i}\), \(i=1,\ldots,N\), as was already done for \(q_{i}\). That is, \(p_{i}= \frac{1}{(i+t_{1})^{s_{1}}H_{N,s_{1},t_{1}}}\), \(i=1,\ldots,N\), so that \(P_{N}=1\). The rest of the proof follows along the same lines as that of Corollary 7, so inequalities (37) and (38) follow for the convex functions f and \(t\mapsto tf(t)\), respectively; the inequality signs are reversed in the case of concavity, as a consequence of the Jensen inequality implicitly involved. □
Finally, if both p and q are defined via the Zipf law (5), then the following statements hold.
Corollary 9
Let \(I\subseteq\mathbb{R}\) be an interval and suppose \(N \in\mathbb {N}\), \(s_{1}, s_{2}> 0\) are such that \(i^{s_{2}-s_{1}}\frac {H_{N,s_{2}}}{H_{N,s_{1}}}\in I\), \(i=1,\ldots,N\).
If f is a concave function, then the inequality signs are reversed.
Proof
Inequalities (39) and (40) are proved analogously to Corollary 8 if we observe the probability mass functions \(p_{i}\) and \(q_{i}\) as Zipf laws defined by (5). □
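In this Zipf-law setting the ratios take the explicit form \(r(i)=i^{s_{2}-s_{1}}H_{N,s_{2}}/H_{N,s_{1}}\), which is monotone in i, so the endpoints of the interval containing them are \(r(1)\) and \(r(N)\). A minimal sketch (our own illustration):

```python
def zipf_ratio_extremes(N, s1, s2):
    """Extremes of r(i) = i^(s2 - s1) * H_{N,s2} / H_{N,s1} over i = 1, ..., N.

    r is increasing in i when s2 > s1 and decreasing when s2 < s1,
    so its extremes over 1, ..., N are attained at i = 1 and i = N.
    """
    H1 = sum(1.0 / k ** s1 for k in range(1, N + 1))
    H2 = sum(1.0 / k ** s2 for k in range(1, N + 1))
    r = lambda i: i ** (s2 - s1) * H2 / H1
    return min(r(1), r(N)), max(r(1), r(N))

print(zipf_ratio_extremes(N=100, s1=0.9, s2=1.0))
```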
Let us provide the accompanying results of this type for some special cases of f-divergences, starting with the Kullback–Leibler divergence (10). Again, we first consider the more general case in which only one of the two N-tuples p and q is defined via the Zipf–Mandelbrot law (3).
Corollary 10
Let \(\mathbf{p}=(p_{1},\ldots,p_{N})\) be an N-tuple of nonnegative real numbers with \(P_{N}=\sum_{i=1}^{N}p_{i}\), \(N \in\mathbb{N}\) and \(s_{2}> 0\), \(t_{2}\geq0\).
Proof
It follows from Corollary 7 as a special case of inequalities (36), for the function \(t\mapsto t\log t\), which is convex when the logarithm base is greater than 1. It can also be derived from Corollary 1 and Remark 2 in the context of the Zipf–Mandelbrot law. □
When both p and q are defined via the Zipf–Mandelbrot law (3) or via the Zipf law (5), the following statements hold.
Corollary 11
Let \(N \in\mathbb{N}\) and \(s_{1}, s_{2}> 0\), \(t_{1},t_{2}\geq0\).
Proof
Inequalities (42) follow from Corollary 8 as a special case of inequalities (38), for the function \(t\mapsto t\log t\), which is convex when the logarithm base is greater than 1.
Similarly, inequalities (43) follow from Corollary 9 as a special case of inequalities (40). □
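As a numerical companion to these corollaries (our own sketch, with parameter values chosen arbitrarily), the Kullback–Leibler divergence between two such distributions can be evaluated directly; bounds of the type (42) and (43) must enclose this value:

```python
import math

def zm_pmf(N, s, t=0.0):
    """Zipf-Mandelbrot probabilities; t = 0 gives the Zipf law."""
    H = sum(1.0 / (k + t) ** s for k in range(1, N + 1))
    return [1.0 / ((i + t) ** s * H) for i in range(1, N + 1)]

def kl(p, q):
    """Kullback-Leibler divergence KL(p, q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p = zm_pmf(N=1000, s=1.1, t=2.7)  # Zipf-Mandelbrot law
q = zm_pmf(N=1000, s=0.9)         # Zipf law (t = 0)
print(kl(p, q))
```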
The following corollaries deal with the Hellinger distance (11) considering one or two N-tuples defined via the Zipf–Mandelbrot law or the Zipf law, as its special case.
Corollary 12
Let \(\mathbf{p}=(p_{1},\ldots,p_{N})\) be an N-tuple of nonnegative real numbers with \(P_{N}=\sum_{i=1}^{N}p_{i}\), \(N \in\mathbb{N}\) and \(s_{2}> 0\), \(t_{2}\geq0\).
Proof
It follows from Corollary 7 as a special case of inequalities (35), for the convex function \(t\mapsto\frac{1}{2} (\sqrt{t}-1 )^{2}\). It can also be derived from Corollary 2 and Remark 4 in the context of the Zipf–Mandelbrot law. □
When both p and q are defined via the Zipf–Mandelbrot law (3) or via the Zipf law (5), the following statements hold.
Corollary 13
Let \(N \in\mathbb{N}\) and \(s_{1}, s_{2}> 0\), \(t_{1},t_{2}\geq0\).
Proof
Inequalities (45) follow from Corollary 8 as a special case of inequalities (37), for the convex function \(t\mapsto\frac {1}{2} (\sqrt{t}-1 )^{2}\).
Similarly, inequalities (46) follow from Corollary 9 as a special case of inequalities (39). □
In the sequel we provide results of this type for the Bhattacharyya coefficient (13), starting with one N-tuple defined via the Zipf–Mandelbrot law and proceeding to both such N-tuples, as well as to the Zipf law as its special case.
Corollary 14
Let \(\mathbf{p}=(p_{1},\ldots,p_{N})\) be an N-tuple of nonnegative real numbers with \(P_{N}=\sum_{i=1}^{N}p_{i}\), \(N \in\mathbb{N}\) and \(s_{2}> 0\), \(t_{2}\geq0\).
Proof
It follows from Corollary 7 as a special case of inequalities (35), for the convex function \(t\mapsto-\sqrt{t}\). It can also be derived from Corollary 3 and Remark 5 in the context of the Zipf–Mandelbrot law. □
Corollary 15
Let \(N \in\mathbb{N}\) and \(s_{1}, s_{2}> 0\), \(t_{1},t_{2}\geq0\).
Proof
Inequalities (48) follow from Corollary 8 as a special case of inequalities (37), for the convex function \(t\mapsto-\sqrt{t}\).
Similarly, inequalities (49) follow from Corollary 9 as a special case of inequalities (39). □
In the same manner we proceed with analogous results for the chi-square divergence (14) and the total variation distance (15).
Corollary 16
Let \(\mathbf{p}=(p_{1},\ldots,p_{N})\) be an N-tuple of real numbers with \(P_{N}=\sum_{i=1}^{N}p_{i}\), \(N \in\mathbb{N}\) and \(s_{2}> 0\), \(t_{2}\geq0\).
Proof
It follows from Corollary 7 as a special case of inequalities (35), for the convex function \(t\mapsto(t-1)^{2}\). It can also be derived from Corollary 4 and Remark 6 in the context of the Zipf–Mandelbrot law. □
Corollary 17
Let \(N \in\mathbb{N}\) and \(s_{1}, s_{2}> 0\), \(t_{1},t_{2}\geq0\).
Proof
Inequalities (51) follow from Corollary 8 as a special case of inequalities (37), for the convex function \(t\mapsto(t-1)^{2}\).
Similarly, inequalities (52) follow from Corollary 9 as a special case of inequalities (39). □
Corollary 18
Let \(\mathbf{p}=(p_{1},\ldots,p_{N})\) be an N-tuple of real numbers with \(P_{N}=\sum_{i=1}^{N}p_{i}\), \(N \in\mathbb{N}\) and \(s_{2}> 0\), \(t_{2}\geq0\).
Proof
It follows from Corollary 7 as a special case of inequalities (35), for the convex function \(t\mapsto \vert t-1 \vert \). It can also be derived from Corollary 5 and Remark 7 in the context of the Zipf–Mandelbrot law. □
Corollary 19
Let \(N \in\mathbb{N}\) and \(s_{1}, s_{2}> 0\), \(t_{1},t_{2}\geq0\).
Proof
Inequalities (54) follow from Corollary 8 as a special case of inequalities (37), for the convex function \(t\mapsto \vert t-1 \vert \).
Similarly, inequalities (55) follow from Corollary 9 as a special case of inequalities (39). □
To conclude this section of Jensen-inequality-related results for f-divergences based on the Zipf–Mandelbrot law (or the Zipf law), for the triangular discrimination (16) we give only the latter variant: the bounds obtained when both N-tuples are defined via the Zipf law.
Corollary 20
Let \(N \in\mathbb{N}\) and \(s_{1}, s_{2}> 0\).
5 An application of the Zipf law
In this final section we show how the experimental character of the Zipf law can be interpreted through the bounds (43) obtained for the Kullback–Leibler divergence.
Namely, the coefficients \(s_{1}\) and \(s_{2}\) from the Zipf law were analyzed by Gelbukh and Sidorov in [25] for the Russian and English languages. They calculated these coefficients and their difference for each of the 39 literary texts in both languages, each containing more than 10,000 running words. In the process they obtained the averages \(s_{1}=0.892869\) for Russian and \(s_{2}=0.973863\) for English.
In this context, with the described experimental values of \(s_{1}\) and \(s_{2}\) inserted, the bounds for the Kullback–Leibler divergence in (43) assume the following form, which thus depends only on the parameter N.
Example 1
Proof
It follows directly from (43) when inserting the experimental values of \(s_{1}\) and \(s_{2}\). □
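As a purely numerical illustration (ours, not part of the paper), the Kullback–Leibler divergence between the two fitted Zipf laws can be evaluated directly for several values of N; the bounds in (43) must enclose these values:

```python
import math

def zipf_pmf(N, s):
    """Zipf-law probabilities 1 / (i^s * H_{N,s}), i = 1, ..., N."""
    H = sum(1.0 / k ** s for k in range(1, N + 1))
    return [1.0 / (i ** s * H) for i in range(1, N + 1)]

s1, s2 = 0.892869, 0.973863  # average Zipf coefficients for Russian and English [25]

for N in (100, 1000, 10000):
    p, q = zipf_pmf(N, s1), zipf_pmf(N, s2)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    print(N, kl)
```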
6 Conclusions
In this paper we investigated f-divergences that originate from the Csiszár functional and their link to the Jensen inequality through a specific type of Jensen-type interpolating inequalities. By means of these inequalities we derived new bounds for f-divergences, in general via the Csiszár functional and in particular for the Kullback–Leibler divergence, the Hellinger distance, the Bhattacharyya distance (coefficient), the \(\chi^{2}\)-divergence, the total variation distance and the triangular discrimination. Consequently, we deduced analogous results in the light of the well-known Zipf–Mandelbrot law, with the adequate probability mass functions and the adjusted form of the Csiszár functional. The Zipf–Mandelbrot law was analyzed as a generalization of the Zipf law, for which we also gave the corresponding results and an application in linguistics in order to accentuate its experimental character. Thus this paper touches on three important and widely investigated topics: the Jensen inequality, divergences (for probability distributions) and the Zipf–Mandelbrot law together with its less general, but no less important, form, the Zipf law. In this way, the paper can be of interest to mathematicians investigating any of these fields with an accent on mathematical inequalities, as well as to researchers in interdisciplinary fields (e.g., linguistics in this case).
Declarations
Acknowledgements
This publication was supported by the Ministry of Education and Science of the Russian Federation (Agreement number No.02.a03.21.0008) and the University of Split by means of the Grant of Research Funding (number 4-1212).
Funding
The funding for this research and the costs of publication were covered by a lump sum granted by the University of Split Research Funding.
Authors’ contributions
All authors contributed equally. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- Csiszár, I.: Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hung. 2, 299–318 (1967)
- Csiszár, I.: Information measures: a critical survey. In: Trans. 7th Prague Conf. on Info. Th., Statist. Decis. Funct., Random Processes and 8th European Meeting of Statist. B, pp. 73–86 (1978)
- Kullback, S.: Information Theory and Statistics. Wiley, New York (1959)
- Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
- Dragomir, S.S.: Some inequalities for the Csiszár Φ-divergence, pp. 1–13. RGMIA (2001)
- Taneja, I.J.: Bounds on triangular discrimination, harmonic mean and symmetric chi-square divergences. arXiv:math/0505238
- Mitrinović, D.S., Pečarić, J., Fink, A.M.: Classical and New Inequalities in Analysis. Kluwer Academic, Dordrecht (1993)
- Krnić, M., Lovričević, N., Pečarić, J.: On McShane’s functional’s properties and its applications. Period. Math. Hung. 66(2), 159–180 (2013)
- Krnić, M., Lovričević, N., Pečarić, J.: Superadditivity of the Levinson functional and applications. Period. Math. Hung. 71(2), 166–178 (2015)
- Krnić, M., Lovričević, N., Pečarić, J.: Jessen’s functional, its properties and applications. An. Ştiinţ. Univ. ‘Ovidius’ Constanţa, Ser. Mat. 20(1), 225–248 (2012)
- Dragomir, S.S., Pečarić, J., Persson, L.E.: Properties of some functionals related to Jensen’s inequality. Acta Math. Hung. 70(1–2), 129–143 (1996)
- Krnić, M., Lovričević, N., Pečarić, J., Perić, J.: Superadditivity and Monotonicity of the Jensen-Type Functionals. Element, Zagreb (2015)
- Manin, D.Y.: Mandelbrot’s model for Zipf’s law: can Mandelbrot’s model explain Zipf’s law for language? J. Quant. Linguist. 16(3), 274–285 (2009)
- Mandelbrot, B.: An informational theory of the statistical structure of language. In: Jackson, W. (ed.) Communication Theory, pp. 486–502. Academic Press, New York (1953)
- Mandelbrot, B.: Information Theory and Psycholinguistics. Scientific Psychology: Principles and Approaches. Basic Books, New York (1965)
- Montemurro, M.A.: Beyond the Zipf–Mandelbrot law in quantitative linguistics. arXiv:cond-mat/0104066
- Egghe, L., Rousseau, R.: Introduction to Informetrics. Quantitative Methods in Library, Documentation and Information Science. Elsevier, New York (1990)
- Silagadze, Z.K.: Citations and the Zipf–Mandelbrot law. Complex Syst. 11, 487–499 (1997)
- Mouillot, D., Lepretre, A.: Introduction of relative abundance distribution (RAD) indices, estimated from the rank-frequency diagrams (RFD), to assess changes in community diversity. Environ. Monit. Assess. 63(2), 279–295 (2000)
- Manaris, B., Vaughan, D., Wagner, C.S., Romero, J., Davis, R.B.: Evolutionary music and the Zipf–Mandelbrot law: developing fitness functions for pleasant music. In: Proceedings of 1st European Workshop on Evolutionary Music and Art (EvoMUSART 2003), Essex, pp. 522–534 (2003)
- Horváth, L., Pečarić, Ð., Pečarić, J.: Estimations of f- and Rényi divergences by using a cyclic refinement of the Jensen’s inequality. Bull. Malays. Math. Sci. Soc. (2017). https://doi.org/10.1007/s40840-017-0526-4
- Rényi, A.: On measures of entropy and information. In: Proc. Fourth Berkeley Symp. Math. Statist. Prob., Berkeley, vol. 1, pp. 547–561 (1961)
- Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948)
- van Erven, T., Harremoës, P.: Rényi divergence and Kullback–Leibler divergence. IEEE Trans. Inf. Theory 60(7), 3797–3820 (2014)
- Gelbukh, A., Sidorov, G.: Zipf and Heaps laws’ coefficients depend on language. In: Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2001), Mexico City (2001)