On the Berry-Esséen bound of frequency polygons for ϕ-mixing samples
Journal of Inequalities and Applications volume 2017, Article number: 65 (2017)
Abstract
Under some mild assumptions, the Berry-Esséen bound of frequency polygons for ϕ-mixing samples is presented. By the bound derived, we obtain the corresponding convergence rate of uniformly asymptotic normality, which is nearly \(O(n^{-1/6})\) under the given conditions.
1 Introduction
We first briefly introduce the concept of a ϕ-mixing sequence. Set
$$\phi(n)=\sup_{k\geq1}\ \sup_{A\in\mathcal{F}_{1}^{k},\,P(A)>0,\,B\in\mathcal{F}_{k+n}^{\infty}}\bigl\vert P(B\mid A)-P(B)\bigr\vert , $$
where \(\mathcal{F}_{1}^{k}=\sigma(X_{j}, 1\leq j\leq k)\) and \(\mathcal{F}_{k+n}^{\infty}=\sigma(X_{j}, j> k+n)\). The sequence \(\{X_{i}, i\geq1\}\) is called ϕ-mixing if \(\lim_{n\to\infty} \phi(n)=0\). The ϕ-mixing dependence was introduced by Dobrushin [1], and many applications have been found. See, for example, Dobrushin [1], Utev [2], Yang [3], Yang and Hu [4] and so on.
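To make the mixing coefficient concrete, the following minimal sketch computes it for a toy example that is not part of the paper: a stationary two-state Markov chain with a hypothetical transition matrix `P`. It assumes the standard fact that, for a stationary finite-state Markov chain, the Markov property reduces the supremum over past and future events to single states and a single future time, so that the coefficient at lag `m` equals the largest total variation distance between a row of \(P^{m}\) and the stationary law. The function name `phi_markov` and the lag convention (the number of steps between the conditioning time and the future time) are assumptions of this sketch.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def phi_markov(P, pi, m):
    """max_i sup_B |P(X_m in B | X_0 = i) - pi(B)| for a stationary chain.

    The inner supremum over events B is the total variation distance,
    i.e. half the L1 distance between row i of P^m and pi.
    """
    n = len(P)
    Pm = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(m):
        Pm = mat_mul(Pm, P)
    return max(0.5 * sum(abs(Pm[i][j] - pi[j]) for j in range(n))
               for i in range(n))


# Hypothetical two-state chain; pi solves pi P = pi.
P = [[0.7, 0.3], [0.2, 0.8]]
pi = [0.4, 0.6]
coeffs = [phi_markov(P, pi, m) for m in range(1, 6)]
print(coeffs)
```

For this chain the coefficients decay geometrically (the second eigenvalue of `P` is 0.5), so a polynomial decay condition of the form \(\phi(n)=O(n^{-\rho})\) holds for every \(\rho>0\).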
Next, we introduce the concept of the frequency polygon. Suppose that X is a random variable with a density function \(f(x)\), and let \(X_{1}, X_{2}, \ldots, X_{n}\) be a sample drawn from the population X. Consider a partition \(\cdots< x_{-2}< x_{-1}< x_{0}< x_{1}< x_{2}<\cdots\) of the real line into equal intervals \(I_{k}=[(k-1)b_{n}, kb_{n})\) of length \(b_{n}\), where \(b_{n}\) is the bin width. For given \(x\in R\), there exists \(k_{0}\) such that \((k_{0}-\frac{1}{2})b_{n}\leq x<(k_{0}+\frac{1}{2})b_{n}\). Consider the two adjacent histogram bins \(I_{k_{0}}=[(k_{0}-1)b_{n},k_{0}b_{n})\) and \(I_{k_{1}}=[k_{0}b_{n},k_{1}b_{n})\), where \(k_{1}=k_{0}+1\). Define \(v_{k_{0}}=\sum^{n}_{i=1}I((k_{0}-1)b_{n}\leq X_{i}< k_{0}b_{n})\) and \(v_{k_{1}}=\sum^{n}_{i=1}I(k_{0}b_{n}\leq X_{i}< k_{1}b_{n})\), which are the numbers of observations falling in \(I_{k_{0}}\) and \(I_{k_{1}}\), respectively. The values of the histogram in these two bins are \(f_{k_{0}}=v_{k_{0}}n^{-1}b_{n}^{-1}\) and \(f_{k_{1}}=v_{k_{1}}n^{-1}b_{n}^{-1}\). Then the frequency polygon \(\widehat{f}(x)\) is defined as
$$\widehat{f}(x)= \biggl( \frac{1}{2}+k_{0}-\frac{x}{b_{n}} \biggr) f_{k_{0}}+ \biggl( \frac{1}{2}-k_{0}+\frac{x}{b_{n}} \biggr) f_{k_{1}} $$
for \(x\in[ ( k_{0}-\frac{1}{2} ) b_{n}, ( k_{0}+ \frac{1}{2} ) b_{n})\).
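The construction can be sketched in a few lines of code. The following is a minimal illustration, assuming the usual form of the frequency polygon (Scott [5]) as the linear interpolation between the heights of the two adjacent histogram bins at their midpoints; the function name `frequency_polygon` and the toy data are hypothetical.

```python
import math


def frequency_polygon(sample, x, b):
    """Frequency polygon estimate at x for bins [(k-1)b, kb) of width b."""
    n = len(sample)
    # k0 is the integer with (k0 - 1/2) b <= x < (k0 + 1/2) b.
    k0 = math.floor(x / b + 0.5)
    # Counts in the adjacent bins [(k0-1)b, k0 b) and [k0 b, (k0+1)b).
    v0 = sum(1 for xi in sample if (k0 - 1) * b <= xi < k0 * b)
    v1 = sum(1 for xi in sample if k0 * b <= xi < (k0 + 1) * b)
    # Histogram heights in the two bins.
    f0 = v0 / (n * b)
    f1 = v1 / (n * b)
    # Interpolation weight: 1 at the midpoint of the left bin,
    # decreasing linearly to 0 at the midpoint of the right bin.
    w = 0.5 + k0 - x / b
    return w * f0 + (1.0 - w) * f1


data = [0.1, 0.2, 0.6, 0.7, 1.2, 1.4, 1.9]
print(frequency_polygon(data, 1.0, 1.0))
```

At a bin midpoint the estimate coincides with the histogram height of that bin, and between two midpoints it moves linearly from one height to the other.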
As pointed out by Scott [5], the frequency polygon has convergence rates similar to those of kernel density estimators and better than the rate for a histogram. Its computational effort is equivalent to that of the histogram. For large bivariate data sets, the computational simplicity of the frequency polygon and the ease of determining exact equiprobable contours may outweigh the increased accuracy of a kernel density estimator. Bivariate contour plots based on millions of observations are increasingly required in applications including high-energy physics simulation experiments, cell sorters and geographical data representation. Moreover, such data are usually collected in binned form. The frequency polygon can therefore be a useful tool for examining and presenting data. Because of these advantages, the frequency polygon has attracted the attention of several authors. For explicit results, one can refer to the references listed in Yang and Liang [6] and Xing et al. [7], which gave the strong consistency of frequency polygons. Asymptotic normality was studied in Carbon et al. [8]. To our knowledge, however, a Berry-Esséen bound for ϕ-mixing samples has not yet been established. This motivates us to investigate the Berry-Esséen bound of the frequency polygon under ϕ-mixing samples. Under the given assumptions, we give the corresponding Berry-Esséen bound. Furthermore, from this bound we derive the convergence rate of uniformly asymptotic normality, which is nearly \(O(n^{-1/6})\) under the given conditions.
Throughout this article, C denotes a positive constant which depends only on some given numbers and may vary from one place to another. The paper is organized as follows. Section 2 contains the main result, and Section 3 contains its proof.
2 Main result
To state our main result, we need the following assumptions.
- (A1): The density function \(f(x)\) is continuous in \(x\in R\) and \(f(x)\leq M\) for \(x\in R\) and some \(M>0\).
- (A2): The random sample \(\{X_{i}, 1\leq i\leq n\}\) is stationary, identically distributed and ϕ-mixing with \(\phi(n)=O(n^{-\rho})\), where \(\rho>\frac{7-58\epsilon}{36\epsilon}\) with \(0<\epsilon<\frac{57}{1650}\).
- (A3): The bin width \(b_{n}\) satisfies \(b_{n}\rightarrow0\) and \(nb_{n}\rightarrow\infty\).
- (A4): Let \(\sigma^{2}_{n}(x):=\operatorname{Var}(\widehat{f}(x))\). There exist \(\delta>0\) and two positive integers \(p:=p_{n}\) and \(q:=q_{n}\) such that
$$p+q\leq n,\qquad qp^{-1}\leq C< \infty $$
and
$$\gamma_{1n}\rightarrow0,\qquad \gamma_{2n}\rightarrow0,\qquad \gamma_{3n} \rightarrow0, \qquad u_{1}(q)\rightarrow0,\qquad u_{2}(q)\rightarrow0, $$
as \(n\rightarrow\infty\), where
$$\begin{aligned}& \gamma_{1n}:=q(p+q)^{-1}\bigl(nb_{n} \sigma^{2}_{n}(x)\bigr)^{-1},\qquad \gamma_{2n}:=p \bigl(nb_{n}^{1/2}\sigma_{n}(x)\bigr)^{-2}, \\& \gamma_{3n}:=p^{\delta/2}(nb_{n})^{-(1+\delta)}\bigl( \sigma_{n}(x)\bigr)^{-(2+ \delta)},\qquad u_{1}(q):=q^{-\rho/2+1} \bigl(nb_{n}^{3/2}\sigma_{n}^{2}(x) \bigr)^{-1} \end{aligned}$$
and
$$u_{2}(q):=\bigl(q^{\rho}pb_{n}\sigma_{n}^{2}(x) \bigr)^{-1/2}. $$
Based on the above assumptions, our main result can be given as follows.
Theorem 2.1
Suppose that assumptions (A1)-(A4) are satisfied. Then we have
where \(F_{n}(u)=P(S_{n}< u)\), \(S_{n}=\sigma_{n}(x)^{-1}\{\widehat{f}(x)-E \widehat{f}(x)\}\) and \(\Phi(u)\) is the distribution function of the standard normal random variable.
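The quantity controlled by Theorem 2.1, the uniform distance between \(F_{n}\) and \(\Phi\), can be explored numerically. The following rough Monte Carlo sketch is only an illustration and does not verify the theorem. It assumes i.i.d. standard normal data (an independent sequence is trivially ϕ-mixing with \(\phi(n)=0\)), hypothetical values \(n=400\), \(b_{n}=0.5\), \(x=0.3\), and it replaces \(E\widehat{f}(x)\) and \(\sigma_{n}(x)\) by Monte Carlo estimates. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def freq_polygon(sample, x, b):
    """Frequency polygon at x: interpolation of two adjacent bin heights."""
    n = len(sample)
    k0 = math.floor(x / b + 0.5)
    v0 = sum(1 for xi in sample if (k0 - 1) * b <= xi < k0 * b)
    v1 = sum(1 for xi in sample if k0 * b <= xi < (k0 + 1) * b)
    w = 0.5 + k0 - x / b
    return (w * v0 + (1.0 - w) * v1) / (n * b)


def kolmogorov_distance(n=400, b=0.5, x=0.3, reps=2000, seed=0):
    """Empirical sup-distance between the standardized estimator and N(0,1)."""
    rng = random.Random(seed)
    est = [freq_polygon([rng.gauss(0.0, 1.0) for _ in range(n)], x, b)
           for _ in range(reps)]
    # Standardize with Monte Carlo estimates of the mean and the
    # standard deviation (assumption: these stand in for the exact values).
    mu, sd = fmean(est), pstdev(est)
    s = sorted((e - mu) / sd for e in est)
    cdf = NormalDist().cdf
    # Compare the empirical distribution function just before and just
    # after each order statistic with the normal distribution function.
    return max(max(abs((i + 1) / reps - cdf(u)), abs(i / reps - cdf(u)))
               for i, u in enumerate(s))


print(kolmogorov_distance())
```

The printed distance mixes the genuine approximation error with Monte Carlo noise of order \(\mathit{reps}^{-1/2}\), so it should be read only as a rough indication.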
By Theorem 2.1, the following corollary can be obtained directly.
Corollary 2.1
Let the conditions of Theorem 2.1 be satisfied, and let \(p=[n^{\tau}]\), \(q=[n^{2\tau-1}]\), \(\delta=10/9\), \(b_{n}=O(n^{-1/5})\) and \(\sigma_{n}(x)=O(n^{-2/5})\) in (2.1), where \(\tau=1/2+3\epsilon\). Then
Remark 2.1
By (2.2), the convergence rate is nearly \(O(n^{-1/6})\) as \(\epsilon\rightarrow0^{+}\), as desired. For comparison, Yang and Hu [4] also gave Berry-Esséen bounds for the kernel density estimator under ϕ-mixing samples and obtained the rate of convergence \(O(n^{-1/6}\log n \log \log n)\). The two convergence rates are thus similar.
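The orders of the quantities in assumption (A4) under the choices of Corollary 2.1 can be checked by exact arithmetic on exponents of n. The sketch below makes simplifying assumptions: each \(O(\cdot)\) is treated as an exact power of n (\(b_{n}=n^{-1/5}\), \(\sigma_{n}(x)=n^{-2/5}\)), integer parts in \(p=[n^{\tau}]\) and \(q=[n^{2\tau-1}]\) are ignored, and \(q(p+q)^{-1}\) is replaced by \(qp^{-1}\). The function name `a4_exponents` and the sample values of ε and ρ are hypothetical.

```python
from fractions import Fraction


def a4_exponents(eps, rho):
    """Exponents of n for the (A4) quantities under Corollary 2.1's choices."""
    eps, rho = Fraction(eps), Fraction(rho)
    tau = Fraction(1, 2) + 3 * eps
    delta = Fraction(10, 9)
    p, q = tau, 2 * tau - 1                    # p = n^tau, q = n^(2 tau - 1)
    b, s = Fraction(-1, 5), Fraction(-2, 5)    # b_n = n^b, sigma_n = n^s
    return {
        # gamma_1n = q (p+q)^{-1} (n b_n sigma_n^2)^{-1}, with q/(p+q) ~ q/p
        "gamma1": (q - p) - (1 + b + 2 * s),
        # gamma_2n = p (n b_n^{1/2} sigma_n)^{-2}
        "gamma2": p - 2 * (1 + b / 2 + s),
        # gamma_3n = p^{delta/2} (n b_n)^{-(1+delta)} sigma_n^{-(2+delta)}
        "gamma3": delta / 2 * p - (1 + delta) * (1 + b) - (2 + delta) * s,
        # u_1(q) = q^{-rho/2+1} (n b_n^{3/2} sigma_n^2)^{-1}
        "u1": (1 - rho / 2) * q - (1 + Fraction(3, 2) * b + 2 * s),
        # u_2(q) = (q^rho p b_n sigma_n^2)^{-1/2}
        "u2": -Fraction(1, 2) * (rho * q + p + b + 2 * s),
    }


# eps = 1/100 requires rho > (7 - 58 eps) / (36 eps), about 17.83.
exps = a4_exponents(Fraction(1, 100), 18)
for name, e in exps.items():
    print(name, e)
```

With these sample values every exponent is negative, so all five quantities tend to zero. In general the exponent of \(\gamma_{3n}\) works out to \(-1/6+5\epsilon/3\), which is in line with the rate of nearly \(O(n^{-1/6})\) quoted in Remark 2.1.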
3 Proof
Before proving Theorem 2.1, we introduce some notation used later. Let
and let p, q be as in assumption (A4), \(v:=v_{n}=[n/(p+q)]\),
and
Then
Next, some lemmas are given, which will be applied later.
Lemma 3.1
Yang [3]
Assume that \(\{X_{i}, i\geq1\}\) is a sequence of ϕ-mixing random variables satisfying \(EX_{i}=0\), \(E\vert X_{i}\vert ^{q}<\infty\) for \(q\geq2\) and \(i=1,2,\ldots \) and \(\sum_{i=1}^{\infty}\phi^{1/2}(i)<\infty\). Then there exists a positive constant C such that
for \(n\geq1\).
Lemma 3.2
Under the conditions of Theorem 2.1, we have
and
Proof
By assumption (A1), it follows that \(EZ_{n,i}^{2}=E\vert (nb_{n}\sigma_{n}(x))^{-1}\eta_{i}(x)\vert ^{2}\leq C\frac{1}{n^{2} b_{n} \sigma_{n}^{2} (x)}\). Hence, in terms of Lemma 3.1, we obtain that
and that
Therefore, (3.7) holds. By Markov’s inequality and (3.9), (3.8) is obtained directly. The proof is complete. □
Lemma 3.3
For any integer k, there exist \(\zeta_{k_{s}} \in I_{k_{s}}\) (\(s=0,1\)) such that
and
Proof
By Theorem 5.1 in Roussas and Ioannides [9] and the proof of Corollary 2.1 in Carbon et al. [10], we can get the results directly. The details are omitted here. □
Lemma 3.4
Under the conditions of Theorem 2.1, we have
where \(s_{n}=\sqrt{\sum_{m=1}^{v}\operatorname{Var}(y_{n,m}^{\prime})}\).
Proof
Set \(\Gamma_{n}=\sum_{1\leq i< j\leq v} \operatorname{Cov}(y_{n,i}^{\prime},y_{n,j}^{\prime})\). Obviously,
Since \(E(S_{n})^{2}=1\) and \(E(S_{n}^{\prime })^{2}=E[S_{n}-(S_{n}^{\prime}+S _{n}^{\prime\prime })]^{2} =ES_{n}^{2}-2E[S_{n}(S_{n}^{\prime}+S_{n}^{\prime\prime })]+E(S_{n} ^{\prime}+S_{n}^{\prime\prime })^{2}\), we have
On the other hand, by Lemma 3.3, it follows that
which together with (3.14) and (3.15) yields (3.13). The proof is completed. □
Assume that \(\{\eta_{nm}: m=1, \dots, v\}\) are independent random variables, and the distribution of \(\eta_{nm}\) is the same as that of \(y_{nm}^{\prime}\) for \(m=1, \dots, v\). Let \(T_{n}=\sum_{m=1}^{v}{\eta_{nm}}\), \(B_{n}=\sum_{m=1}^{v}\operatorname{Var}( \eta_{n,m})\) and \(\widetilde{F}_{n}(u)\), \(G_{n}(u)\) and \(\widetilde{G} _{n}(u)\) be the distribution functions of \(S_{n}^{\prime}\), \(T_{n}/\sqrt{B _{n}}\) and \(T_{n}\), respectively. Clearly,
Lemma 3.5
Under the conditions of Theorem 2.1, we have
Proof
By Lemma 3.1, \(\vert z_{n,i}\vert \leq\frac{e_{i,k_{0}}(x)+e _{i,k_{1}}(x)}{nb_{n}\sigma_{n}(x)}\) and assumption (A1), we have
which together with \(B_{n}=s_{n}^{2}\rightarrow1\) yielded by Lemma 3.4 implies that (3.17) holds by the Berry-Esséen theorem. □
Lemma 3.6
Let \(\{X_{i}, i\geq1\}\) be a sequence of ϕ-mixing random variables, and let \(\eta_{l}=\sum_{i=(l-1)(p+q)+1}^{(l-1)(p+q)+p}X_{i}\), where \(1\leq l\leq k\). If \(\frac{1}{r}+\frac{1}{s}=1\), where \(r>0\), \(s>0\), then
Proof
Obviously,
Noting that \(e^{ix}=\cos x+i\sin x\), \(\sin(x+y)=\sin x\cos y+\cos x \sin y\) and \(\cos(x+y)=\cos x\cos y-\sin x\sin y\), we get
From Theorem 5.1 in Roussas and Ioannides [9] and \(\vert \sin x\vert \leq \vert x\vert \), it follows that
Also, by \(\cos(2x)=1-2\sin^{2}x\), we get that
Similarly,
A combination of (3.19)-(3.23) yields that
Repeating the above procedure yields (3.18). The proof is completed. □
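The block sums \(\eta_{l}\) in Lemma 3.6 run over blocks of length p that are separated by gaps of length q. The following minimal sketch lists this index layout for \(v=[n/(p+q)]\) blocks. The big-block ranges follow the definition of \(\eta_{l}\); the small-block and remainder ranges follow the usual big-block/small-block (Bernstein) convention and are an assumption of this sketch, as is the function name `bernstein_blocks`.

```python
def bernstein_blocks(n, p, q):
    """Split the indices 1..n into big blocks, small blocks and a remainder."""
    v = n // (p + q)
    # Big block l: (l-1)(p+q)+1, ..., (l-1)(p+q)+p, as in eta_l.
    big = [range((l - 1) * (p + q) + 1, (l - 1) * (p + q) + p + 1)
           for l in range(1, v + 1)]
    # Small block l: (l-1)(p+q)+p+1, ..., l(p+q)  (assumed convention).
    small = [range((l - 1) * (p + q) + p + 1, l * (p + q) + 1)
             for l in range(1, v + 1)]
    # Remainder: v(p+q)+1, ..., n  (assumed convention).
    rest = range(v * (p + q) + 1, n + 1)
    return big, small, rest


big, small, rest = bernstein_blocks(n=23, p=5, q=2)
print([list(r) for r in big])
print([list(r) for r in small])
print(list(rest))
```

Consecutive big blocks are at least q indices apart, which is what allows the mixing coefficient at lag q to control the dependence between the block sums.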
Lemma 3.7
Under the conditions of Theorem 2.1, we have
Proof
Let \(\varphi(t)\) and \(\psi(t)\) be the characteristic functions of \(S_{n}^{\prime}\) and \(T_{n}\), respectively. Noting
we have
Also, from Lemma 3.1, it follows that
Then we have
which implies that
On the other hand, by \(\widetilde{G}_{n}(u)=G_{n}(u/s_{n})\) and Lemma 3.5, we get
Therefore,
which together with (3.26) implies
by the Esséen theorem and letting \(T=u_{2}^{-1/2}(q)\). The proof is complete. □
Lemma 3.8
Yang [11]
Suppose that \(\{ \zeta_{n}:n \geq1 \} \) and \(\{ \eta_{n}:n\geq1 \} \) are two random variable sequences, \(\{ \gamma_{n}:n\geq1 \} \) is a positive constant sequence and \(\gamma_{n}\to0\). If
then for any \(\varepsilon> 0\),
We are now ready to prove Theorem 2.1.
Proof of Theorem 2.1
It is easy to see that
By Lemmas 3.7, 3.5 and 3.4, we can obtain
and
which together with (3.5), (3.8) and (3.27) implies (2.1). The proof is completed. □
References
[1] Dobrushin, RL: The central limit theorem for non-stationary Markov chain. Theory Probab. Appl. 1, 72-88 (1956)
[2] Utev, SA: On the central limit theorem for ϕ-mixing arrays of random variables. Theory Probab. Appl. 35, 131-139 (1990)
[3] Yang, S: Almost sure convergence of weighted sums of mixing sequences. J. Syst. Sci. Math. Sci. 15(3), 254-265 (1995) (in Chinese)
[4] Yang, W, Hu, S: The Berry-Esséen bounds for kernel density estimator under dependent sample. J. Inequal. Appl. 2012, 287 (2012)
[5] Scott, DW: Frequency polygons: theory and application. J. Am. Stat. Assoc. 80(390), 348-354 (1985)
[6] Yang, S, Liang, D: Strong consistency of frequency polygon density estimator for ϕ-mixing sequence. J. Guangxi Norm. Univ. Nat. Sci. Ed. 30(3), 16-21 (2012) (in Chinese)
[7] Xing, G, Yang, S, Liang, X: On the uniform consistency of frequency polygons for ψ-mixing samples. J. Korean Stat. Soc. 44, 179-186 (2015)
[8] Carbon, M, Francq, C, Tran, LT: Asymptotic normality of frequency polygons for random fields. J. Stat. Plan. Inference 140(2), 502-514 (2010)
[9] Roussas, GG, Ioannides, D: Moment inequalities for mixing sequences of random variables. Stoch. Anal. Appl. 5(1), 61-120 (1987)
[10] Carbon, M, Garel, B, Tran, LT: Frequency polygons for weakly dependent processes. Stat. Probab. Lett. 33, 1-13 (1997)
[11] Yang, S: Uniformly asymptotic normality of regression weighted estimator for negatively associated sample. Stat. Probab. Lett. 62, 101-110 (2003)
Acknowledgements
The authors are grateful to two anonymous referees for providing valuable comments which improved the first manuscript. This research is supported by the National Natural Science Foundation of China under grants No. 61561006 and No. 61573111 and the National Science Foundation of China (No. 11461009) and Guangxi Natural Science Foundation (no. 2015GXNSFDAA139003).
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
The authors contributed equally to this work. They both read and approved the final version of the manuscript.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Huang, Gj., Xing, G. On the Berry-Esséen bound of frequency polygons for ϕ-mixing samples. J Inequal Appl 2017, 65 (2017). https://doi.org/10.1186/s13660-017-1336-9
DOI: https://doi.org/10.1186/s13660-017-1336-9