The asymptotic normality of internal estimator for nonparametric regression
Journal of Inequalities and Applications volume 2018, Article number: 231 (2018)
Abstract
In this paper, we study the asymptotic properties of the internal estimator of nonparametric regression with independent and dependent data. Under some weak conditions, we present results on the asymptotic normality of the estimator. Our results extend some corresponding results in the literature.
1 Introduction
In this paper, we consider the nonparametric regression model
$$Y_{i}=m(X_{i})+U_{i},\quad 1\leq i\leq n,$$
where \((X_{i},Y_{i})\in R^{d}\times R\), \(d\geq1\), and \(U_{i}\) are random variables satisfying \(E(U_{i}|X_{i})=0\), \(1\leq i\leq n\), \(n\geq1\). So we have
$$m(x)=E(Y_{i}|X_{i}=x).$$
Let \(K(x)\) be a kernel function. Define \(K_{h}(x)=h^{-d}K(x/h)\), where \(h=h_{n}\) is a sequence of positive bandwidths tending to zero as \(n\rightarrow\infty\). Kernel-type estimators of the regression function are widely used in various situations because of their flexibility and efficiency with both independent and dependent data. For independent data, Nadaraya [1] and Watson [2] gave the most popular nonparametric estimator of the unknown function \(m(x)\), the Nadaraya–Watson estimator \(\widehat{m}_{\mathrm{NW}}(x)\):
$$\widehat{m}_{\mathrm{NW}}(x)=\frac{\sum_{i=1}^{n}Y_{i}K_{h}(x-X_{i})}{\sum_{i=1}^{n}K_{h}(x-X_{i})}. \quad (1.1)$$
Jones et al. [3] considered various versions of kernel-type regression estimators, such as the Nadaraya–Watson estimator (1.1) and the local linear estimator. They also investigated the internal estimator
$$\widehat{m}_{n}(x)=\frac{1}{n}\sum_{i=1}^{n}\frac{Y_{i}K_{h}(x-X_{i})}{f(X_{i})} \quad (1.2)$$
for a known density \(f(\cdot)\). Here the factor \(\frac{1}{f(X_{i})}\) is internal to the summation, whereas the estimator \(\widehat{m}_{\mathrm{NW}}(x)\) has the factor \(\frac{1}{\widehat{f}(x)}=\frac{1}{n^{-1}\sum_{i=1}^{n}K_{h}(x-X_{i})}\) external to the summation.
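As an illustration (our sketch, not part of the paper), the difference between the external normalization of \(\widehat{m}_{\mathrm{NW}}(x)\) and the internal normalization of (1.2) can be seen in a few lines of code, under the assumed choices \(d=1\), a Gaussian kernel, and the known uniform density \(f\equiv1\) on \([0,1]\):

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def nw_estimator(x, X, Y, h):
    # External normalization: divide by the kernel density estimate at x.
    w = gaussian_kernel((x - X) / h) / h          # K_h(x - X_i)
    return np.sum(w * Y) / np.sum(w)

def internal_estimator(x, X, Y, h, f):
    # Internal normalization: each summand is divided by f(X_i).
    w = gaussian_kernel((x - X) / h) / h          # K_h(x - X_i)
    return np.mean(w * Y / f(X))

rng = np.random.default_rng(0)
n, h = 5000, 0.1
X = rng.uniform(0.0, 1.0, n)                      # f(x) = 1 on [0, 1]
Y = np.sin(2.0 * np.pi * X) + 0.1 * rng.standard_normal(n)

f = lambda x: np.ones_like(x)                     # known density
x0 = 0.5                                          # m(x0) = sin(pi) = 0
print(nw_estimator(x0, X, Y, h))
print(internal_estimator(x0, X, Y, h, f))
```

Both estimates should be close to \(m(x_{0})=0\); the internal estimator replaces the random denominator \(\widehat{f}(x)\) by the known values \(f(X_{i})\).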
The internal estimator was first proposed by Mack and Müller [4]. Jones et al. [3] studied various kernel-type regression estimators, including the internal estimator (1.2). Linton and Nielsen [5] introduced an integration method based on direct integration of the initial pilot estimator (1.2). Linton and Jacho-Chávez [6] studied another internal estimator
$$\widetilde{m}_{n}(x)=\frac{1}{n}\sum_{i=1}^{n}\frac{Y_{i}K_{h}(x-X_{i})}{\widehat{f}(X_{i})}, \quad (1.3)$$
where \(\widehat{f}(X_{i})=\frac{1}{n}\sum_{j=1}^{n} L_{b}(X_{i}-X_{j})\) and \(L_{b}(\cdot)=L(\cdot/b)/b^{d}\). Here \(L(\cdot)\) is a kernel function, b is the bandwidth, and the density \(f(\cdot)\) is unknown. For independent data, Linton and Jacho-Chávez [6] obtained the asymptotic normality of the internal estimator \(\widetilde{m}_{n}(x)\) in (1.3). Shen and Xie [7] obtained the complete convergence and uniform complete convergence of the internal estimator \(\widehat{m}_{n}(x)\) in (1.2) under geometrically α-mixing (strong mixing) data. Li et al. [8] weakened the conditions of Shen and Xie [7] and obtained the convergence rate and uniform convergence rate in probability for the estimator \(\widehat{m}_{n}(x)\).
To the best of our knowledge, there are no results on the asymptotic normality of the internal estimator \(\widehat{m}_{n}(x)\). Similarly to Linton and Jacho-Chávez [6], we investigate the asymptotic normality of the internal estimator \(\widehat{m}_{n}(x)\) with independent data and φ-mixing data, respectively. The asymptotic normality results are presented in Sect. 3.
Denote \(\mathcal{F}_{n}^{m}=\sigma(X_{i}, n\leq i\leq m)\) and define the coefficients
$$\varphi(n)=\sup_{k\geq1}\sup\bigl\{|P(B|A)-P(B)|: A\in\mathcal{F}_{1}^{k}, P(A)>0, B\in\mathcal{F}_{k+n}^{\infty}\bigr\}.$$
If \(\varphi(n)\downarrow0\) as \(n\rightarrow\infty\), then \(\{X_{n}\}_{n\geq1}\) is said to be a φ-mixing sequence.
The concept of φ-mixing was introduced by Dobrushin [9], and many properties of φ-mixing sequences are presented in Chap. 4 of Billingsley [10]. For example, under suitable conditions, an autoregressive moving average (ARMA) process is φ-mixing with geometrically decreasing coefficients, that is, a geometric φ-mixing sequence. Györfi et al. [11, 12] gave more examples and applications to nonparametric estimation. We also refer to Fan and Yao [13] and Bosq and Blanke [14] for work on nonparametric regression under independent and dependent data.
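A simple worked example of the definition (our illustration, not from the paper): an m-dependent sequence is φ-mixing because its mixing coefficients vanish beyond lag m.

```latex
% Example (illustrative): let \varepsilon_1, \varepsilon_2, \ldots be i.i.d.
% and X_n = g(\varepsilon_n, \ldots, \varepsilon_{n+m}) for a fixed m \ge 0
% and a measurable g, so that \{X_n\} is m-dependent. For n > m the
% \sigma-algebras \mathcal{F}_1^k and \mathcal{F}_{k+n}^\infty are
% independent, hence
\varphi(n) = \sup_{k \ge 1}\,
  \sup_{\substack{A \in \mathcal{F}_1^k,\ P(A) > 0 \\ B \in \mathcal{F}_{k+n}^\infty}}
  \bigl| P(B \mid A) - P(B) \bigr| = 0, \qquad n > m,
% so \varphi(n) \downarrow 0 and \sum_{n \ge 1} \varphi^{1/2}(n) < \infty
% hold trivially.
```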
Regarding notation, for \(x=(x_{1},\ldots,x_{d})\in R^{d}\), set \(\|x\|=\max(|x_{1}|,\ldots,|x_{d}|)\). Throughout the paper, \(c,c_{1},c_{2},c_{3},\ldots,B_{0},B_{1}\) denote positive constants not depending on n, which may differ in various places; \(\lfloor x\rfloor\) denotes the largest integer not exceeding x; → means taking the limit as \(n\rightarrow\infty\); \(c_{n}\sim d_{n}\) means that \(\frac{c_{n}}{d_{n}}\rightarrow1\); \(\xrightarrow{\mathscr{D}}\) denotes convergence in distribution; and \(X\stackrel{\mathscr{D}}{=}Y\) means that the random variables X and Y have the same distribution. A sequence \(\{X_{i},i\geq1\}\) is said to be second-order stationary if \((X_{1},X_{1+k})\stackrel{\mathscr{D}}{=} (X_{i},X_{i+k})\) for \(i\geq1\), \(k\geq1\).
2 Some assumptions
In this section, we list some assumptions.
Assumption 2.1
There exist two positive constants \(\bar{K}>0\) and \(\mu>0\) such that
Assumption 2.2
Let \(S_{f}\) denote the compact support of the known density \(f(\cdot)\) of \(X_{1}\). For \(x\in S_{f}\), the function \(m(x)\) is twice differentiable, and there exists a positive constant b such that
The kernel density function is symmetric and satisfies
Assumption 2.3
We assume that the observed data \(\{(X_{i},Y_{i}),i\geq1\}\) is an independent and identically distributed stochastic sequence with values in \(R^{d}\times R\). The known density \(f(\cdot)\) of \(X_{1}\) has compact support \(S_{f}\) and satisfies \(\inf_{x\in S_{f}}f(x)>0\). For \(0<\delta\leq1\), we suppose that
and
Assumption 2.3∗
We assume that the observed data \(\{(X_{i},Y_{i}),i\geq1\}\) is a second-order stationary stochastic sequence with values in \(R^{d}\times R\). The sequence \(\{(X_{i},Y_{i}),i\geq1\}\) is also assumed to be φ-mixing with \(\sum_{n=1}^{\infty}\varphi^{1/2}(n)<\infty\). The known density \(f(\cdot)\) of \(X_{1}\) has compact support \(S_{f}\) and satisfies \(\inf_{x\in S_{f}}f(x)>0\). Let (2.2) and (2.3) be fulfilled. Moreover, for all \(j\geq1\), we have
where \(f_{j}(x_{1},x_{j+1})\) denotes the joint density of \((X_{1},X_{j+1})\).
Remark 2.1
Assumption 2.1 is a usual condition on the kernel function, and Assumption 2.2 is used to obtain the convergence rate of \(|E\widehat{m}_{n}(x)-m(x)|\). Assumptions 2.3 and 2.3∗ are conditions on the independent and dependent data \(\{(X_{i},Y_{i}),i\geq 1\}\), respectively. Similarly to Hansen [15], conditions (2.2) and (2.3) are used to control the tail behavior of the conditional expectation \(E(|Y_{1}|^{2+\delta}|X_{1}=x)\), and (2.4) is used to estimate the covariance \(\operatorname{Cov}(Y_{1},Y_{j+1})\).
3 Asymptotic normality of internal estimator \(\widehat{m}_{n}(x)\) with independent and dependent data
In this section, we show some results on asymptotic normality of the internal estimator of a nonparametric regression model with independent and dependent data. Theorem 3.1 is for independent data, and Theorem 3.2 is for φ-mixing data.
Theorem 3.1
Let Assumptions 2.1–2.3 hold, and let \(\lim_{\|u\|\rightarrow\infty}\|u\|^{d}K(u)=0\). Suppose that \(\frac{E(Y_{1}^{2}|X_{1}=x)}{f(x)}\) is positive and continuous at the point \(x\in S_{f}\). If \(0< h^{d}\rightarrow0\), \(nh^{d}\rightarrow\infty\), and \(nh^{d+4}\rightarrow0\) as \(n\rightarrow\infty\), then
$$\sqrt{nh^{d}}\bigl(\widehat{m}_{n}(x)-m(x)\bigr)\xrightarrow{\mathscr{D}} N\bigl(0,\sigma^{2}(x)\bigr), \quad (3.1)$$
where \(\sigma^{2}(x)=\frac{E(Y_{1}^{2}|X_{1}=x)}{f(x)}\int_{R^{d}} K^{2}(u)\,du\).
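To make the statement concrete, here is a small Monte Carlo sketch (our illustration, with the assumed choices \(d=1\), \(f\) uniform on \([0,1]\), \(m(x)=x^{2}\), a Gaussian kernel with \(\int_{R}K^{2}(u)\,du=\frac{1}{2\sqrt{\pi}}\), and \(h=n^{-1/3}\)): the empirical variance of \(\sqrt{nh^{d}}(\widehat{m}_{n}(x)-m(x))\) should be close to \(\sigma^{2}(x)\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 5000, 300
h = n ** (-1.0 / 3.0)                  # beta = 1/3 lies in (1/(d+4), 1/d)
x0, s = 0.5, 0.5                       # evaluation point, noise sd
m = lambda x: x ** 2

vals = np.empty(reps)
for r in range(reps):
    X = rng.uniform(0.0, 1.0, n)       # known density f(x) = 1 on [0, 1]
    Y = m(X) + s * rng.standard_normal(n)
    Kh = np.exp(-0.5 * ((x0 - X) / h) ** 2) / (np.sqrt(2.0 * np.pi) * h)
    m_hat = np.mean(Y * Kh)            # internal estimator (1.2) with f = 1
    vals[r] = np.sqrt(n * h) * (m_hat - m(x0))

# E(Y^2 | X = x0) = m(x0)^2 + s^2; int K^2 du = 1 / (2 sqrt(pi))
sigma2 = (m(x0) ** 2 + s ** 2) / (2.0 * np.sqrt(np.pi))
print(np.var(vals), sigma2)            # empirical vs theoretical variance
```

The printed empirical variance should agree with \(\sigma^{2}(x_{0})\) up to Monte Carlo error.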
Theorem 3.2
Let the conditions of Theorem 3.1 be fulfilled, where Assumption 2.3 is replaced by Assumption 2.3∗. Then (3.1) holds.
Remark 3.1
A suitable positive bandwidth h is easy to construct. For example, with \(d\geq1\), if \(h=n^{-\beta}\) with \(\beta\in (\frac{1}{d+4},\frac{1}{d})\), then the conditions \(0< h^{d}\rightarrow0\), \(nh^{d}\rightarrow\infty\), and \(nh^{d+4}\rightarrow0\) are satisfied as \(n\rightarrow\infty\).
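The bandwidth conditions of Remark 3.1 can be checked numerically (a sketch with the assumed choices \(d=2\) and \(\beta=1/4\in(\frac{1}{6},\frac{1}{2})\)):

```python
# With h = n**(-beta) and beta in (1/(d+4), 1/d):
#   h**d = n**(-beta*d)       -> 0,
#   n*h**d = n**(1-beta*d)    -> infinity,
#   n*h**(d+4) = n**(1-beta*(d+4)) -> 0.
d, beta = 2, 1.0 / 4.0
for n in [10**3, 10**6, 10**9]:
    h = n ** (-beta)
    print(n, h**d, n * h**d, n * h**(d + 4))
```

For \(d=2\), \(\beta=1/4\) this gives \(h^{d}=n^{-1/2}\), \(nh^{d}=n^{1/2}\), and \(nh^{d+4}=n^{-1/2}\), confirming all three conditions.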
4 Conclusion
Linton and Jacho-Chávez [6] obtained asymptotic normality results for the internal estimator \(\widetilde{m}_{n}(x)\) under independent data. Compared with Theorem 1 and Corollary 1 of Linton and Jacho-Chávez [6], our asymptotic normality results for the internal estimator \(\widehat{m}_{n}(x)\) in Theorems 3.1 and 3.2 are relatively simple. Meanwhile, we use Bernstein's big-block and small-block method and some inequalities for φ-mixing random variables to investigate the asymptotic normality of the internal estimator \(\widehat{m}_{n}(x)\) for \(m(x)\) and obtain the asymptotic normality result (3.1). Although α-mixing is weaker than φ-mixing, some moment inequalities for α-mixing are more complicated than those for φ-mixing [16, 17]. For simplicity, we study the asymptotic normality of the internal estimator \(\widehat{m}_{n}(x)\) under φ-mixing and obtain the asymptotic normality result of Theorem 3.2.
5 Some lemmas and the proofs of main results
Lemma 5.1
(Liptser and Shiryayev [18], Theorem 9 in Sect. 5)
Let \((\xi_{nk},\mathscr{H}_{k}^{n})_{k\geq1}\) be martingale differences (i.e. \(\mathscr{H}_{0}^{n}=\{\emptyset,\Omega\}\), \(\mathscr{H}_{k}^{n} \subset\mathscr{H}_{k+1}^{n}\), \(\xi_{nk}\) is an \(\mathscr{H}_{k}^{n}\)-measurable random variable, \(E(\xi_{nk}|\mathscr{H}_{k-1}^{n})=0\) a.s., for all \(k\geq1\) and \(n\geq1\)) with \(E\xi_{nk}^{2}<\infty\) for all \(k\geq1\) and \(n\geq1\). Let \((\gamma _{n})_{n\geq1}\) be a sequence of Markov times with respect to \((\mathscr {H}_{k}^{n})_{k\geq0}\), taking values in the set \(\{0,1,2,\ldots\}\). If
then
Lemma 5.2
(Billingsley [10], Lemma 1)
If ξ is measurable with respect to \(\mathscr{M}^{k}_{-\infty}\) and η is measurable with respect to \(\mathscr{M}^{\infty}_{k+n}\) (\(n\geq0\)), then
implies
Lemma 5.3
(Yang [16], Lemma 2)
Let \(p\geq2\), and let \(\{X_{n}\}_{n\geq1}\) be a φ-mixing sequence with \(\sum_{n=1}^{\infty}\varphi^{1/2}(n)<\infty\). If \(EX_{n}=0\) and \(E|X_{n}|^{p}<\infty\) for all \(n\geq1\), then
where C is a positive constant depending only on \(\varphi(\cdot)\).
Lemma 5.4
(Fan and Yao [13], Proposition 2.6)
Let \(\mathscr{F}_{i}^{j}\) and \(\alpha(\cdot)\) be the same as in (2.57) of Fan and Yao [13]. Let \(\xi_{1},\xi_{2},\ldots,\xi_{k}\) be complex-valued random variables measurable with respect to the σ-algebras \(\mathscr{F}_{i_{1}}^{j_{1}},\ldots,\mathscr{F}_{i_{k}}^{j_{k}}\), respectively. Suppose \(i_{l+1}-j_{l}\geq n\) for \(l=1,\ldots,k-1\) and \(j_{l}\geq i_{l}\) and \(P(|\xi_{l}|\leq1)=1\) for \(l=1,2,\ldots,k\). Then
Proof of Theorem 3.1
It is easy to see that
Combining Assumption 2.2 with the proof of Lemma 2 of Shen and Xie [7], we obtain that
Then, it follows from \(nh^{d+4}\rightarrow0\) that
For \(x\in S_{f}\), let \(Z_{i}:=\sqrt{h^{d}}\frac{Y_{i}K_{h}(x-X_{i})}{f(X_{i})}\), \(1\leq i\leq n\). Denote
To prove (3.1), we apply (5.1)–(5.3) and have to show that
where \(\sigma^{2}(x)\) is defined by (3.1).
Since \(\{(X_{i},Y_{i}), i\geq1\}\) is an independent and identically distributed stochastic sequence, combining this with Lemma 5.1, to prove (5.4), we have to show that
and, for all \(\lambda\in(0,1]\),
Obviously, for any \(1\leq r\leq2+\delta\) (\(0<\delta\leq1\)), by (2.1) and (2.3) we have
By (5.7) with \(r=1\), this yields
Define
In view of condition (2.3), we have
So we have \(g(x)\in L_{1}\). Since \(\frac{E(Y_{1}^{2}|X_{1}=x)}{f(x)}\) is positive and continuous at the point \(x\in S_{f}\) and \(\lim_{\|u\|\rightarrow\infty}\|u\|^{d}K(u)=0\), we obtain by the Bochner lemma [14] that
Then, it follows from (5.8) and (5.9) that, for \(x\in S_{f}\),
which implies (5.5). Meanwhile, for some \(\delta\in(0,1]\) and any \(\lambda\in(0,1]\), by the \(C_{r}\) inequality and (5.7) we get that
since \(nh^{d}\rightarrow\infty\). Thus, (5.6) follows from (5.11). Consequently, the proof of the theorem is completed. □
Proof of Theorem 3.2
We use the same notation as in the proof of Theorem 3.1. Under the conditions of Theorem 3.2, by (5.1), (5.2), and (5.3), to prove (3.1), we need to show that
where \(\sigma^{2}(x)\) is defined by (3.1). By the second-order stationarity, \(\{(X_{i},Y_{i}),i\geq1\}\) are identically distributed. Then, for \(1\leq i\leq n\), we have by (5.8) and (5.9) that
For \(j\geq1\), in view of (2.4), we have
So it follows from (5.7) and (5.14) that
Obviously, by the stationarity we establish that
Since \(h^{d}\rightarrow0\), we can choose \(r_{n}\) satisfying \(r_{n}\rightarrow\infty\) and \(h^{d}r_{n}\rightarrow0\) as \(n\rightarrow\infty\). So, by (5.15),
By Lemma 5.2 with \(s=r=2\), the condition \(\sum_{n=1}^{\infty}\varphi^{1/2}(n)<\infty\), and (5.9), we can show that
Therefore, by (5.13), (5.16), (5.17), and (5.18), we get that
Next, we employ Bernstein’s big-block and small-block procedure (see Fan and Yao [13] and Masry [19]). Partition the set \(\{1,2,\ldots,n\}\) into \(2k_{n}+1\) subsets with large blocks of size \(\mu=\mu_{n}\) and small blocks of size \(\nu=\nu_{n}\), and set
Define \(\mu=\mu_{n}=\lfloor\sqrt{\frac{n}{h^{d}}}\rfloor\) and \(\nu=\nu_{n}=\lfloor\sqrt{nh^{d}}\rfloor\). So we have by \(h^{d}\rightarrow0\) and \(nh^{d}\rightarrow\infty\) that
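These block sizes can be visualized numerically (our sketch, assuming \(d=1\) and the admissible bandwidth \(h=n^{-1/3}\)): the small blocks are asymptotically negligible (\(\nu_{n}/\mu_{n}\rightarrow0\)) and the big blocks cover almost all of \(\{1,\ldots,n\}\) (\(k_{n}\mu_{n}/n\rightarrow1\)).

```python
import math

# Big blocks: mu_n = floor(sqrt(n / h**d)); small blocks: nu_n = floor(sqrt(n * h**d));
# number of big/small block pairs: k_n = floor(n / (mu_n + nu_n)).
d = 1
for n in [10**4, 10**6, 10**8]:
    h = n ** (-1.0 / 3.0)
    mu = math.floor(math.sqrt(n / h**d))
    nu = math.floor(math.sqrt(n * h**d))
    k = n // (mu + nu)
    print(n, mu, nu, nu / mu, k * mu / n)
```

With \(h=n^{-1/3}\) one gets \(\mu_{n}\approx n^{2/3}\) and \(\nu_{n}\approx n^{1/3}\), so \(\nu_{n}/\mu_{n}\approx n^{-1/3}\rightarrow0\) while \(k_{n}\mu_{n}/n\rightarrow1\).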
Define \(\eta_{j}\), \(\xi_{j}\), and \(\zeta_{j}\) as follows:
In view of
we have to show that
Relation (5.25) implies that \(\frac{S_{n}^{\prime\prime}}{\sqrt{n}}\) and \(\frac{S_{n}^{\prime\prime\prime}}{\sqrt{n}}\) are asymptotically negligible, (5.26) shows that the summands \(\{\eta_{j}\}\) in \(S_{n}^{\prime}\) are asymptotically independent, and (5.27)–(5.28) are the standard Lindeberg–Feller conditions for the asymptotic normality of \(S_{n}^{\prime}\) under independence.
First, we prove (5.25). By (5.22) and (5.24) we have
By the stationarity and (5.10), similarly to the proof of (5.17) and (5.18), for \(0\leq j\leq k-1\), we have
Thus it follows from (5.19) and (5.20) that
We consider the term \(F_{2}\) in (5.29). With \(\lambda_{j}=j(\mu_{n}+\nu_{n})+\mu_{n}\),
Since \(i\neq j\), we have \(|\lambda_{i}-\lambda_{j}+l_{1}-l_{2}|\geq\mu_{n}\) for \(0\leq i< j\leq k-1\), \(1\leq l_{1}\leq\nu_{n}\), and \(1\leq l_{2}\leq\nu_{n}\). Then, similarly to the proof of (5.18), it follows that
Hence by (5.29), (5.31), and (5.32) we have
By (5.13), (5.20), and (5.23), similarly to the proofs of (5.17) and (5.18), we can find that
Thus
Second, it is easy to see that \(\varphi^{1/2}(n)=o(\frac{1}{n})\) by \(\varphi(n)\downarrow0\) and \(\sum_{n=1}^{\infty}\varphi^{1/2}(n)<\infty\). Note that \(\eta_{a}\) is \(\mathscr{M}_{i_{a}}^{j_{a}}\)-measurable with \(i_{a}=a(\mu+\nu)+1\) and \(j_{a}=a(\mu+\nu)+\mu\). Since φ-mixing random variables are strong mixing random variables and \(\alpha(n)\leq\varphi(n)\), letting \(V_{j}=\exp(itn^{-1/2}\eta_{j})\), by Lemma 5.4 we have
by (5.19), (5.20), and the conditions \(h_{n}\rightarrow0\) and \(nh^{d}\rightarrow\infty\) as \(n\rightarrow\infty\).
Third, we show (5.27), where \(\eta_{j}\) is defined in (5.21). By the stationarity and (5.30) with \(\mu_{n}\) replacing \(\nu_{n}\), we have
so that
since \(k_{n}\mu_{n}/n\rightarrow1\).
Fourth, it is time to establish (5.28). Obviously, by (5.7) we obtain that
We can see that \(\frac{1}{h^{d}\mu_{n}}\leq\frac{c}{h^{d}\sqrt{n/h^{d}}}=\frac{c}{(nh^{d})^{1/2}}\rightarrow0\) since \(nh^{d}\rightarrow\infty\) as \(n\rightarrow\infty\). Therefore, by Lemma 5.3 with \(\sum_{n=1}^{\infty}\varphi^{1/2}(n)<\infty\) we have that
Then, for all \(\varepsilon>0\), by (5.34) and (5.35) it is easy to see that
Similarly, for \(0\leq j\leq k-1\), we get that
Therefore, since \(0<\delta\leq1\) and \(nh^{d}\rightarrow\infty\), we obtain that, for all \(\varepsilon>0\),
Therefore, (5.26), (5.27), and (5.28) hold for \(S_{n}^{\prime}\), so that
Consequently, (5.12) follows from (5.33) and (5.36). Finally, by (5.1), (5.2), and (5.12) we obtain (3.1). The proof of the theorem is completed. □
References
Nadaraya, E.A.: On estimating regression. Theory Probab. Appl. 9, 141–142 (1964)
Watson, G.S.: Smooth regression analysis. Sankhya, Ser. A 26, 359–372 (1964)
Jones, M.C., Davies, S.J., Park, B.U.: Versions of kernel-type regression estimators. J. Am. Stat. Assoc. 89, 825–832 (1994)
Mack, Y.P., Müller, H.G.: Derivative estimation in nonparametric regression with random predictor variable. Sankhya 51, 59–72 (1989)
Linton, O., Nielsen, J.: A kernel method of estimating structured nonparametric regression based on marginal integration. Biometrika 82, 93–100 (1995)
Linton, O., Jacho-Chávez, D.: On internally corrected and symmetrized kernel estimators for nonparametric regression. Test 19, 166–186 (2010)
Shen, J., Xie, Y.: Strong consistency of the internal estimator of nonparametric regression with dependent data. Stat. Probab. Lett. 83, 1915–1925 (2013)
Li, X.Q., Yang, W.Z., Hu, S.H.: Uniform convergence of estimator for nonparametric regression with dependent data. J. Inequal. Appl. 2016, 142 (2016)
Dobrushin, R.L.: The central limit theorem for non-stationary Markov chain. Theory Probab. Appl. 1, 72–88 (1956)
Billingsley, P.: Convergence of Probability Measures. Wiley, New York (1968)
Györfi, L., Härdle, W., Sarda, P., Vieu, P.: Nonparametric Curve Estimation from Time Series. Springer, Berlin (1989)
Györfi, L., Kohler, M., Krzyżak, A., Walk, H.: A Distribution-Free Theory of Nonparametric Regression. Springer, New York (2002)
Fan, J.Q., Yao, Q.W.: Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York (2003)
Bosq, D., Blanke, D.: Inference and Prediction in Large Dimensions. Wiley, Chichester (2007)
Hansen, B.E.: Uniform convergence rates for kernel estimation with dependent data. Econom. Theory 24, 726–748 (2008)
Yang, S.C.: Almost sure convergence of weighted sums of mixing sequences. J. Syst. Sci. Math. Sci. 15, 254–265 (1995)
Yang, S.C.: Maximal moment inequality for partial sums of strong mixing sequences and application. Acta Math. Sin. Engl. Ser. 23, 1013–1024 (2007)
Liptser, R.S., Shiryayev, A.N.: Theory of Martingales. Kluwer Academic, Dordrecht (1989)
Masry, E.: Nonparametric regression estimation for dependent functional data: asymptotic normality. Stoch. Process. Appl. 115, 155–177 (2005)
Funding
This work is supported by National Natural Science Foundation of China (11501005, 11701004, 61403115), Common Key Technology Innovation Special of Key Industries (cstc2017zdcy-zdyf0252), Artificial Intelligence Technology Innovation Significant Theme Special Project (cstc2017rgzn-zdyf0073, cstc2017rgzn-zdyf0033), Natural Science Foundation of Chongqing (cstc2018jcyjA0607), Natural Science Foundation of Anhui (1808085QA03, 1808085QF212, 1808085QA17) and Provincial Natural Science Research Project of Anhui Colleges (KJ2016A027, KJ2017A027).
Author information
Contributions
All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Li, P., Li, X. & Chen, L. The asymptotic normality of internal estimator for nonparametric regression. J Inequal Appl 2018, 231 (2018). https://doi.org/10.1186/s13660-018-1832-6