Nonlinear wavelet density estimation for biased data in Sobolev spaces
Journal of Inequalities and Applications volume 2013, Article number: 308 (2013)
Abstract
In this paper, we consider the density estimation problem from independent and identically distributed (i.i.d.) biased observations. We develop an adaptive wavelet hard thresholding rule and evaluate its performance by considering the risk over Sobolev balls. We prove that our estimator attains a sharp rate of convergence and discuss its optimality.
MSC: 49K40, 90C29, 90C31.
1 Introduction
In practice, it is often impossible to draw a direct sample from a random variable X. In this paper, we consider the problem of estimating the density function $f_X$ of X without directly observing an i.i.d. sample $X_1, X_2, \ldots, X_n$. Instead, we observe samples $Y_1, Y_2, \ldots, Y_n$ from biased data with the following density function:
$$f_Y(y) = \frac{g(y)\, f_X(y)}{\mu}, \qquad (1.1)$$
where $g$ is the so-called weight or bias function and $\mu = E\,g(X) < \infty$. The purpose of this paper is to estimate the density function $f_X$ from the samples $Y_1, Y_2, \ldots, Y_n$.
Several examples of such biased data can be found in the literature. For instance, in [1] the distribution of the blood alcohol concentration of intoxicated drivers is of interest; since a drunken driver has a larger chance of being arrested, the collected data are size-biased.
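The biased-sampling mechanism can be illustrated with a short simulation. The following Python sketch is not part of the paper. It assumes a target variable X that is uniform on [0, 1] and a hypothetical weight function, called `weight` in the code, equal to 0.5 + x, so that the normalizing constant μ equals 1. Biased observations are drawn by rejection sampling, and μ and E X are then recovered from the biased data alone, using the fact that reweighting each observation by 1/weight undoes the bias. All function names are illustrative.

```python
import random


def sample_size_biased(draw_x, weight, weight_max, n, rng):
    """Draw n observations whose density is proportional to weight(x) * f(x).

    Rejection sampling: a draw x from the target density f is accepted
    with probability weight(x) / weight_max.
    """
    out = []
    while len(out) < n:
        x = draw_x(rng)
        if rng.random() * weight_max <= weight(x):
            out.append(x)
    return out


def estimate_mu(ys, weight):
    """Harmonic-mean estimate of mu, based on E[1 / weight(Y)] = 1 / mu."""
    return len(ys) / sum(1.0 / weight(y) for y in ys)


def debiased_mean(ys, weight):
    """Estimate E[X] from biased data via E[X] = mu * E[Y / weight(Y)]."""
    mu_hat = estimate_mu(ys, weight)
    return mu_hat * sum(y / weight(y) for y in ys) / len(ys)


def weight(x):
    # Hypothetical weight, bounded away from 0 and infinity on [0, 1].
    return 0.5 + x


rng = random.Random(0)
ys = sample_size_biased(lambda r: r.random(), weight, 1.5, 20000, rng)

naive_mean = sum(ys) / len(ys)          # biased upwards (theory: 7/12)
mu_hat = estimate_mu(ys, weight)        # theory: 1
corrected_mean = debiased_mean(ys, weight)  # theory: 1/2
```

In this toy setting the naive sample mean of the biased data concentrates around 7/12 rather than the true mean 1/2, while the reweighted mean removes the bias without knowing μ in advance.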
The density estimation problem for biased data (1.1) has been discussed in several papers. In 1982, Vardi [2] considered the nonparametric maximum likelihood estimation for . In 1991, Jones [3] discussed the mean squared error properties of the kernel density estimation. In 2004, Efromovich [4] developed the Efromovich-Pinsker adaptive Fourier estimator. It was based on a blockwise shrinkage algorithm and achieved the minimax rate of convergence under the risk over a Besov class .
In 2010, Ramírez and Vidakovic [5] proposed a linear wavelet estimator and discussed the consistency of a function in under the mean integrated squared error (MISE) sense. The wavelet estimator in [5], however, contains the unknown parameter μ. In the same year, Christophe [6] constructed a nonlinear wavelet estimator and evaluated the risk in the Besov space . However, Sobolev spaces () except are not special cases of the Besov space .
In this paper, we consider the nonlinear hard thresholding wavelet density estimation for biased data in Sobolev spaces (). We mainly give the upper bound of minimax rate of convergence under the risk without particular restriction on the parameters r and p, and the convergence rate is optimal.
2 Preliminaries
In this section, we shall recall some well-known concepts and lemmas.
2.1 Wavelets
In this paper, we always assume that the scaling function φ is orthonormal, compactly supported and regular.
Definition 2.1 The scaling function is called m regular if has continuous derivatives of order m and its corresponding wavelet has vanishing moments of order m, i.e., , .
The following conditions about the scaling function φ and the kernel function will be very useful in the third section.
Condition (θ)
The function is such that .
Condition
There exists an integrable function such that for any , , where .
Condition
Condition is satisfied and , , .
For any , , denoted by , , then for any , where , we have the following equation [7]:
where
2.2 Sobolev space
The Sobolev space () is defined by , which is equipped with the norm . The Sobolev balls are defined as follows:
Between a Sobolev space and a Besov space, the following embedding conclusions are established.
Lemma 2.1 [8]
Let , , then
(i) , ;
(ii) , , ,
where $A \hookrightarrow B$ denotes that the Banach space A is continuously embedded in the Banach space B, i.e., there exists a constant $c > 0$ such that for any $u \in A$, we have $\|u\|_B \le c\,\|u\|_A$.
2.3 Auxiliary lemmas
The following lemmas given by [9] will be used in the next section.
Lemma 2.2 If the scaling function φ satisfies Condition (θ), then for any sequence satisfying , we have , where , , , .
Lemma 2.3 For some integer , if the kernel function satisfies Conditions and , , where , , then we have , where .
Lemma 2.4 (Rosenthal inequality)
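Let $p \ge 2$ and let $X_1, \ldots, X_n$ be independent random variables such that $E X_i = 0$ and $E|X_i|^p < \infty$. Then there exists a constant $C(p) > 0$ such that
$$E\Bigl|\sum_{i=1}^{n} X_i\Bigr|^p \le C(p)\Bigl(\sum_{i=1}^{n} E|X_i|^p + \Bigl(\sum_{i=1}^{n} E X_i^2\Bigr)^{p/2}\Bigr).$$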
Lemma 2.5 (Bernstein inequality)
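Let $X_1, \ldots, X_n$ be independent random variables such that $E X_i = 0$, $E X_i^2 \le \sigma^2$ and $|X_i| \le M < \infty$. Then, for any $\lambda > 0$,
$$P\Bigl(\Bigl|\frac{1}{n}\sum_{i=1}^{n} X_i\Bigr| > \lambda\Bigr) \le 2\exp\Bigl(-\frac{n\lambda^2}{2(\sigma^2 + M\lambda/3)}\Bigr).$$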
Remark In this paper, we often use the notation $A \lesssim B$ to indicate that $A \le cB$ with a positive constant c, which is independent of A and B. If $A \lesssim B$ and $B \lesssim A$, we write $A \sim B$.
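Bernstein's inequality (Lemma 2.5) lends itself to a quick numerical sanity check. The sketch below is illustrative only. It assumes the classical form of the bound, P(|n^{-1} Σ X_i| > λ) ≤ 2 exp(−nλ² / (2(σ² + Mλ/3))), for independent centred variables with E X_i² ≤ σ² and |X_i| ≤ M. The choice of uniform variables on [−1, 1] (so σ² = 1/3 and M = 1) and all function names are assumptions made for the example.

```python
import math
import random


def bernstein_bound(n, lam, sigma2, m):
    """Classical Bernstein tail bound for the mean of n bounded, centred variables."""
    return 2.0 * math.exp(-n * lam ** 2 / (2.0 * (sigma2 + m * lam / 3.0)))


def empirical_tail(n, lam, trials, rng):
    """Monte Carlo frequency of the event |mean of n Uniform(-1, 1) draws| > lam."""
    hits = 0
    for _ in range(trials):
        mean = sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n
        if abs(mean) > lam:
            hits += 1
    return hits / trials


rng = random.Random(0)
n, lam = 200, 0.15
bound = bernstein_bound(n, lam, sigma2=1.0 / 3.0, m=1.0)
freq = empirical_tail(n, lam, trials=2000, rng=rng)
```

The observed frequency `freq` should not exceed `bound`. The bound is typically far from tight, which is harmless in the proofs because only its exponential decay in n is used.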
3 Main results
In this paper, our hard thresholding wavelet density estimator is defined as follows:
where
The hard thresholding wavelet coefficients are , where
Suppose that the parameters , , λ of the wavelet thresholding estimator (3.1) satisfy the assumptions:
where c is a suitably chosen positive constant.
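A minimal sketch of a hard thresholding wavelet estimator for biased data is given below. It is an illustration under several assumptions and does not reproduce the estimator (3.1) verbatim. It uses the Haar basis on [0, 1) for simplicity (Haar is orthonormal and compactly supported but not regular in the sense of Definition 2.1), estimates μ by the harmonic mean of the weights, forms empirical coefficients by reweighting each observation by 1/weight, and applies a level-dependent threshold of the assumed form c·sqrt(j/n). The names `fit`, `evaluate`, `weight`, `j0`, `j1` and `c` are hypothetical.

```python
import math
import random


def haar_phi(x):
    """Haar scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0


def haar_psi(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def dilate(fn, j, k, x):
    """fn_{j,k}(x) = 2^{j/2} * fn(2^j * x - k)."""
    return 2.0 ** (j / 2.0) * fn(2.0 ** j * x - k)


def fit(ys, weight, j0, j1, c):
    """Return (j0, alpha, beta): scaling coefficients at level j0 and the
    detail coefficients at levels j0..j1 that survive hard thresholding."""
    n = len(ys)
    inv_w = [1.0 / weight(y) for y in ys]
    mu_hat = n / sum(inv_w)  # harmonic-mean estimate of mu

    def coeff(fn, j, k):
        total = sum(dilate(fn, j, k, y) * iw for y, iw in zip(ys, inv_w))
        return mu_hat * total / n

    alpha = {k: coeff(haar_phi, j0, k) for k in range(2 ** j0)}
    beta = {}
    for j in range(j0, j1 + 1):
        lam = c * math.sqrt(j / n)  # assumed threshold form
        for k in range(2 ** j):
            b = coeff(haar_psi, j, k)
            if abs(b) > lam:        # hard thresholding: keep or kill
                beta[(j, k)] = b
    return j0, alpha, beta


def evaluate(model, x):
    """Evaluate the fitted density estimate at a point x in [0, 1)."""
    j0, alpha, beta = model
    val = sum(a * dilate(haar_phi, j0, k, x) for k, a in alpha.items())
    val += sum(b * dilate(haar_psi, j, k, x) for (j, k), b in beta.items())
    return val


def weight(x):
    # Hypothetical weight function, bounded between 0.5 and 1.5 on [0, 1).
    return 0.5 + x


# Toy data: X has density 2x on [0, 1); biased draws obtained by rejection.
rng = random.Random(1)
ys = []
while len(ys) < 4000:
    x = math.sqrt(rng.random())
    if rng.random() * 1.5 <= weight(x):
        ys.append(x)

model = fit(ys, weight, j0=2, j1=4, c=1.0)
grid = [(i + 0.5) / 64 for i in range(64)]
integral = sum(evaluate(model, x) for x in grid) / 64
```

With the harmonic-mean estimate of μ, the scaling part of the estimate integrates to one by construction, and the thresholded detail terms integrate to zero, so `integral` equals 1 up to rounding. The estimate itself need not be non-negative everywhere.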
Lemma 3.1 Suppose that there exist two constants and such that for . Let , be the coefficients in the expansion (2.1) and let , be defined by estimator in (3.1). If , then for any , we have
(i) ;
(ii) .
Proof (i) From the definition of and the triangle inequality, we have
Since , we have
and
Furthermore, a Sobolev space and a Besov space have the following embedding theorem, , for any integer , then we have . Therefore, by the convexity inequality, we get
where , .
The term is estimated as follows. Firstly, let , we can see that they are i.i.d., and . Moreover, for any ,
where
and
So, we have
Since , we obtain
By Rosenthal’s inequality, we have
To estimate the term , let . We can compute easily, and for any , .
If , i.e., , using Rosenthal’s inequality, we have
If , we get
By (3.5), (3.6) and (3.7), we obtain
(ii) The proof is similar to that of (i), so we omit it. □
Lemma 3.2 If , then for any , there exists a constant such that
Proof We can easily get
Therefore,
where , . So, we get
where , .
Now, we estimate . Clearly, , and
Furthermore, we have
By Bernstein’s inequality, we obtain
Since , then
Taking such that , then
Next, we estimate . We compute that , i.e.,
and
By Bernstein’s inequality, we obtain
Since , then
Taking such that , we have
Taking , by (3.9) and (3.10), we have
 □
Lemma 3.3 Suppose that there exist two constants and such that , for , and , are given by (3.1). Then
where , are constants.
Proof By Lemma 2.2, we obtain
Furthermore, since , we have
Note that
and if , , we get , i.e., ; therefore, we have
where
(i) Firstly, we estimate
By Lemma 3.1, we have
Using and Jensen’s inequality, we obtain
By and , we have
Using Lemma 3.1 and (3.4), we obtain
where , are constants.
(ii) For
let . By (), we have
Since , then . Taking , , we have
Note that if and only if . When , i.e., , we can compute . Using (3.2), (3.3), we obtain
(iii) Finally, we estimate
Let , and . Using Jensen’s inequality and Hölder’s inequality, we have
By Lemma 3.1 and Lemma 3.2, we obtain
Taking large enough ω such that , we get
Taking as in (3.2), we have
Putting (3.11), (3.12) and (3.13) together, we can obtain
where , are constants. □
Theorem 3.4 Let the scaling function be orthonormal, compactly supported and regular. There exist two positive constants and such that , . If is the nonlinear wavelet estimator in (3.1), and assumptions (3.2), (3.3) and (3.4) are satisfied, then for any , where , , we have
where , , are constants.
Proof By the definition of in (3.1) and the expansion of in (2.1), one has
Then
where
Firstly, we estimate
By Lemma 2.2 and Jensen’s inequality,
Since and are compactly supported, then the number of elements in is . By Lemma 3.1, we have .
Therefore
Using (3.2), we have
where .
Next, we estimate
In [9], it is shown that if the scaling function is orthonormal, compactly supported and regular, then the associated kernel function satisfies Conditions and , and .
Since a Sobolev space and a Besov space have the following embedding theorem: , where , then . By Lemma 2.3, we have
Taking as in (3.3), we have
Finally, we estimate
Using Lemma 3.3, we obtain
By (3.14), (3.15) and (3.16), we obtain
 □
4 Optimality
Now, we discuss the optimality of the rates of convergence. Using techniques similar to those in [10], we can obtain the following lower bound theorem.
Theorem 4.1 Let the scaling function be orthonormal, compactly supported and regular, . If there exist two positive constants and such that , , then for any estimator , we have
where , .
Remark The proof is very similar to that in [10], in which the author studied the lower bound of the convergence rates in Besov spaces for samples without bias.
According to Theorem 4.1, we can see that:
(i) When , our nonlinear estimator attains the optimal rate.
(ii) When , our convergence rate differs from the optimal rate of convergence by a logarithmic factor, so it is sub-optimal.
(iii) When , the logarithmic factor is an extra penalty for the chosen wavelet thresholding, and our convergence rate is sub-optimal.
References
Efromovich S: Nonparametric Curve Estimation: Methods, Theory, and Applications. Springer Series in Statistics. Springer, New York; 1999.
Vardi Y: Nonparametric estimation in the presence of length bias. Ann. Stat. 1982, 10(2):616–620.
Jones MC: Kernel density estimation for length-biased data. Biometrika 1991, 78(3):511–519.
Efromovich S: Density estimation for biased data. Ann. Stat. 2004, 32: 1137–1161.
Ramirez P, Vidakovic B: Wavelet density estimation for stratified size-biased sample. J. Stat. Plan. Inference 2010, 140(2):419–432.
Christophe C: Wavelet block thresholding for density estimation in the presence of bias. J. Korean Stat. Soc. 2010, 39: 43–53.
Kelly C, Kon MA, Raphael LA: Local convergence for wavelet expansions. J. Funct. Anal. 1994, 126: 102–138.
Triebel H: Theory of Function Spaces. Birkhäuser, Basel; 1983.
Härdle W, Kerkyacharian G, Picard D, Tsybakov A: Wavelets, Approximation and Statistical Applications. Springer, Berlin; 1997.
Wang HY: Convergence rates of density estimation in Besov spaces. Appl. Math. 2011, 2(10):1258–1262.
Acknowledgements
This paper is supported by the National Natural Science Foundation of China (No. 11271038) and Foundation of BJUT (No. 006000542213501).
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
WJR participated in the sequence alignment and drafted the manuscript. WM participated in the design of the study and performed the statistical analysis. ZY conceived of the study and participated in its design and coordination. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Wang, J., Wang, M. & Zhou, Y. Nonlinear wavelet density estimation for biased data in Sobolev spaces. J Inequal Appl 2013, 308 (2013). https://doi.org/10.1186/1029-242X-2013-308