Wavelet density estimation for mixing and size-biased data
Journal of Inequalities and Applications volume 2018, Article number: 189 (2018)
Abstract
This paper considers wavelet estimation for a multivariate density function based on mixing and size-biased data. We provide upper bounds for the mean integrated squared error (MISE) of wavelet estimators. It turns out that our results reduce to the corresponding theorem of Shirazi and Doosti (Stat. Methodol. 27:12–19, 2015), when the random sample is independent.
1 Introduction
Let \(\{Y_{i}, i\in\mathbb{Z}\}\) be a strictly stationary random process defined on a probability space \((\Omega, \mathcal{F},P)\) with the common density function
$$f_{Y_{1}}(y)=\frac{\omega(y)f(y)}{\mu},\quad y\in\mathbb{R}^{d}, \qquad(1)$$
where ω denotes a known positive function, f stands for an unknown density function of the unobserved random variable X and \(\mu =E \omega(X)=\int_{\mathbb{R}^{d}}\omega(y)f(y)\, dy<+\infty\). We want to estimate the unknown density function f from a sequence of strong mixing data \(Y_{1}, Y_{2}, \ldots, Y_{n}\).
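To fix ideas, here is a minimal simulation sketch of model (1) in dimension \(d=1\). All concrete choices below (the density Beta(2,4), the weight \(\omega(y)=y+0.5\), and the function names) are illustrative assumptions, not from the paper; the draws are i.i.d. for simplicity, whereas the paper allows strong mixing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not from the paper): the unknown density f is
# Beta(2, 4) on [0, 1] and the known weight is w(y) = y + 0.5, so that
# 0.5 <= w <= 1.5, in line with assumption A1 of Sect. 1.2.
def w(y):
    return y + 0.5

def sample_size_biased(n, c2=1.5):
    """Draw from the size-biased density w * f / mu by rejection:
    accept a draw x from f with probability w(x) / c2."""
    out = []
    while len(out) < n:
        x = rng.beta(2.0, 4.0, size=n)
        keep = rng.uniform(0.0, c2, size=n) < w(x)
        out.extend(x[keep])
    return np.asarray(out[:n])

Y = sample_size_biased(100_000)

# Since E[1/w(Y)] = 1/mu, the harmonic mean of w(Y) estimates mu; this is
# exactly the normalization the estimators of Sect. 1.2 rely on.
print("estimated mu:", 1.0 / np.mean(1.0 / w(Y)))
print("true mu     :", 1.0 / 3.0 + 0.5)   # E[w(X)] = E[X] + 0.5, E[X] = 1/3
```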
When \(Y_{1}, Y_{2}, \ldots, Y_{n}\) are independent and \(d=1\), Ramírez and Vidakovic [13] propose a linear wavelet estimator and show it to be \(L^{2}\) consistent; Chesneau [1] considers the optimal convergence rate of a wavelet block thresholding estimator; Shirazi and Doosti [16] extend Ramírez and Vidakovic’s [13] work to \(d\geq1\). Chesneau et al. [2] relax the independence assumption to both positively and negatively associated samples and show a convergence rate for the mean integrated squared error (MISE). An upper bound for wavelet estimation on \(L^{p}\) (\(1\leq p<+\infty\)) risk in the negatively associated case is given by Liu and Xu [9].
This paper deals with the d-dimensional density estimation problem (1) when \(Y_{1}, Y_{2}, \ldots, Y_{n}\) are strong mixing. We give upper bounds for the mean integrated squared error (MISE) of wavelet estimators. It turns out that our linear result reduces to Shirazi and Doosti’s [16] theorem when the random sample is independent.
1.1 Wavelets and Besov spaces
As a central notion in wavelet analysis, a multiresolution analysis (MRA, Meyer [11]) plays an important role in constructing wavelet bases. An MRA means a sequence of closed subspaces \(\{V_{j}\}_{j\in \mathbb{Z}}\) of the square integrable function space \(L^{2}(\mathbb {R}^{d})\) satisfying the following properties:
- (i) \(V_{j}\subseteq V_{j+1}\), \(j\in\mathbb{Z}\). Here and afterwards, \(\mathbb{Z}\) denotes the set of integers and \(\mathbb{N}:=\{n\in\mathbb{Z}, n\geq0\}\);
- (ii) \(\overline{\bigcup_{j\in\mathbb{Z}} V_{j}}=L^{2}(\mathbb{R}^{d})\), i.e., \(\bigcup_{j\in\mathbb{Z}} V_{j}\) is dense in \(L^{2}(\mathbb{R}^{d})\);
- (iii) \(f(2\cdot)\in V_{j+1}\) if and only if \(f(\cdot)\in V_{j}\) for each \(j\in\mathbb{Z}\);
- (iv) there exists a scaling function \(\varphi\in L^{2}(\mathbb{R}^{d})\) such that \(\{\varphi(\cdot-k),k\in\mathbb{Z}^{d}\}\) forms an orthonormal basis of \(V_{0}=\overline{\operatorname{span}}\{\varphi(\cdot-k)\}\).
When \(d=1\), there is a simple way to define an orthonormal wavelet basis. Examples include the Daubechies wavelets with compact supports. For \(d\geq2\), the tensor product method gives an MRA \(\{V_{j}\}\) of \(L^{2}(\mathbb{R}^{d})\) from a one-dimensional MRA. In fact, with a tensor-product scaling function φ, we find \(M=2^{d}-1\) wavelet functions \(\psi^{\ell}\) (\(\ell=1,2,\ldots,M\)) such that, for each \(f\in L^{2}(\mathbb{R}^{d})\), the following decomposition
$$f=\sum_{k\in\mathbb{Z}^{d}}\alpha_{j_{0},k}\varphi_{j_{0},k}+\sum_{j\geq j_{0}}\sum_{\ell=1}^{M}\sum_{k\in\mathbb{Z}^{d}}\beta_{j,k}^{\ell}\psi_{j,k}^{\ell}$$
holds in the \(L^{2}(\mathbb{R}^{d})\) sense, where \(\alpha_{j_{0},k}=\langle f,\varphi_{j_{0},k}\rangle\), \(\beta_{j,k}^{\ell}=\langle f,\psi_{j,k}^{\ell}\rangle\) and
$$\varphi_{j_{0},k}(y)=2^{\frac{j_{0}d}{2}}\varphi\bigl(2^{j_{0}}y-k\bigr),\qquad\psi_{j,k}^{\ell}(y)=2^{\frac{jd}{2}}\psi^{\ell}\bigl(2^{j}y-k\bigr).$$
Let \(P_{j}\) be the orthogonal projection operator from \(L^{2}(\mathbb {R}^{d})\) onto the space \(V_{j}\) with the orthonormal basis \(\{\varphi _{j,k}(\cdot)=2^{jd/2}\varphi(2^{j}\cdot-k),k\in\mathbb{Z}^{d}\}\). Then, for \(f\in L^{2}(\mathbb{R}^{d})\),
$$P_{j}f=\sum_{k\in\mathbb{Z}^{d}}\alpha_{j,k}\varphi_{j,k}\quad\text{with }\alpha_{j,k}=\langle f,\varphi_{j,k}\rangle.$$
A wavelet basis can be used to characterize Besov spaces. The next lemma provides equivalent definitions for those spaces, for which we need one more notion: a scaling function φ is called m-regular if \(\varphi\in C^{m}(\mathbb{R}^{d})\) and \(|D^{\alpha}\varphi(y)|\leq c_{\ell}(1+|y|^{2})^{-\ell}\) for each \(\ell\in\mathbb{N}\) (with a constant \(c_{\ell}>0\)) and each multi-index \(\alpha\in\mathbb{N}^{d}\) with \(|\alpha|\le m\).
Lemma 1.1
(Meyer [11])
Let φ be m-regular, \(\psi^{\ell} \) (\(\ell=1, 2, \ldots, M\), \(M=2^{d}-1 \)) be the corresponding wavelets and \(f\in L^{p}(\mathbb{R}^{d})\). If \(\alpha_{j,k}=\langle f,\varphi_{j,k} \rangle\), \(\beta_{j,k}^{\ell}=\langle f,\psi_{j,k}^{\ell } \rangle\), \(p,q\in[1,\infty]\), and \(0< s< m\), then the following assertions are equivalent:
- (1) \(f\in B^{s}_{p,q}(\mathbb{R}^{d})\);
- (2) \(\{2^{js}\|P_{j+1}f-P_{j}f\|_{p}\}\in l_{q}\);
- (3) \(\{2^{j(s-\frac{d}{p}+\frac{d}{2})}\|\beta_{j}\|_{p}\}\in l_{q}\).
The Besov norm of f can be defined by
$$\Vert f\Vert _{B^{s}_{p,q}}:=\bigl\Vert (\alpha_{j_{0},k})_{k}\bigr\Vert _{p}+\bigl\Vert \bigl(2^{j(s-\frac{d}{p}+\frac{d}{2})}\Vert \beta_{j}\Vert _{p}\bigr)_{j\geq j_{0}}\bigr\Vert _{l_{q}},$$
where \(\Vert \beta_{j}\Vert _{p}^{p}=\sum_{\ell=1}^{M}\sum_{k\in\mathbb{Z}^{d}}\vert \beta_{j,k}^{\ell}\vert ^{p}\).
1.2 Estimators and result
In this paper, we require \(\operatorname{supp} Y_{i} \subseteq[0,1]^{d}\) in model (1), similarly to Chesneau [1], Chesneau et al. [2], and Liu and Xu [9]. Now we give the definition of strong mixing.
Definition 1.1
(Rosenblatt [15])
A strictly stationary sequence of random vectors \(\{Y_{i}\}_{i\in\mathbb{Z}}\) is said to be strong mixing if
$$\alpha(k):=\sup_{A\in\digamma^{0}_{-\infty},\,B\in\digamma^{\infty}_{k}}\bigl\vert \mathbb{P}(A\cap B)-\mathbb{P}(A)\mathbb{P}(B)\bigr\vert \longrightarrow0\quad(k\rightarrow\infty),$$
where \(\digamma^{0}_{-\infty}\) denotes the σ-field generated by \(\{Y_{i}\}_{i \leq0}\) and \(\digamma^{\infty}_{k}\) denotes the σ-field generated by \(\{Y_{i}\}_{i \geq k}\).
Obviously, independent and identically distributed (i.i.d.) data are strong mixing since \(\mathbb{P} (A\cap B)=\mathbb{P}(A) \mathbb{P} (B)\) and \(\alpha(k)\equiv0\) in that case. Now, we provide two examples of strong mixing data.
Example 1
Let \(X_{t}=\sum_{j\in\mathbb{Z}}a_{j}\varepsilon_{t-j}\) be a linear process whose coefficients \(a_{j}\) and i.i.d. innovations \(\{\varepsilon_{t}\}\) satisfy the assumptions of Theorem 2 and Corollary 1 of Doukhan [5], p. 58. Then \(\{X_{t}, t\in\mathbb{Z}\}\) is a strong mixing sequence.
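As a quick numerical illustration (an assumption-laden sketch, not from the paper): the AR(1) recursion below is a linear process with \(a_{j}=\rho^{j}\) (\(j\geq0\)); for \(|\rho|<1\) and Gaussian innovations it is strong mixing with geometrically decaying \(\alpha(k)\), in line with condition A2 below.

```python
import numpy as np

rng = np.random.default_rng(1)

# AR(1): X_t = rho * X_{t-1} + eps_t, a linear process with a_j = rho**j.
# For |rho| < 1 with Gaussian innovations it is strong mixing, and alpha(k)
# decays geometrically -- the situation described by condition A2.
rho, n = 0.6, 200_000
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = rho * x[t - 1] + eps[t]

# The lag-k autocorrelation rho**k mirrors the geometric mixing rate.
for k in (1, 5, 10):
    c = np.corrcoef(x[:-k], x[k:])[0, 1]
    print(f"lag {k:2d}: empirical {c:+.3f}   rho**k {rho**k:+.3f}")
```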
Example 2
Let \(\{\varepsilon(t),t\in\mathbb{Z}\}\overset {\mathrm{i.i.d.}}{\sim} N_{r}(\vec{0},\Sigma)\) (r-dimensional normal distribution) and let \(\{Y(t), t\in\mathbb{Z}\}\) satisfy the auto-regression moving average equation
$$\sum_{i=0}^{p}B(i)Y(t-i)=\sum_{k=0}^{q}A(k)\varepsilon(t-k)$$
with \(l\times r\) and \(l\times l\) matrices \(A(k)\), \(B(i)\) respectively, and \(B(0)\) the identity matrix. If the absolute values of the zeros of the determinant \(\operatorname{det} P(z):=\operatorname{det}\sum_{i=0}^{p}B(i)z^{i}\) (\(z\in\mathbb{C}\)) are strictly greater than 1, then \(\{Y(t), t\in\mathbb{Z}\}\) is strong mixing (Mokkadem [12]).
It is well known that a Lebesgue measurable function maps i.i.d. data to i.i.d. data. When dealing with strong mixing data, it seems necessary to require the function ω in (1) to be Borel measurable; a Borel measurable function f on \(\mathbb{R}^{d}\) is one for which \(\{y\in\mathbb{R}^{d}: f(y)>c\}\) is a Borel set for each \(c\in\mathbb{R}\). In that case, one easily checks that \(\{f(Y_{i})\}\) remains strong mixing with \(\alpha_{f(Y)}(k)\leq\alpha_{Y}(k)\) (\(k=1, 2, \ldots\)) whenever \(\{Y_{i}\}\) has the same property; see Guo [6]. This observation is important for the proofs of the lemmas in the next section.
Before introducing our estimators, we formulate the following assumptions:
- A1. The weight function ω has both positive upper and lower bounds, i.e., for \(y\in[0,1]^{d}\),
$$0< c_{1}\leq\omega(y)\leq c_{2}< +\infty. $$
- A2. The strong mixing coefficient of \(\{Y_{i}, i=1, 2, \ldots, n\}\) satisfies \(\alpha(k)=O(\gamma e^{-c_{3}k})\) with \(\gamma>0\), \(c_{3}>0\).
- A3. The density \(f_{(Y_{1}, Y_{k+1})}\) of \((Y_{1}, Y_{k+1})\) (\(k\geq1\)) and the density \(f_{Y_{1}}\) of \(Y_{1}\) satisfy
$$\sup_{k\geq1}\sup_{(y,y^{*})\in[0,1]^{d}\times [0,1]^{d}} \bigl\vert h_{k}\bigl(y,y^{*}\bigr) \bigr\vert \leq c_{4}, $$
where \(h_{k}(y, y^{*})=f_{(Y_{1}, Y_{k+1})}(y, y^{*})-f_{Y_{1}}(y)f_{Y_{k+1}}(y^{*})\) and \(c_{4}>0\).
Assumption A1 is standard for the nonparametric density model with size-biased data; see Ramírez and Vidakovic [13], Chesneau [1], Liu and Xu [9]. Condition A3 can be viewed as a ‘Castellana–Leadbetter’ type condition, as in Masry [10].
We choose a d-dimensional scaling function
$$\varphi(y)=\prod_{i=1}^{d}D_{2N}(y_{i})$$
with \(D_{2N}(\cdot)\) being the one-dimensional Daubechies scaling function. Then, for any fixed \(m>0\), φ is m-regular when N is large enough. Note that \(D_{2N}\) has compact support \([0,2N-1]\) and the corresponding wavelet has compact support \([-N+1,N]\). Then, for \(f\in L^{2}(\mathbb{R}^{d})\) with \(\operatorname{supp} f\subseteq[0,1]^{d}\) and \(M=2^{d}-1\),
$$f=\sum_{k\in\Lambda_{j_{0}}}\alpha_{j_{0},k}\varphi_{j_{0},k}+\sum_{j\geq j_{0}}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}\beta_{j,k}^{\ell}\psi_{j,k}^{\ell},$$
where \(\Lambda_{j_{0}}=\{1-2N, 2-2N, \ldots, 2^{j_{0}}\}^{d}\), \(\Lambda _{j}=\{-N, -N+1, \ldots, 2^{j}+N-1\}^{d}\) and
$$\alpha_{j_{0},k}=\int_{[0,1]^{d}}f(y)\varphi_{j_{0},k}(y)\,dy,\qquad\beta_{j,k}^{\ell}=\int_{[0,1]^{d}}f(y)\psi_{j,k}^{\ell}(y)\,dy.$$
We introduce
$$\widehat{\alpha}_{j_{0},k}:=\frac{\widehat{\mu}_{n}}{n}\sum_{i=1}^{n}\frac{\varphi_{j_{0},k}(Y_{i})}{\omega(Y_{i})}\quad\text{with }\widehat{\mu}_{n}:= \Biggl[\frac{1}{n}\sum_{i=1}^{n}\frac{1}{\omega(Y_{i})} \Biggr]^{-1} \qquad(4)$$
and
$$\widehat{\beta}_{j,k}^{\ell}:=\frac{\widehat{\mu}_{n}}{n}\sum_{i=1}^{n}\frac{\psi_{j,k}^{\ell}(Y_{i})}{\omega(Y_{i})}. \qquad(5)$$
Now, we define our linear wavelet estimator
$$\widehat{f}^{\mathrm{lin}}_{n}(y):=\sum_{k\in\Lambda_{j_{0}}}\widehat{\alpha}_{j_{0},k}\varphi_{j_{0},k}(y) \qquad(6)$$
and the nonlinear wavelet estimator
$$\widehat{f}^{\mathrm{non}}_{n}(y):=\widehat{f}^{\mathrm{lin}}_{n}(y)+\sum_{j=j_{0}}^{j_{1}}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}\widehat{\beta}_{j,k}^{\ell}I_{\{|\widehat{\beta}_{j,k}^{\ell}|\geq\kappa t_{n}\}}\psi_{j,k}^{\ell}(y) \qquad(7)$$
with \(t_{n}:=\sqrt{\frac{\ln n}{n}}\). The positive integers \(j_{0}\) and \(j_{1}\) are specified in the theorem, while the constant κ will be chosen in the proof of the theorem.
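For concreteness, the following Python sketch implements (4)–(7) in the simplest possible setting: \(d=1\) and the Haar basis in place of the Daubechies tensor-product construction. The resolution levels, the constant \(\kappa\), and all names are illustrative assumptions, not the paper's prescriptions.

```python
import numpy as np

def haar_phi(x):
    """Haar scaling function: indicator of [0, 1)."""
    x = np.asarray(x, dtype=float)
    return ((0.0 <= x) & (x < 1.0)).astype(float)

def haar_psi(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    return haar_phi(2.0 * x) - haar_phi(2.0 * x - 1.0)

def wavelet_density(Y, w, j0=3, j1=7, kappa=2.0):
    """Hedged sketch of (4)-(7): d = 1, Haar basis, supp f in [0, 1].
    j0, j1, kappa are fixed by hand here; the theorem ties them to n, s, d."""
    Y = np.asarray(Y, dtype=float)
    n = Y.size
    t_n = np.sqrt(np.log(n) / n)
    mu_hat = 1.0 / np.mean(1.0 / w(Y))      # (1/n sum 1/w(Y_i))^{-1}
    common = mu_hat / (n * w(Y))            # shared factor in (4) and (5)

    def alpha_hat(j, k):                    # empirical scaling coefficient (4)
        return np.sum(common * 2.0**(j / 2) * haar_phi(2.0**j * Y - k))

    def beta_hat(j, k):                     # empirical wavelet coefficient (5)
        return np.sum(common * 2.0**(j / 2) * haar_psi(2.0**j * Y - k))

    alphas = {k: alpha_hat(j0, k) for k in range(2**j0)}
    # Hard thresholding: keep only coefficients with |beta_hat| >= kappa*t_n.
    kept = {(j, k): beta_hat(j, k)
            for j in range(j0, j1 + 1) for k in range(2**j)}
    kept = {jk: b for jk, b in kept.items() if abs(b) >= kappa * t_n}

    def f_non(y):                           # pointwise evaluation of (7)
        est = sum(a * 2.0**(j0 / 2) * haar_phi(2.0**j0 * y - k)
                  for k, a in alphas.items())        # linear part (6)
        for (j, k), b in kept.items():
            est += b * 2.0**(j / 2) * haar_psi(2.0**j * y - k)
        return float(est)

    return f_non
```

Usage example: with `Y` and `w` from the earlier snippet, `f_hat = wavelet_density(Y, w)` and then `[f_hat(y) for y in np.linspace(0, 1, 200)]` traces the estimated density.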
The following notation is needed to state our theorem: for \(H>0\),
$$B^{s}_{p,q}(H):=\bigl\{f\in B^{s}_{p,q}\bigl(\mathbb{R}^{d}\bigr): \Vert f\Vert _{B^{s}_{p,q}}\leq H\bigr\}$$
and \(x_{+}:=\max\{x,0\}\). In addition, \(A\lesssim B\) denotes \(A\leq cB\) for some constant \(c>0\); \(A\gtrsim B\) means \(B\lesssim A\); \(A\sim B\) stands for both \(A\lesssim B\) and \(B\lesssim A\).
Main theorem
Consider the problem defined by (1) under assumptions A1–A3. Let \(f\in B^{s}_{p,q}(H)\) (\(p,q\in[1,\infty)\), \(s>\frac {d}{p}\)) and \(\operatorname{supp} f\subseteq[0,1]^{d}\). Then the linear wavelet estimator \(\widehat{f}^{\mathrm{lin}}_{n}\) defined in (6) with \(2^{j_{0}}\sim n^{\frac{1}{2s'+d}}\) and \(s'=s-d(\frac{1}{p}-\frac {1}{2})_{+}\) satisfies
$$E\bigl\Vert \widehat{f}^{\mathrm{lin}}_{n}-f\bigr\Vert _{2}^{2}\lesssim n^{-\frac{2s'}{2s'+d}}; \qquad(8a)$$
the nonlinear estimator in (7) with \(2^{j_{0}}\sim n^{\frac{1}{2m+d}}\) (\(m>s\)) and \(2^{j_{1}}\sim(\frac{n}{(\ln n)^{3}})^{\frac{1}{d}}\) satisfies
$$E\bigl\Vert \widehat{f}^{\mathrm{non}}_{n}-f\bigr\Vert _{2}^{2}\lesssim(\ln n)^{3}n^{-\frac{2s}{2s+d}}. \qquad(8b)$$
Remark 1
When \(d=1\), \({n}^{-\frac{2s}{2s+1}}\) is the optimal convergence rate in the minimax sense for the standard nonparametric density model, see Donoho et al. [4].
Remark 2
When the strong mixing data \(Y_{1}, Y_{2}, \ldots, Y_{n}\) reduce to independent and identically distributed (i.i.d.) data, the convergence rate of our linear estimator is the same as that of Theorem 3.1 in Shirazi and Doosti [16].
Remark 3
Compared with the linear wavelet estimator \(\widehat {f}^{\mathrm{lin}}_{n}\), the nonlinear estimator \(\widehat{f}^{\mathrm{non}}_{n}\) is adaptive, which means that neither \(j_{0}\) nor \(j_{1}\) depends on s, p, or q. On the other hand, the convergence rate of the nonlinear estimator remains the same as that of the linear one up to the factor \((\ln n)^{3}\) when \(p\geq2\), and it gets better for \(1\leq p<2\) (as the worked comparison below shows).
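To make the comparison in Remark 3 concrete, here is a worked check (our illustration, with freely chosen parameters): take \(d=1\), \(p=1\), \(s=2\), so that \(s>\frac{d}{p}\). Then
$$s'=s-d\biggl(\frac{1}{p}-\frac{1}{2}\biggr)_{+}=\frac{3}{2},\qquad n^{-\frac{2s'}{2s'+d}}=n^{-\frac{3}{4}},\qquad (\ln n)^{3}n^{-\frac{2s}{2s+d}}=(\ln n)^{3}n^{-\frac{4}{5}},$$
so thresholding improves the polynomial rate from \(n^{-3/4}\) to \(n^{-4/5}\) at the price of a logarithmic factor.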
2 Some lemmas
In this section, we provide some lemmas for the proof of the theorem. We begin with the following simple (but important) one.
Lemma 2.1
For the model defined in (1),
$$E \Biggl[\frac{1}{n}\sum_{i=1}^{n}\frac{1}{\omega(Y_{i})} \Biggr]=\frac{1}{\mu}, \qquad(9a)$$
$$E \Biggl[\frac{\mu}{n}\sum_{i=1}^{n}\frac{\varphi_{j_{0},k}(Y_{i})}{\omega(Y_{i})} \Biggr]=\alpha_{j_{0},k}, \qquad(9b)$$
$$E \Biggl[\frac{\mu}{n}\sum_{i=1}^{n}\frac{\psi_{j,k}^{\ell}(Y_{i})}{\omega(Y_{i})} \Biggr]=\beta_{j,k}^{\ell}, \qquad(9c)$$
where \(\alpha_{j_{0},k}=\int_{[0, 1]^{d}}f(y)\varphi_{j_{0},k}(y)\,dy\) and \(\beta_{j,k}^{\ell}=\int_{[0, 1]^{d}}f(y)\psi_{j,k}^{\ell}(y)\,dy\) (\(\ell=1,2,\ldots, M\)).
Proof
We include a simple proof for completeness. By the strict stationarity of \(\{Y_{i}\}\) and (1),
$$E \Biggl[\frac{1}{n}\sum_{i=1}^{n}\frac{1}{\omega(Y_{i})} \Biggr]=E \biggl[\frac{1}{\omega(Y_{1})} \biggr]=\int_{[0,1]^{d}}\frac{1}{\omega(y)}\cdot\frac{\omega(y)f(y)}{\mu}\,dy=\frac{1}{\mu}\int_{[0,1]^{d}}f(y)\,dy=\frac{1}{\mu},$$
which concludes (9a). Using (1) again, one knows that
$$E \Biggl[\frac{\mu}{n}\sum_{i=1}^{n}\frac{\varphi_{j_{0},k}(Y_{i})}{\omega(Y_{i})} \Biggr]=\mu\int_{[0,1]^{d}}\frac{\varphi_{j_{0},k}(y)}{\omega(y)}\cdot\frac{\omega(y)f(y)}{\mu}\,dy=\int_{[0,1]^{d}}f(y)\varphi_{j_{0},k}(y)\,dy=\alpha_{j_{0},k}.$$
This completes the proof of (9b). Similar arguments show (9c). □
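A Monte Carlo sanity check of (9a)–(9c) under the toy setup used earlier (the Beta(2,4) density and the weight \(\omega(y)=y+0.5\) are illustrative assumptions; SciPy is used only for the reference value):

```python
import numpy as np
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(2)

# Illustrative setup (assumptions, not from the paper): f = Beta(2, 4)
# on [0, 1], w(y) = y + 0.5, and Y ~ w f / mu generated by rejection.
def w(y):
    return y + 0.5

x = rng.beta(2.0, 4.0, size=1_000_000)
Y = x[rng.uniform(0.0, 1.5, size=x.size) < w(x)]
mu = 1.0 / 3.0 + 0.5                       # E[w(X)] for this f

# (9a): E[(1/n) sum 1/w(Y_i)] = 1/mu.
print(np.mean(1.0 / w(Y)), 1.0 / mu)

# (9b)/(9c) pattern: E[(mu/n) sum h(Y_i)/w(Y_i)] = integral of h * f,
# illustrated with h = indicator of [0, 1/2).
h = lambda y: (y < 0.5).astype(float)
print(mu * np.mean(h(Y) / w(Y)), beta_dist(2, 4).cdf(0.5))
```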
To estimate \(E |\widehat{\alpha}_{j_{0},k}-\alpha_{j_{0},k} |^{2}\) and \(E |\widehat{\beta}^{\ell}_{j,k}-\beta^{\ell}_{j,k} |^{2}\), we introduce an important inequality, which can be found in Davydov [3].
Davydov’s inequality
Let \(\{Y_{i}\}_{i\in\mathbb{Z}}\) be strong mixing with mixing coefficient \(\alpha(k)\), and let f and g be two measurable functions. If \(E|f(Y_{1})|^{p}\) and \(E|g(Y_{1})|^{q}\) exist for \(p, q>0\) and \(\frac{1}{p}+\frac{1}{q}<1\), then there exists a constant \(c>0\) such that
$$\bigl\vert \operatorname{cov}\bigl(f(Y_{1}),g(Y_{k+1})\bigr)\bigr\vert \leq c\bigl[\alpha(k)\bigr]^{1-\frac{1}{p}-\frac{1}{q}}\bigl(E\bigl\vert f(Y_{1})\bigr\vert ^{p}\bigr)^{\frac{1}{p}}\bigl(E\bigl\vert g(Y_{1})\bigr\vert ^{q}\bigr)^{\frac{1}{q}}.$$
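As a worked instance (our illustration of how the inequality is applied below), take \(p=q=4\), so that \(1-\frac{1}{p}-\frac{1}{q}=\frac{1}{2}\). Since A1 and \(\|f\|_{\infty}\lesssim1\) make the density of \(Y_{1}\) bounded,
$$E\biggl\vert \frac{\psi^{\ell}_{j,k}(Y_{1})}{\omega(Y_{1})}\biggr\vert ^{4}\lesssim\int_{[0,1]^{d}}\bigl\vert \psi^{\ell}_{j,k}(y)\bigr\vert ^{4}\,dy=2^{jd}\bigl\Vert \psi^{\ell}\bigr\Vert _{4}^{4},\quad\text{whence}\quad \biggl\vert \operatorname{cov} \biggl(\frac{\psi^{\ell}_{j,k}(Y_{1})}{\omega(Y_{1})},\frac{\psi^{\ell}_{j,k}(Y_{m+1})}{\omega(Y_{m+1})} \biggr) \biggr\vert \lesssim\sqrt{\alpha(m)}\,2^{\frac{jd}{2}},$$
which is exactly the covariance bound used in the proof of Lemma 2.2 below.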
Lemma 2.2
Let \(f\in B^{s}_{p,q}(H)\) (\(p,q\in[1,\infty)\), \(s>\frac{d}{p}\)) and \(\widehat{\alpha}_{j_{0},k}\), \(\widehat{\beta}^{\ell }_{j,k}\) be defined by (4) and (5). If A1–A3 hold, then for \(2^{j_{0}d}\leq n\) and \(2^{jd}\leq n\),
$$E \vert \widehat{\alpha}_{j_{0},k}-\alpha_{j_{0},k} \vert ^{2}\lesssim\frac{1}{n}\quad\text{and}\quad E \bigl\vert \widehat{\beta}^{\ell}_{j,k}-\beta^{\ell}_{j,k} \bigr\vert ^{2}\lesssim\frac{1}{n}.$$
Proof
We prove the second inequality only; the first one is similar. By the definition of \(\widehat{\beta}^{\ell}_{j,k}\),
$$\widehat{\beta}^{\ell}_{j,k}-\beta^{\ell}_{j,k}=\frac{\widehat{\mu}_{n}}{\mu} \Biggl[\frac{\mu}{n}\sum_{i=1}^{n}\frac{\psi^{\ell}_{j,k}(Y_{i})}{\omega(Y_{i})}-\beta^{\ell}_{j,k} \Biggr]+\beta^{\ell}_{j,k}\widehat{\mu}_{n} \biggl(\frac{1}{\mu}-\frac{1}{\widehat{\mu}_{n}} \biggr)$$
and \(E \vert \widehat{\beta}^{\ell}_{j,k}-\beta^{\ell}_{j,k} \vert ^{2}\lesssim E \vert \frac{\widehat{\mu}_{n}}{\mu} [\frac{\mu}{n}\sum_{i=1}^{n}\frac{\psi^{\ell}_{j,k}(Y_{i})}{\omega(Y_{i})}-\beta^{\ell }_{j,k} ] \vert ^{2} +E \vert \beta^{\ell}_{j,k}\widehat{\mu}_{n} (\frac{1}{\mu}-\frac {1}{\widehat{\mu}_{n}} ) \vert ^{2}\). Note that \(B_{p,q}^{s}(\mathbb{R}^{d})\subseteq B_{\infty,\infty }^{s-\frac{d}{p}}(\mathbb{R}^{d})\) with \(s>\frac{d}{p}\). Then \(f\in B_{\infty,\infty}^{s-\frac{d}{p}}(\mathbb{R}^{d})\) and \(\|f\|_{\infty }\lesssim1\). Moreover, \(\vert \beta^{\ell}_{j,k} \vert := \vert \int _{[0,1]^{d}}f(y) \psi^{\ell}_{j,k}(y)\,dy \vert \lesssim1\) thanks to Hölder’s inequality and the orthonormality of \(\{\psi^{\ell}_{j,k}\}\). On the other hand, \(\vert \frac{\widehat{\mu}_{n}}{\mu} \vert \lesssim1\) and \(|\widehat{\mu}_{n}|\lesssim1\) because of A1. Hence, noting that \(\frac{1}{\mu}-\frac{1}{\widehat{\mu}_{n}}=-\frac{1}{n}\sum_{i=1}^{n} [\frac{1}{\omega(Y_{i})}-\frac{1}{\mu} ]\),
$$E \bigl\vert \widehat{\beta}^{\ell}_{j,k}-\beta^{\ell}_{j,k} \bigr\vert ^{2}\lesssim E \Biggl\vert \frac{\mu}{n}\sum_{i=1}^{n}\frac{\psi^{\ell}_{j,k}(Y_{i})}{\omega(Y_{i})}-\beta^{\ell}_{j,k} \Biggr\vert ^{2}+E \Biggl\vert \frac{1}{n}\sum_{i=1}^{n} \biggl[\frac{1}{\omega(Y_{i})}-\frac{1}{\mu} \biggr] \Biggr\vert ^{2}. \qquad(10)$$
It follows from Lemma 2.1 and the definition of variance that
$$E \Biggl\vert \frac{1}{n}\sum_{i=1}^{n} \biggl[\frac{1}{\omega(Y_{i})}-\frac{1}{\mu} \biggr] \Biggr\vert ^{2}=\operatorname{var} \Biggl(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{\omega(Y_{i})} \Biggr)\leq\frac{1}{n^{2}} \Biggl[\sum_{i=1}^{n}\operatorname{var} \biggl(\frac{1}{\omega(Y_{i})} \biggr)+\sum_{i\neq v} \biggl\vert \operatorname{cov} \biggl(\frac{1}{\omega(Y_{i})},\frac{1}{\omega(Y_{v})} \biggr) \biggr\vert \Biggr]. \qquad(11)$$
Note that Condition A1 implies \(\operatorname{var} (\frac{1}{\omega (Y_{i})} ) \leq E (\frac{1}{\omega(Y_{i})} )^{2}\lesssim 1\) and
$$\frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{var} \biggl(\frac{1}{\omega(Y_{i})} \biggr)\lesssim\frac{1}{n}.$$
Then it suffices to show
$$\frac{1}{n^{2}}\sum_{i\neq v} \biggl\vert \operatorname{cov} \biggl(\frac{1}{\omega(Y_{i})},\frac{1}{\omega(Y_{v})} \biggr) \biggr\vert \lesssim\frac{1}{n}. \qquad(12)$$
By the strict stationarity of \(\{Y_{i}\}\),
$$\sum_{i\neq v} \biggl\vert \operatorname{cov} \biggl(\frac{1}{\omega(Y_{i})},\frac{1}{\omega(Y_{v})} \biggr) \biggr\vert \leq2n\sum_{k=1}^{n} \biggl\vert \operatorname{cov} \biggl(\frac{1}{\omega(Y_{1})},\frac{1}{\omega(Y_{k+1})} \biggr) \biggr\vert .$$
On the other hand, Davydov’s inequality and A1 show that
$$\biggl\vert \operatorname{cov} \biggl(\frac{1}{\omega(Y_{1})},\frac{1}{\omega(Y_{k+1})} \biggr) \biggr\vert \lesssim\sqrt{\alpha(k)}.$$
These with A2 give the desired conclusion (12):
$$\frac{1}{n^{2}}\sum_{i\neq v} \biggl\vert \operatorname{cov} \biggl(\frac{1}{\omega(Y_{i})},\frac{1}{\omega(Y_{v})} \biggr) \biggr\vert \lesssim\frac{1}{n}\sum_{k=1}^{+\infty}\sqrt{\gamma}e^{-\frac{c_{3}k}{2}}\lesssim\frac{1}{n}.$$
Now, the main work is to show
$$E \Biggl\vert \frac{\mu}{n}\sum_{i=1}^{n}\frac{\psi^{\ell}_{j,k}(Y_{i})}{\omega(Y_{i})}-\beta^{\ell}_{j,k} \Biggr\vert ^{2}\lesssim\frac{1}{n}. \qquad(13)$$
Clearly,
$$E \Biggl\vert \frac{\mu}{n}\sum_{i=1}^{n}\frac{\psi^{\ell}_{j,k}(Y_{i})}{\omega(Y_{i})}-\beta^{\ell}_{j,k} \Biggr\vert ^{2}\leq\frac{\mu^{2}}{n^{2}} \Biggl[\sum_{i=1}^{n}\operatorname{var} \biggl(\frac{\psi^{\ell}_{j,k}(Y_{i})}{\omega(Y_{i})} \biggr)+\sum_{i\neq v} \biggl\vert \operatorname{cov} \biggl(\frac{\psi^{\ell}_{j,k}(Y_{i})}{\omega(Y_{i})},\frac{\psi^{\ell}_{j,k}(Y_{v})}{\omega(Y_{v})} \biggr) \biggr\vert \Biggr].$$
By A1–A3 and (1), the first term of the above inequality is bounded by
$$\frac{\mu^{2}}{n}E \biggl(\frac{\psi^{\ell}_{j,k}(Y_{1})}{\omega(Y_{1})} \biggr)^{2}=\frac{\mu}{n}\int_{[0,1]^{d}}\frac{[\psi^{\ell}_{j,k}(y)]^{2}}{\omega(y)}f(y)\,dy\lesssim\frac{1}{n}\bigl\Vert \psi^{\ell}_{j,k}\bigr\Vert _{2}^{2}=\frac{1}{n}.$$
It remains to show
$$\frac{1}{n^{2}}\sum_{i\neq v} \biggl\vert \operatorname{cov} \biggl(\frac{\psi^{\ell}_{j,k}(Y_{i})}{\omega(Y_{i})},\frac{\psi^{\ell}_{j,k}(Y_{v})}{\omega(Y_{v})} \biggr) \biggr\vert \lesssim\frac{1}{n}, \qquad(14)$$
where the assumption \(2^{jd}\leq n\) is needed.
According to A1 and A3,
$$\biggl\vert \operatorname{cov} \biggl(\frac{\psi^{\ell}_{j,k}(Y_{1})}{\omega(Y_{1})},\frac{\psi^{\ell}_{j,k}(Y_{m+1})}{\omega(Y_{m+1})} \biggr) \biggr\vert \leq\int_{[0,1]^{d}}\int_{[0,1]^{d}}\frac{\vert \psi^{\ell}_{j,k}(y)\psi^{\ell}_{j,k}(y^{*})\vert }{\omega(y)\omega(y^{*})}\bigl\vert h_{m}\bigl(y,y^{*}\bigr)\bigr\vert \,dy\,dy^{*}\lesssim\bigl\Vert \psi^{\ell}_{j,k}\bigr\Vert _{1}^{2}\lesssim2^{-jd}.$$
Hence,
$$\sum_{m=1}^{2^{jd}} \biggl\vert \operatorname{cov} \biggl(\frac{\psi^{\ell}_{j,k}(Y_{1})}{\omega(Y_{1})},\frac{\psi^{\ell}_{j,k}(Y_{m+1})}{\omega(Y_{m+1})} \biggr) \biggr\vert \lesssim2^{jd}\cdot2^{-jd}=1. \qquad(15)$$
On the other hand, Davydov’s inequality and A1–A3 tell that
$$\biggl\vert \operatorname{cov} \biggl(\frac{\psi^{\ell}_{j,k}(Y_{1})}{\omega(Y_{1})},\frac{\psi^{\ell}_{j,k}(Y_{m+1})}{\omega(Y_{m+1})} \biggr) \biggr\vert \lesssim\sqrt{\alpha(m)}\,2^{\frac{jd}{2}}.$$
Moreover, since \(2^{\frac{jd}{2}}\leq\sqrt{m}\) for \(m\geq2^{jd}\),
$$\sum_{m=2^{jd}}^{n} \biggl\vert \operatorname{cov} \biggl(\frac{\psi^{\ell}_{j,k}(Y_{1})}{\omega(Y_{1})},\frac{\psi^{\ell}_{j,k}(Y_{m+1})}{\omega(Y_{m+1})} \biggr) \biggr\vert \lesssim\sum_{m=2^{jd}}^{n}\sqrt{\alpha(m)}\,2^{\frac{jd}{2}}\lesssim\sum_{m=1}^{n}\sqrt{m\alpha(m)}\leq\sum_{m=1}^{+\infty}m^{\frac{1}{2}}\gamma e^{-\frac{cm}{2}}< +\infty.$$
This with (15) shows (14). □
To prove the last lemma in this section, we need the following Bernstein-type inequality (Liebscher [7, 8], Rio [14]).
Bernstein-type inequality
Let \((Y_{i})_{i\in\mathbb{Z}}\) be a strong mixing process with mixing coefficient \(\alpha(k)\), \(EY_{i}=0\), \(|Y_{i}|\leq M<\infty\), and \(D_{m}=\max_{1\leq j\leq 2m}\operatorname{var} (\sum_{i=1}^{j}Y_{i} )\). Then, for \(\varepsilon >0\) and \(n,m\in\mathbb{N}\) with \(0< m\leq\frac{n}{2}\),
$$P \Biggl( \Biggl\vert \sum_{i=1}^{n}Y_{i} \Biggr\vert \geq\varepsilon \Biggr)\leq4\exp \biggl\{-\frac{\varepsilon^{2}}{16} \biggl(\frac{n}{m}D_{m}+\frac{1}{3}\varepsilon Mm \biggr)^{-1} \biggr\}+32\frac{M}{\varepsilon}n\alpha(m).$$
Lemma 2.3
Let \(f\in B^{s}_{p,q}(H)\) (\(p,q\in[1,\infty)\), \(s>\frac{d}{p}\)), \(\widehat{\beta}^{\ell}_{j,k}\) be defined in (5) and \(t_{n}=\sqrt{\frac{\ln n}{n}}\). If A1–A3 hold and \(2^{jd}\leq\frac {n}{(\ln n)^{3}}\), then there exists a constant \(\kappa>1\) such that
$$P\bigl(\bigl\vert \widehat{\beta}^{\ell}_{j,k}-\beta^{\ell}_{j,k}\bigr\vert \geq\kappa t_{n}\bigr)\lesssim n^{-4}. \qquad(16)$$
Proof
According to the arguments of (10), \(\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell} \vert \lesssim \frac{1}{n} \vert \sum_{i=1}^{n} [\frac{1}{\omega(Y_{i})}-\frac {1}{\mu} ] \vert + \vert \frac{1}{n} \sum_{i=1}^{n}\frac{\mu\psi_{j,k}^{\ell }(Y_{i})}{\omega(Y_{i})}-\beta_{j,k}^{\ell} \vert \). Hence, it suffices to prove
$$P \Biggl(\frac{1}{n} \Biggl\vert \sum_{i=1}^{n} \biggl[\frac{1}{\omega(Y_{i})}-\frac{1}{\mu} \biggr] \Biggr\vert \geq\frac{\kappa t_{n}}{2} \Biggr)\lesssim n^{-4}\quad\text{and}\quad P \Biggl( \Biggl\vert \frac{1}{n}\sum_{i=1}^{n}\frac{\mu\psi_{j,k}^{\ell}(Y_{i})}{\omega(Y_{i})}-\beta_{j,k}^{\ell} \Biggr\vert \geq\frac{\kappa t_{n}}{2} \Biggr)\lesssim n^{-4}.$$
One shows the second inequality only, because the first one is similar and even simpler.
Define \(\eta_{i}:=\frac{\mu\psi_{j,k}^{\ell}(Y_{i})}{\omega (Y_{i})}-\beta_{j,k}^{\ell}\). Then \(E(\eta_{i})=0\) thanks to (9c), and \(\eta_{1}, \ldots, \eta_{n}\) are strong mixing with the mixing coefficients \(\alpha(k)\leq\gamma e^{-ck}\) because of Condition A2. By A1–A3, \(\vert \frac{\mu\psi_{j,k}^{\ell}(Y_{i})}{\omega(Y_{i})} \vert \lesssim2^{\frac{jd}{2}}\) and
$$\vert \eta_{i} \vert \leq \biggl\vert \frac{\mu\psi_{j,k}^{\ell}(Y_{i})}{\omega(Y_{i})} \biggr\vert + \bigl\vert \beta_{j,k}^{\ell} \bigr\vert \lesssim2^{\frac{jd}{2}}.$$
According to the arguments of (13), \(D_{m}=\max_{1\leq j\leq2m}\operatorname{var} (\sum_{i=1}^{j}\eta_{i} ) \lesssim m\). Then it follows from the Bernstein-type inequality with \(m=u\ln n\) (the constant u will be chosen later on) and \(\varepsilon=\frac{\kappa nt_{n}}{2}\) that
$$P \Biggl( \Biggl\vert \frac{1}{n}\sum_{i=1}^{n}\eta_{i} \Biggr\vert \geq\frac{\kappa t_{n}}{2} \Biggr)\leq4\exp \biggl\{-\frac{(\kappa nt_{n})^{2}}{64} \biggl(\frac{n}{m}D_{m}+\frac{\kappa nt_{n}2^{\frac{jd}{2}}m}{6} \biggr)^{-1} \biggr\}+64\frac{2^{\frac{jd}{2}}}{\kappa nt_{n}}n\gamma e^{-cm}. \qquad(17)$$
Clearly, \(64 \frac{2^{\frac{jd}{2}}}{\kappa n t_{n}}n\gamma e^{-cm}\lesssim n e^{-cu\ln n}\) holds due to \(t_{n}=\sqrt{\frac{\ln n}{n}}\), \(2^{jd}\leq\frac{n}{(\ln n)^{3}}\) and \(m=u\ln n\). Choose u such that \(1-cu<-4\); then the second term of (17) is bounded by \(n^{-4}\). On the other hand, the first one of (17) has the following upper bound:
$$4\exp \biggl\{-\frac{\kappa^{2}\ln n}{64} \biggl(1+\frac{1}{6}\kappa u \biggr)^{-1} \biggr\}$$
thanks to \(D_{m}\lesssim m\), \(2^{jd}\leq\frac{n}{(\ln n)^{3}}\) and \(m=u\ln n\). Obviously, there exists sufficiently large \(\kappa>1\) such that \(\exp \{-\frac{\kappa^{2}\ln n}{64} (1+\frac{1}{6}\kappa u )^{-1} \}\lesssim n^{-4}\). Finally, the desired conclusion (16) follows. □
3 Proof of the theorem
This section proves the theorem. The main idea of the proof comes from Donoho et al. [4].
Proof of (8a)
Note that
$$E\bigl\Vert \widehat{f}^{\mathrm{lin}}_{n}-f\bigr\Vert _{2}^{2}\lesssim E\bigl\Vert \widehat{f}^{\mathrm{lin}}_{n}-P_{j_{0}}f\bigr\Vert _{2}^{2}+ \Vert P_{j_{0}}f-f \Vert _{2}^{2}.$$
It is easy to see that, by the orthonormality of \(\{\varphi_{j_{0},k}\}\),
$$E\bigl\Vert \widehat{f}^{\mathrm{lin}}_{n}-P_{j_{0}}f\bigr\Vert _{2}^{2}=\sum_{k\in\Lambda_{j_{0}}}E \vert \widehat{\alpha}_{j_{0},k}-\alpha_{j_{0},k} \vert ^{2}.$$
According to Lemma 2.2 and \(|\Lambda_{j_{0}}|\sim2^{j_{0}d}\),
$$E\bigl\Vert \widehat{f}^{\mathrm{lin}}_{n}-P_{j_{0}}f\bigr\Vert _{2}^{2}\lesssim\frac{2^{j_{0}d}}{n}. \qquad(19)$$
This with \(2^{j_{0}}\sim n^{\frac{1}{2s'+d}}\) gives \(E\|\widehat{f}^{\mathrm{lin}}_{n}-P_{j_{0}}f\|_{2}^{2}\lesssim n^{-\frac{2s'}{2s'+d}}\).
When \(p\geq2\), \(s'=s\). By Hölder’s inequality, \(f\in B_{p,q}^{s}(H)\), and Lemma 1.1,
$$\Vert P_{j_{0}}f-f \Vert _{2}^{2}\lesssim2^{-2j_{0}s}\lesssim n^{-\frac{2s}{2s+d}}. \qquad(20)$$
When \(1\leq p<2\) and \(s>\frac{d}{p}\), \(B_{p,q}^{s}(\mathbb {R}^{d})\subseteq B_{2,\infty}^{s'}(\mathbb{R}^{d})\). Then it follows from Lemma 1.1 and \(2^{j_{0}}\sim n^{\frac{1}{2s'+d}}\) that
$$\Vert P_{j_{0}}f-f \Vert _{2}^{2}\lesssim2^{-2j_{0}s'}\lesssim n^{-\frac{2s'}{2s'+d}}.$$
This with (20) shows in both cases
$$E\bigl\Vert \widehat{f}^{\mathrm{lin}}_{n}-f\bigr\Vert _{2}^{2}\lesssim n^{-\frac{2s'}{2s'+d}}.$$
□
Proof of (8b)
By the definitions of \(\widehat{f}^{\mathrm{lin}}_{n}\) and \(\widehat{f}^{\mathrm{non}}_{n}\), \(\widehat{f}^{\mathrm{non}}_{n}(y)-f(y)= [\widehat {f}^{\mathrm{lin}}_{n}(y)-P_{j_{0}}f(y) ]- [f(y)-P_{j_{1}+1}f(y) ] +\sum_{j=j_{0}}^{j_{1}} \sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}} [\widehat{\beta}_{j,k}^{\ell}I_{\{|\widehat{\beta}_{j,k}^{\ell}|\geq \kappa t_{n}\}}-\beta_{j,k}^{\ell} ]\psi_{j,k}^{\ell}(y)\). Hence,
$$E\bigl\Vert \widehat{f}^{\mathrm{non}}_{n}-f\bigr\Vert _{2}^{2}\lesssim T_{1}+T_{2}+Q, \qquad(26)$$
where \(T_{1}:=E \|\widehat{f}^{\mathrm{lin}}_{n}-P_{j_{0}}f \|^{2}_{2}\), \(T_{2}:= \|f-P_{j_{1}+1}f \|^{2}_{2}\) and
$$Q:=E\sum_{j=j_{0}}^{j_{1}}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}} \bigl\vert \widehat{\beta}_{j,k}^{\ell}I_{\{|\widehat{\beta}_{j,k}^{\ell}|\geq\kappa t_{n}\}}-\beta_{j,k}^{\ell} \bigr\vert ^{2}.$$
According to (19) and \(2^{j_{0}}\sim n^{\frac{1}{2m+d}}\) (\(m>s\)),
$$T_{1}\lesssim\frac{2^{j_{0}d}}{n}\sim n^{-\frac{2m}{2m+d}}\leq n^{-\frac{2s}{2s+d}}. \qquad(27)$$
When \(p\geq2\), the same arguments as (20) show \(T_{2}= \|f-P_{j_{1}+1}f \|^{2}_{2}\lesssim2^{-2j_{1}s}\). This with \(2^{j_{1}}\sim (\frac{n}{(\ln n)^{3}} )^{\frac{1}{d}}\) leads to
$$T_{2}\lesssim \biggl(\frac{(\ln n)^{3}}{n} \biggr)^{\frac{2s}{d}}\lesssim n^{-\frac{2s}{2s+d}}$$
because \(\frac{2s}{d}>\frac{2s}{2s+d}\).
On the other hand, \(B_{p,q}^{s}(\mathbb{R}^{d})\subseteq B_{2,\infty }^{s+d/2-d/p}(\mathbb{R}^{d})\) when \(1\leq p<2\) and \(s>\frac{d}{p}\). Then
$$T_{2}\lesssim2^{-2j_{1}(s-\frac{d}{p}+\frac{d}{2})}\lesssim \biggl(\frac{(\ln n)^{3}}{n} \biggr)^{\frac{2(s-\frac{d}{p}+\frac{d}{2})}{d}}.$$
Hence, noting that \(s-\frac{d}{p}+\frac{d}{2}>\frac{d}{2}\) implies \(\frac{2(s-\frac{d}{p}+\frac{d}{2})}{d}>1>\frac{2s}{2s+d}\),
$$T_{2}\lesssim n^{-\frac{2s}{2s+d}}$$
for each \(1\leq p<+\infty\).
The main work for the proof of (8b) is to show
$$Q\lesssim(\ln n)^{3}n^{-\frac{2s}{2s+d}}.$$
Note that
$$Q\lesssim Q_{1}+Q_{2}+Q_{3},$$
where
$$\begin{aligned}&Q_{1}:=\sum_{j=j_{0}}^{j_{1}}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}E \bigl[\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert ^{2} \bigl(I_{\{\vert \widehat{\beta}_{j,k}^{\ell}\vert \geq\kappa t_{n},\vert \beta_{j,k}^{\ell}\vert <\frac{\kappa t_{n}}{2}\}}+I_{\{\vert \widehat{\beta}_{j,k}^{\ell}\vert < \kappa t_{n},\vert \beta_{j,k}^{\ell}\vert >2\kappa t_{n}\}} \bigr) \bigr],\\ &Q_{2}:=\sum_{j=j_{0}}^{j_{1}}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}E \bigl[\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert ^{2}I_{\{\vert \widehat{\beta}_{j,k}^{\ell}\vert \geq\kappa t_{n},\vert \beta_{j,k}^{\ell}\vert \geq\frac{\kappa t_{n}}{2}\}} \bigr],\\ &Q_{3}:=\sum_{j=j_{0}}^{j_{1}}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}E \bigl[\bigl\vert \beta_{j,k}^{\ell}\bigr\vert ^{2}I_{\{\vert \widehat{\beta}_{j,k}^{\ell}\vert <\kappa t_{n},\vert \beta_{j,k}^{\ell}\vert \leq2\kappa t_{n}\}} \bigr].\end{aligned}$$
For \(Q_{1}\), one observes that on both events appearing in \(Q_{1}\) one has \(\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\vert >\frac{\kappa t_{n}}{2}\), and that
$$E \bigl[\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert ^{2}I_{\{\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\vert >\frac{\kappa t_{n}}{2}\}} \bigr]\leq \bigl(E\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert ^{4} \bigr)^{\frac{1}{2}} \biggl[P \biggl(\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert >\frac{\kappa t_{n}}{2} \biggr) \biggr]^{\frac{1}{2}}$$
thanks to Hölder’s inequality. By Lemmas 2.1–2.3 and \(2^{jd}\leq n\),
$$\bigl(E\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert ^{4} \bigr)^{\frac{1}{2}}\lesssim \bigl(2^{jd}\,E\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert ^{2} \bigr)^{\frac{1}{2}}\lesssim1\quad\text{and}\quad \biggl[P \biggl(\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert >\frac{\kappa t_{n}}{2} \biggr) \biggr]^{\frac{1}{2}}\lesssim n^{-2}.$$
Then \(Q_{1}\lesssim\sum_{j=j_{0}}^{j_{1}}\frac{2^{jd}}{n^{2}}\lesssim \frac{2^{j_{1}d}}{n^{2}}\lesssim\frac{1}{n}\leq n^{-\frac{2s}{2s+d}}\), where one uses the choice \(2^{j_{1}}\sim (\frac{n}{(\ln n)^{3}} )^{\frac{1}{d}}\). Hence,
$$Q_{1}\lesssim n^{-\frac{2s}{2s+d}}.$$
To estimate \(Q_{2}\), one defines \(j'\in\mathbb{N}\) by
$$2^{j'}\sim n^{\frac{1}{2s+d}}.$$
It is easy to see that \(2^{j_{0}}\sim n^{\frac{1}{2m+d}}\leq2^{j'}\sim n^{\frac{1}{2s+d}}\leq2^{j_{1}}\sim (\frac{n}{(\ln n)^{3}} )^{\frac{1}{d}}\). Furthermore, one rewrites
$$Q_{2}= \Biggl(\sum_{j=j_{0}}^{j'}+\sum_{j=j'+1}^{j_{1}} \Biggr)\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}E \bigl[\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert ^{2}I_{\{\vert \widehat{\beta}_{j,k}^{\ell}\vert \geq\kappa t_{n},\vert \beta_{j,k}^{\ell}\vert \geq\frac{\kappa t_{n}}{2}\}} \bigr]=:Q_{21}+Q_{22}.$$
By Lemma 2.2 and \(2^{j'}\sim n^{\frac{1}{2s+d}}\),
$$Q_{21}\leq\sum_{j=j_{0}}^{j'}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}E\bigl\vert \widehat{\beta}_{j,k}^{\ell}-\beta_{j,k}^{\ell}\bigr\vert ^{2}\lesssim\sum_{j=j_{0}}^{j'}\frac{2^{jd}}{n}\lesssim\frac{2^{j'd}}{n}\sim n^{-\frac{2s}{2s+d}}. \qquad(28)$$
On the other hand, it follows from Lemma 2.2 that
$$Q_{22}\lesssim\frac{1}{n}\sum_{j=j'+1}^{j_{1}}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}I_{\{\vert \beta_{j,k}^{\ell}\vert \geq\frac{\kappa t_{n}}{2}\}}.$$
When \(p\geq2\),
$$Q_{22}\lesssim\frac{1}{n}\sum_{j=j'+1}^{j_{1}} \biggl(\frac{2}{\kappa t_{n}} \biggr)^{p}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}\bigl\vert \beta_{j,k}^{\ell}\bigr\vert ^{p}\lesssim n^{-\frac{2s}{2s+d}} \qquad(29)$$
with \(f\in B_{p,q}^{s}(H)\), Lemma 1.1, Lemma 2.2, and \(t_{n}=\sqrt{\frac {\ln n}{n}}\). When \(1\leq p<2\) and \(s>\frac{d}{p}\), \(B_{p,q}^{s}(\mathbb {R}^{d})\subseteq B_{2,\infty}^{s+d/2-d/p}(\mathbb{R}^{d})\). Then
$$Q_{22}\lesssim(\ln n)\,n^{-\frac{2s}{2s+d}}. \qquad(30)$$
Hence, this with (28) and (29) shows
$$Q_{2}\lesssim(\ln n)\,n^{-\frac{2s}{2s+d}}. \qquad(31)$$
Finally, one estimates \(Q_{3}\). Clearly,
$$Q_{3}\leq \Biggl(\sum_{j=j_{0}}^{j'}+\sum_{j=j'+1}^{j_{1}} \Biggr)\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}\bigl\vert \beta_{j,k}^{\ell}\bigr\vert ^{2}I_{\{\vert \beta_{j,k}^{\ell}\vert \leq2\kappa t_{n}\}}=:Q_{31}+Q_{32}\quad\text{and}\quad Q_{31}\lesssim t_{n}^{2}\sum_{j=j_{0}}^{j'}2^{jd}\lesssim\frac{\ln n}{n}2^{j'd}.$$
This with the choice of \(2^{j'}\) shows
$$Q_{31}\lesssim(\ln n)\,n^{-\frac{2s}{2s+d}}. \qquad(32)$$
On the other hand, \(Q_{32}:=\sum_{j=j'+1}^{j_{1}}\sum_{\ell =1}^{M}\sum_{k\in\Lambda_{j}} \vert \beta_{j,k}^{\ell} \vert ^{2}I_{\{ |\beta_{j,k}^{\ell}|\leq2\kappa t_{n}\}}\). According to the arguments of (29),
$$Q_{32}\lesssim n^{-\frac{2s}{2s+d}} \qquad(33)$$
for \(p\geq2\). When \(1\leq p<2\), \(\vert \beta_{j,k}^{\ell} \vert ^{2}I_{\{|\beta_{j,k}^{\ell}|\leq2\kappa t_{n}\}}\leq \vert \beta _{j,k}^{\ell} \vert ^{p} \vert 2\kappa t_{n} \vert ^{2-p}\). Then, similar to the arguments of (30),
$$Q_{32}\lesssim t_{n}^{2-p}\sum_{j=j'+1}^{j_{1}}\sum_{\ell=1}^{M}\sum_{k\in\Lambda_{j}}\bigl\vert \beta_{j,k}^{\ell}\bigr\vert ^{p}\lesssim(\ln n)\,n^{-\frac{2s}{2s+d}}.$$
Combining this with (33) and (32), one knows \(Q_{3}\lesssim (\ln n )n^{-\frac{2s}{2s+d}}\) in both cases. This with (26), (27), and (31) shows
$$E\bigl\Vert \widehat{f}^{\mathrm{non}}_{n}-f\bigr\Vert _{2}^{2}\lesssim(\ln n)^{3}n^{-\frac{2s}{2s+d}},$$
which is the desired conclusion. □
References
Chesneau, C.: Wavelet block thresholding for density estimation in the presence of bias. J. Korean Stat. Soc. 39, 43–53 (2010)
Chesneau, C., Dewan, I., Doosti, H.: Wavelet linear density estimation for associated stratified size-biased sample. J. Nonparametr. Stat. 24, 429–445 (2012)
Davydov, Y.A.: The invariance principle for stationary processes. Theory Probab. Appl. 15, 487–498 (1970)
Donoho, D.L., Johnstone, I.M., Kerkyacharian, G., Picard, D.: Density estimation by wavelet thresholding. Ann. Stat. 24, 508–539 (1996)
Doukhan, P.: Mixing: Properties and Examples. Springer, New York (1994)
Guo, H.J.: Wavelet estimations for a class of regression functions with errors-in-variables. Dissertation, Beijing University of Technology (2016)
Liebscher, E.: Strong convergence of sums of α-mixing random variables with applications to density estimation. Stoch. Process. Appl. 65, 69–80 (1996)
Liebscher, E.: Estimation of the density and regression function under mixing conditions. Stat. Decis. 19, 9–26 (2001)
Liu, Y.M., Xu, J.L.: Wavelet density estimation for negatively associated stratified size-biased sample. J. Nonparametr. Stat. 26, 537–554 (2014)
Masry, E.: Wavelet-based estimation of multivariate regression function in Besov spaces. J. Nonparametr. Stat. 12, 283–308 (2000)
Meyer, Y.: Wavelets and Operators. Hermann, Paris (1990)
Mokkadem, A.: Mixing properties of ARMA processes. Stoch. Process. Appl. 29, 309–315 (1988)
Ramírez, P., Vidakovic, B.: Wavelet density estimation for stratified size-biased sample. J. Stat. Plan. Inference 140, 419–432 (2010)
Rio, E.: The functional law of the iterated logarithm for stationary strongly mixing sequences. Ann. Probab. 23, 1188–1203 (1995)
Rosenblatt, M.: A central limit theorem and a strong mixing condition. Proc. Natl. Acad. Sci. USA 42, 43–47 (1956)
Shirazi, E., Doosti, H.: Multivariate wavelet-based density estimation with size-biased data. Stat. Methodol. 27, 12–19 (2015)
Acknowledgements
The authors would like to thank the referees and editor for their important comments and suggestions.
Funding
This paper is supported by the National Natural Science Foundation of China (No. 11771030), Guangxi Natural Science Foundation (No. 2017GXNSFAA198194), and Guangxi Colleges and Universities Key Laboratory of Data Analysis and Computation.
Author information
Contributions
All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Kou, J., Guo, H. Wavelet density estimation for mixing and size-biased data. J Inequal Appl 2018, 189 (2018). https://doi.org/10.1186/s13660-018-1784-x
DOI: https://doi.org/10.1186/s13660-018-1784-x
Keywords
- Density estimation
- Strong mixing
- Size-biased
- Wavelets