Pointwise density estimation based on negatively associated data

Abstract

In this paper, we consider pointwise estimation under the \(l^{p}\) \((1\leq p<\infty )\) risk for a density function based on a negatively associated sample. We construct linear and nonlinear wavelet estimators and provide their convergence rates. Both estimators attain the same convergence rate up to a \(\ln n\) factor. Moreover, the nonlinear wavelet estimator is adaptive.

Introduction

In practical problems, because of noise, often only biased measurements of the quantity of interest are available. This paper considers the following density estimation model. Let \(Y_{1}, Y_{2}, \ldots , Y_{n}\) be identically distributed continuous random variables with the density function

$$\begin{aligned} g(y)=\frac{\omega (y)f(y)}{\mu },\quad y\in [0,1]. \end{aligned}$$
(1)

In this equation, ω is a known biasing function, f denotes the unknown density function of the unobserved random variable X, and \(\mu :=\mathbb{E}[\omega (X)]<\infty \). The aim is to estimate the unknown density f from the observed negatively associated data \(Y_{1}, Y_{2},\ldots , Y_{n}\).
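To make the model concrete, here is a small simulation sketch. It is an illustration only: the density \(f\) (Beta(2,2)), the biasing function \(\omega (y)=1/(1+y)\), and all numerical values are assumed for the example and are not taken from this paper. It draws a biased sample from g by rejection sampling and recovers μ from the observed data via the harmonic mean of \(1/\omega (Y_{i})\) (cf. the estimator \(\widehat{\mu }_{n}\) defined in (5) below).

```python
import random
import math

# Assumed example choices (not from the paper):
# true density f = Beta(2,2), biasing function w(y) = 1/(1+y).
def f(y): return 6.0 * y * (1.0 - y)
def w(y): return 1.0 / (1.0 + y)

def sample_biased(n, rng):
    """Draw Y_1..Y_n from g(y) = w(y) f(y) / mu by rejection sampling.

    The normalizing constant mu is not needed: we reject against an
    upper bound M for w(y)*f(y) on [0,1] (w <= 1 and f <= 1.5).
    """
    out = []
    M = 1.5
    while len(out) < n:
        y, u = rng.random(), rng.random()
        if u * M <= w(y) * f(y):
            out.append(y)
    return out

rng = random.Random(0)
ys = sample_biased(200_000, rng)

# Estimate mu = E[w(X)] from the biased sample by the harmonic mean
# of 1/w(Y_i), mirroring the estimator (5) below.
mu_hat = 1.0 / (sum(1.0 / w(y) for y in ys) / len(ys))

# True mu = int_0^1 w(y) f(y) dy = 6*(3/2 - 2*ln 2) for these choices.
mu_true = 6.0 * (1.5 - 2.0 * math.log(2.0))
print(mu_hat, mu_true)
```

With a sample of this size, the harmonic-mean estimate typically agrees with the true μ to two decimal places, which is what makes the plug-in estimators below workable.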

This model has many applications in industry [4] and economics [8]. Since wavelet bases are well localized in both the time and frequency domains, the wavelet method has been widely used for density estimation. When the observed data \(Y_{1}, Y_{2},\ldots , Y_{n}\) are independent, Ramírez and Vidakovic [13] constructed a linear wavelet estimator and studied its \(L^{2}\) consistency. Shirazi and Doosti [15] extended their work to the multivariate case. Because the definition of the linear wavelet estimator depends on the smoothness parameter of the density function f, the linear estimator is not adaptive. To overcome this drawback, Chesneau [3] proposed a nonlinear wavelet estimator by the hard thresholding method and established an optimal convergence rate over the \(L^{p}\) \((1\leq p<\infty )\) risk. When the independence of the data is relaxed to strong mixing, Kou and Guo [10] studied the \(L^{2}\) risk of linear and nonlinear wavelet estimators in Besov spaces. Note that all of those studies focus on the global error; there is a lack of theoretical results on pointwise wavelet estimation for the density estimation model (1).

In this paper, we establish wavelet estimates under the pointwise \(l^{p}\) \((1\leq p<\infty )\) risk for a density function based on a negatively associated sample. Upper bounds for linear and nonlinear wavelet estimators are obtained over the Besov space \(B_{r,q}^{s}( \mathbb{R})\). It turns out that the convergence rates of our estimators coincide with the optimal convergence rate for pointwise estimation [2]. Furthermore, our theorem reduces to the corresponding results of Rebelles [14] when \(\omega (y)\equiv 1\) and the sample is independent.

Negative association and wavelets

We first introduce the definition of negative association [1].

Definition 1.1

A sequence of random variables \(Y_{1}, Y _{2}, \ldots , Y_{n}\) is said to be negatively associated if for each pair of disjoint nonempty subsets A and B of \(\{i=1, 2, \ldots , n \}\),

$$ \operatorname{Cov} \bigl(f(Y_{i},i\in A), g(Y_{j},j\in B) \bigr)\leq 0, $$

where f and g are real-valued coordinatewise nondecreasing functions and the corresponding covariances exist.

It is well known that \(\operatorname{Cov} (Y_{i}, Y_{j} )\equiv 0\) when the random variables are independent. Hence independent and identically distributed data are negatively associated. Next, we give an important property of negative association, which will be needed in the later discussion.
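As a concrete sanity check of Definition 1.1 (an illustration only; the urn example below is not from this paper), draws without replacement from a finite urn form a standard example of negatively associated variables, and their pairwise covariances are negative. The following exact computation enumerates all orderings of a small urn:

```python
from itertools import permutations
from fractions import Fraction

# Urn with three ones and two zeros; all distinct orderings are equally
# likely, so enumerating them gives exact probabilities.
urn = [1, 1, 1, 0, 0]
orders = list(set(permutations(urn)))

def E(fn):
    """Exact expectation over the uniform distribution on orderings."""
    return Fraction(sum(fn(o) for o in orders), len(orders))

# Covariance of the indicators of the first two draws.
cov = E(lambda o: o[0] * o[1]) - E(lambda o: o[0]) * E(lambda o: o[1])
print(cov)  # -3/50, negative as Definition 1.1 requires
```

Here \(\operatorname{Cov}(X_{1}, X_{2}) = \frac{3}{10}-\frac{9}{25} = -\frac{3}{50}<0\), consistent with the negative association of sampling without replacement.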

Lemma 1.1

([9])

Let \(Y_{1}, Y_{2}, \ldots , Y _{n}\) be a sequence of negatively associated random variables, and let \(A_{1}, A_{2}, \ldots , A_{m}\) be pairwise disjoint nonempty subsets of \(\{i=1, 2, \ldots , n\}\). If \(f_{i}\ (i=1, 2, \ldots , m)\) are m coordinatewise nondecreasing (nonincreasing) functions, then \(f_{1} (Y_{i}, i\in A_{1} ), f_{2} (Y_{i}, i\in A_{2} ), \ldots , f_{m} (Y_{i}, i\in A_{m} )\) are also negatively associated.

To construct our wavelet estimators, we provide the basic theory of wavelets.

Throughout this paper, we work with the wavelet basis described as follows. Let \(\{V_{j}, j\in \mathbb{Z}\}\) be a classical orthonormal multiresolution analysis of \(L^{2}(\mathbb{R})\) with a scaling function φ. Then for each \(f\in L^{2}(\mathbb{R})\),

$$\begin{aligned} f=\sum_{k\in \mathbb{Z}}\alpha _{j_{0},k}\varphi _{j_{0},k}+ \sum_{j=j_{0}}^{\infty }\sum _{k\in \mathbb{Z}}\beta _{j,k} \psi _{j,k}, \end{aligned}$$

where \(\alpha _{j_{0},k}=\langle f, \varphi _{j_{0},k} \rangle \), \(\beta _{j,k}=\langle f, \psi _{j,k} \rangle \) and

$$ \varphi _{j_{0},k}(x)=2^{j_{0}/2}\varphi \bigl(2^{j_{0}}x-k \bigr),\qquad \psi _{j,k}(x)=2^{j/2} \psi \bigl(2^{j}x-k \bigr). $$
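These dilations and translations can be illustrated numerically. The sketch below uses the Haar basis for simplicity (an assumed choice; this paper works with the Daubechies basis \(D_{2N}\)) and checks the orthonormality of \(\varphi _{2,1}\) and \(\psi _{2,1}\) by a Riemann sum:

```python
# Haar scaling function and mother wavelet (illustrative stand-ins for
# the Daubechies functions used in the paper).
def phi(x):
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def psi(x):
    if 0.0 <= x < 0.5: return 1.0
    if 0.5 <= x < 1.0: return -1.0
    return 0.0

def phi_jk(j, k):
    """Dilated/translated scaling function phi_{j,k}(x) = 2^{j/2} phi(2^j x - k)."""
    return lambda x: 2.0 ** (j / 2.0) * phi(2.0 ** j * x - k)

def psi_jk(j, k):
    """Dilated/translated wavelet psi_{j,k}(x) = 2^{j/2} psi(2^j x - k)."""
    return lambda x: 2.0 ** (j / 2.0) * psi(2.0 ** j * x - k)

# Riemann-sum check on a fine grid: <phi_{2,1}, phi_{2,1}> ~ 1 and
# <phi_{2,1}, psi_{2,1}> ~ 0.
h = 1e-4
grid = [i * h for i in range(10_000)]
f1, g1 = phi_jk(2, 1), psi_jk(2, 1)
ip_ff = sum(f1(x) * f1(x) for x in grid) * h
ip_fg = sum(f1(x) * g1(x) for x in grid) * h
print(ip_ff, ip_fg)
```

The same dilation/translation scheme applies verbatim to any compactly supported scaling function and wavelet.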

Let \(P_{j}\) be the orthogonal projection operator from \(L^{2}( \mathbb{R})\) onto the space \(V_{j}\) with orthonormal basis \(\{ \varphi _{j,k}, k\in \mathbb{Z}\}\). If the scaling function φ satisfies Condition θ, that is,

$$ \sum_{k\in \mathbb{Z}} \bigl\vert \varphi (x-k) \bigr\vert \in L^{\infty }( \mathbb{R}), $$

then it can be shown that for each \(f\in L^{p}(\mathbb{R})\) \((1\leq p<\infty )\),

$$\begin{aligned} P_{j}f=\sum_{k\in \mathbb{Z}} \alpha _{j,k}\varphi _{j,k}. \end{aligned}$$
(2)

On the other hand, a scaling function φ is called m-regular if \(\varphi \in C^{m}(\mathbb{R})\) and \(|D^{\delta }\varphi (y)| \leq c(1+y^{2})^{-l}\) for each \(l\in \mathbb{Z}\) \((\delta =0, 1, 2, \ldots , m)\). In this paper, we choose the Daubechies scaling function \(D_{2N}\) [5], which is m-regular when N is large enough.

Note that a wavelet basis can characterize a Besov space. These spaces contain many well-known function spaces, such as the Hölder and \(L^{2}\) Sobolev spaces. The following lemma gives an equivalent definition of Besov spaces.

Lemma 1.2

([7])

Let \(f\in L^{r}(\mathbb{R})\) \((1\leq r\leq +\infty )\), let the scaling function φ be m-regular, and let \(0< s< m\). Then the following statements are equivalent:

(i) \(f\in B^{s}_{r,q}(\mathbb{R}), 1\leq q\leq +\infty \);

(ii) \(\{2^{js}\|P_{j}f-f\|_{r}\}\in l_{q}\);

(iii) \(\{2^{j(s-\frac{1}{r}+\frac{1}{2})}\|\beta _{j}\|_{r}\}\in l_{q}\).

The Besov norm of f can be defined as

$$ \Vert f \Vert _{B^{s}_{r,q}}:= \bigl\Vert (\alpha _{j_{0}}) \bigr\Vert _{r}+ \bigl\Vert \bigl(2^{j(s- \frac{1}{r}+\frac{1}{2})} \Vert \beta _{j} \Vert _{r}\bigr)_{j\geq j_{0}} \bigr\Vert _{q} $$

with \(\|(\alpha _{j_{0}})\|_{r}^{r}:=\sum_{k\in \mathbb{Z}}| \alpha _{j_{0},k}|^{r}\) and \(\|\beta _{j}\|_{r}^{r}:=\sum_{k\in \mathbb{Z}}|\beta _{j,k}|^{r}\).

In this paper, we assume that the density function f belongs to the Besov ball with radius \(H>0\), that is,

$$ f\in B^{s}_{r,q}(H):=\bigl\{ f\in B^{s}_{r,q}( \mathbb{R}), \Vert f \Vert _{B^{s}_{r,q}} \leq H\bigr\} . $$

Wavelet estimators and theorem

Define our linear wavelet estimator as follows:

$$\begin{aligned} \widehat{f}_{n}^{\mathrm{lin}}(y):=\sum _{k\in \varLambda } \widehat{\alpha }_{j_{0}, k}\varphi _{j_{0}, k}(y) \end{aligned}$$
(3)

with

$$\begin{aligned} \widehat{\alpha }_{j_{0},k}=\frac{\widehat{\mu }_{n}}{n}\sum _{i=1} ^{n}\frac{\varphi _{j_{0}, k}(Y_{i})}{\omega (Y_{i})} \end{aligned}$$
(4)

and

$$\begin{aligned} \widehat{\mu }_{n}= \Biggl[\frac{1}{n} \sum_{i=1}^{n}\frac{1}{ \omega (Y _{i})} \Biggr]^{-1}. \end{aligned}$$
(5)

Using the hard thresholding method, a nonlinear wavelet estimator is defined by

$$\begin{aligned} \widehat{f}_{n}^{\mathrm{non}}(y):=\sum _{k\in \varLambda } \widehat{\alpha }_{j_{0}, k}\varphi _{j_{0}, k}(y)+\sum_{j=j _{0}}^{j_{1}}\sum _{k\in \varLambda _{j}}\widehat{\beta }_{j, k}I _{ \{|\widehat{\beta }_{j, k}|\geq \kappa t_{n} \}}\psi _{j,k}(y), \end{aligned}$$
(6)

where \(t_{n}=\sqrt{\frac{\ln n}{n}}\) and

$$\begin{aligned} \widehat{\beta }_{j,k}=\frac{\widehat{\mu }_{n}}{n}\sum _{i=1}^{n}\frac{ \psi _{j,k}(Y_{i})}{\omega (Y_{i})}. \end{aligned}$$
(7)

In these definitions, \(\varLambda :=\{k\in \mathbb{Z},\operatorname{supp} f \cap \operatorname{supp} \varphi _{j_{0}, k}\neq \varnothing \}\) and \(\varLambda _{j}:=\{k\in \mathbb{Z},\operatorname{supp} f\cap \operatorname{supp} \psi _{j,k}\neq \varnothing \}\). Note that the cardinality of Λ \((\varLambda _{j})\) satisfies \(|\varLambda |\sim 2^{j_{0}}\) \((|\varLambda _{j}|\sim 2^{j} )\) because the functions f and \(\varphi _{j_{0}, k}\) \((\psi _{j,k})\) are compactly supported. Here and in what follows, \(A\sim B\) stands for both \(A\lesssim B\) and \(B\lesssim A\), where \(A\lesssim B\) denotes \(A\leq cB\) with a positive constant c that is independent of A and B. The constant κ will be specified later.
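The estimators (3)-(7) can be sketched in runnable code. The illustration below uses the Haar basis on [0,1] for simplicity (the paper uses Daubechies \(D_{2N}\)); the density f, the biasing function ω, the levels \(j_{0}\), \(j_{1}\), and the threshold constant \(\kappa =1\) are all assumed example values, not choices made by the paper.

```python
import math
import random

def haar_phi_jk(j, k, x):
    """Haar phi_{j,k}(x); stand-in for the paper's Daubechies basis."""
    t = 2.0 ** j * x - k
    return 2.0 ** (j / 2.0) if 0.0 <= t < 1.0 else 0.0

def haar_psi_jk(j, k, x):
    """Haar psi_{j,k}(x)."""
    t = 2.0 ** j * x - k
    if 0.0 <= t < 0.5: return 2.0 ** (j / 2.0)
    if 0.5 <= t < 1.0: return -(2.0 ** (j / 2.0))
    return 0.0

def w(y):  # known nonincreasing biasing function with w ~ 1 (assumed)
    return 1.0 / (1.0 + y)

def estimators(ys, j0, j1, kappa=1.0):
    n = len(ys)
    mu_hat = 1.0 / (sum(1.0 / w(y) for y in ys) / n)           # (5)
    t_n = math.sqrt(math.log(n) / n)

    def alpha_hat(k):                                           # (4)
        return mu_hat / n * sum(haar_phi_jk(j0, k, y) / w(y) for y in ys)

    def beta_hat(j, k):                                         # (7)
        return mu_hat / n * sum(haar_psi_jk(j, k, y) / w(y) for y in ys)

    # Support [0,1] means k ranges over 0..2^j - 1 for the Haar basis.
    a = {k: alpha_hat(k) for k in range(2 ** j0)}
    b = {(j, k): beta_hat(j, k)
         for j in range(j0, j1 + 1) for k in range(2 ** j)}

    def f_lin(y):                                               # (3)
        return sum(a[k] * haar_phi_jk(j0, k, y) for k in a)

    def f_non(y):                                               # (6): hard
        s = f_lin(y)                                            # thresholding
        for (j, k), bjk in b.items():
            if abs(bjk) >= kappa * t_n:
                s += bjk * haar_psi_jk(j, k, y)
        return s

    return f_lin, f_non

# Data from the biased model g = w f / mu with f = Beta(2,2) (assumed),
# drawn by rejection sampling.
rng = random.Random(1)
ys = []
while len(ys) < 5000:
    y, u = rng.random(), rng.random()
    if u * 1.5 <= w(y) * 6.0 * y * (1.0 - y):
        ys.append(y)

f_lin, f_non = estimators(ys, j0=3, j1=6)
print(f_lin(0.5), f_non(0.5))  # compare with f(0.5) = 1.5
```

Note how adaptivity shows up in the code: `f_non` needs only the data-driven threshold \(\kappa t_{n}\), whereas tuning `j0` optimally for `f_lin` would require knowing the smoothness s.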

We are now in a position to state our main theorem.

Theorem 1

Let \(f\in B^{s}_{r,q}(H)\ (r,q\in [1,\infty ), s> \frac{1}{r})\), and let \(\omega (y)\) be a nonincreasing function such that \(\omega (y)\sim 1\). Then for each \(1\leq p<\infty \), the linear wavelet estimator with \(2^{j_{0}}\sim n^{\frac{1}{2(s-1/r)+1}}\) satisfies

$$\begin{aligned} \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{lin}}(y)-f(y) \bigr\vert ^{p} \bigr]\lesssim n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}, \end{aligned}$$
(8)

and the nonlinear wavelet estimator with \(2^{j_{0}}\sim n^{ \frac{1}{2m+1}}(m>s)\) and \(2^{j_{1}}\sim \frac{n}{\ln n}\) satisfies

$$\begin{aligned} \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{non}}(y)-f(y) \bigr\vert ^{p} \bigr]\lesssim (\ln n )^{\frac{3p}{2}}n^{- \frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(9)

Remark 1

Note that \(n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}\) is the optimal convergence rate in the minimax sense for pointwise estimation in a Besov space [2]. Moreover, our theorem reduces to the results of Rebelles [14] when \(\omega (y)\equiv 1\) and the random sample is independent.

Remark 2

In contrast to the linear wavelet estimator, the nonlinear estimator attains the same convergence rate up to a \(\ln n\) factor. However, the nonlinear estimator is adaptive, which means that neither \(j_{0}\) nor \(j_{1}\) depends on s.

Auxiliary lemmas

In this section, we give some lemmas that will be used in the proof of Theorem 1.

Lemma 2.1

For the model defined by (1), we have

$$ \mathbb{E} \biggl[\frac{1}{\omega (Y_{i})} \biggr]=\frac{1}{\mu },\qquad \mathbb{E} \biggl[\frac{\mu \varphi _{j,k}(Y_{i})}{\omega (Y_{i})} \biggr]= \alpha _{j,k}\quad \textit{and}\quad \mathbb{E} \biggl[\frac{\mu \psi _{j,k}(Y_{i})}{\omega (Y_{i})} \biggr]= \beta _{j,k}. $$

Proof

This lemma can be proved by the same arguments as in Kou and Guo [10]. □

Lemma 2.2

Let \(f\in B_{r,q}^{s}\ (1\leq r,q<+\infty , s>1/r)\), and let \(\omega (y)\) be a nonincreasing function such that \(\omega (y) \sim 1\). If \(2^{j}\leq n\) and \(1\leq p<+\infty \), then

$$ \mathbb{E} \bigl[ \vert \widehat{\alpha }_{j_{0},k}-\alpha _{j_{0},k} \vert ^{p} \bigr]\lesssim n^{-\frac{p}{2}},\qquad \mathbb{E} \bigl[ \vert \widehat{\beta } _{j,k}-\beta _{j,k} \vert ^{p} \bigr]\lesssim n^{-\frac{p}{2}}. $$

Proof

Because the proofs of both inequalities are similar, we only prove the second one. By the definition of \(\widehat{\beta }_{j,k}\) we have

$$\begin{aligned} \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \leq \Biggl\vert \frac{ \widehat{\mu }_{n}}{\mu } \Biggl(\frac{\mu }{n}\sum _{i=1}^{n} \frac{ \psi _{j,k}(Y_{i})}{\omega (Y_{i})}-\beta _{j,k} \Biggr) \Biggr\vert + \biggl\vert \beta _{j,k} \widehat{\mu }_{n} \biggl(\frac{1}{\mu }-\frac{1}{ \widehat{\mu }_{n}} \biggr) \biggr\vert . \end{aligned}$$

Note that the definition of \(\widehat{\mu }_{n}\) and \(\omega (y) \sim 1\) imply \(|\widehat{\mu }_{n}|\lesssim 1\). Moreover, \(B_{r,q}^{s}( \mathbb{R})\subseteq B_{\infty , \infty }^{s-1/r}(\mathbb{R})\) in the case \(s>\frac{1}{r}\), so \(f\in B_{\infty , \infty }^{s-1/r}( \mathbb{R})\) and \(\|f\|_{\infty }\lesssim 1\). In addition, \(|\beta _{j,k}|=| \langle f, \psi _{j,k}\rangle |\lesssim 1\) by the Cauchy–Schwarz inequality and the orthonormality of the wavelet functions. Hence we have the following conclusion:

$$\begin{aligned} \mathbb{E} \bigl[ \vert \widehat{\beta }_{j,k}- \beta _{j,k} \vert ^{p} \bigr] \lesssim \mathbb{E} \Biggl[ \Biggl\vert \frac{1}{n}\sum_{i=1}^{n} \frac{ \mu \psi _{j,k}(Y_{i})}{\omega (Y_{i})}-\beta _{j,k} \Biggr\vert ^{p} \Biggr]+ \mathbb{E} \biggl[ \biggl\vert \frac{1}{\mu }- \frac{1}{\widehat{\mu }_{n}} \biggr\vert ^{p} \biggr]. \end{aligned}$$
(10)

Then we need to estimate \(T_{1}:=\mathbb{E} [ \vert \frac{1}{n} \sum_{i=1}^{n} \frac{\mu \psi _{j,k}(Y_{i})}{\omega (Y_{i})}- \beta _{j,k} \vert ^{p} ]\) and \(T_{2}:=\mathbb{E} [ \vert \frac{1}{ \mu }-\frac{1}{\widehat{\mu }_{n}} \vert ^{p} ]\).

• An upper bound for \(T_{1}\). Taking \(\eta _{i}:=\frac{\mu \psi _{j,k}(Y_{i})}{\omega (Y_{i})}-\beta _{j,k}\), we get

$$ T_{1}=\mathbb{E} \Biggl[ \Biggl\vert \frac{1}{n}\sum _{i=1}^{n}\eta _{i} \Biggr\vert ^{p} \Biggr]= \biggl(\frac{1}{n} \biggr)^{p}\mathbb{E} \Biggl[ \Biggl\vert \sum _{i=1}^{n}\eta _{i} \Biggr\vert ^{p} \Biggr]. $$

Note that ψ is a function of bounded variation (see Liu and Xu [12]), so we can write \(\psi =\widetilde{\psi }-\overline{ \psi }\), where ψ̃ and ψ̅ are bounded nonnegative nondecreasing functions. Define

$$ \widetilde{\eta }_{i}:=\frac{\mu \widetilde{\psi }_{j,k}(Y_{i})}{ \omega (Y_{i})}-\widetilde{\beta }_{j,k}, \qquad \overline{\eta }_{i}:=\frac{\mu \overline{\psi }_{j,k}(Y_{i})}{\omega (Y_{i})}- \overline{\beta }_{j,k} $$

with \(\widetilde{\beta }_{j,k}:=\langle f, \widetilde{\psi }_{j,k} \rangle \) and \(\overline{\beta }_{j,k}:=\langle f, \overline{\psi } _{j,k}\rangle \). Then \(\eta _{i}=\widetilde{\eta }_{i}-\overline{ \eta }_{i}\), \(\beta _{j,k}=\widetilde{\beta }_{j,k}-\overline{\beta } _{j,k}\), and

$$\begin{aligned} T_{1}= \biggl(\frac{1}{n} \biggr)^{p}\mathbb{E} \Biggl[ \Biggl\vert \sum _{i=1}^{n} (\widetilde{\eta }_{i}- \overline{\eta }_{i} ) \Biggr\vert ^{p} \Biggr]\lesssim \biggl(\frac{1}{n} \biggr)^{p} \Biggl\{ \mathbb{E} \Biggl[ \Biggl\vert \sum_{i=1}^{n}\widetilde{ \eta }_{i} \Biggr\vert ^{p} \Biggr]+ \mathbb{E} \Biggl[ \Biggl\vert \sum_{i=1}^{n}\overline{ \eta }_{i} \Biggr\vert ^{p} \Biggr] \Biggr\} . \end{aligned}$$
(11)

Similar arguments as in Lemma 2.1 show that \(\mathbb{E}[ \widetilde{\eta }_{i}]=0\). The function \(\frac{\widetilde{\psi }_{j,k}(y)}{ \omega (y)}\) is nondecreasing by the monotonicity of \(\widetilde{\psi }(y)\) and \(\omega (y)\). Furthermore, we get that \(\{\widetilde{\eta }_{i}, i=1, 2, \ldots , n\}\) is negatively associated by Lemma 1.1. On the other hand, it follows from (1) and \(\omega (y)\sim 1\) that

$$\begin{aligned} \mathbb{E} \bigl[ \vert \widetilde{\eta }_{i} \vert ^{p} \bigr]\lesssim \mathbb{E} \biggl[ \biggl\vert \frac{\mu \widetilde{\psi }_{j,k}(Y_{i})}{ \omega (Y_{i})} \biggr\vert ^{p} \biggr]\lesssim \int _{[0,1]} \bigl\vert \widetilde{\psi }_{j,k}(y) \bigr\vert ^{p}f(y)\,dy\lesssim 2^{j(p/2-1)}. \end{aligned}$$
(12)

In particular, \(\mathbb{E} [|\widetilde{\eta }_{i}|^{2} ] \lesssim 1\). Recall Rosenthal’s inequality [12]: if \(Y_{1}, Y_{2}, \ldots , Y_{n}\) are negatively associated random variables such that \(\mathbb{E}[Y_{i}]=0\) and \(\mathbb{E}[|Y_{i}|^{p}]< \infty \), then

$$ \mathbb{E} \Biggl[ \Biggl\vert \sum_{i=1}^{n}Y_{i} \Biggr\vert ^{p} \Biggr] \lesssim \textstyle\begin{cases} \sum_{i=1}^{n}\mathbb{E} [ \vert Y_{i} \vert ^{p} ]+ (\sum_{i=1}^{n}\mathbb{E} [ \vert Y_{i} \vert ^{2} ] )^{{p}/{2}}, & \text{$p>2$;} \\ (\sum_{i=1}^{n}\mathbb{E} [ \vert Y_{i} \vert ^{2} ] ) ^{{p}/{2}}, & \text{$1\leq p\leq 2$.} \end{cases} $$

From this we clearly have

$$ \mathbb{E} \Biggl[ \Biggl\vert \sum_{i=1}^{n} \widetilde{\eta }_{i} \Biggr\vert ^{p} \Biggr]\lesssim \textstyle\begin{cases} n2^{j(p/2-1)}+n^{p/2}, & \text{$p>2$;} \\ n^{p/2}, & \text{$1\leq p\leq 2$.} \end{cases} $$

This, together with \(2^{j}\leq n\), shows that \(\mathbb{E} [ \vert \sum_{i=1}^{n}\widetilde{\eta }_{i} \vert ^{p} ]\lesssim n ^{p/2}\). Similarly, \(\mathbb{E} [ \vert \sum_{i=1}^{n}\overline{ \eta }_{i} \vert ^{p} ]\lesssim n^{p/2}\). Combining these with (11), we get that

$$\begin{aligned} T_{1}\lesssim \biggl(\frac{1}{n} \biggr)^{p} \Biggl\{ \mathbb{E} \Biggl[ \Biggl\vert \sum _{i=1}^{n}\widetilde{\eta }_{i} \Biggr\vert ^{p} \Biggr]+\mathbb{E} \Biggl[ \Biggl\vert \sum _{i=1}^{n}\overline{\eta }_{i} \Biggr\vert ^{p} \Biggr] \Biggr\} \lesssim n^{-\frac{p}{2}}. \end{aligned}$$
(13)
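Rosenthal's inequality recalled above can be sanity-checked exactly in a simple special case (an illustration only): for independent Rademacher signs, which are in particular negatively associated, the fourth moment of the sum is exactly \(3n^{2}-2n\), matching the \(p>2\) bound with \(\mathbb{E}[|Y_{i}|^{4}]=\mathbb{E}[|Y_{i}|^{2}]=1\).

```python
from itertools import product
from fractions import Fraction

def fourth_moment(n):
    """Exact E|Y_1 + ... + Y_n|^4 for iid Rademacher signs, by enumeration."""
    total = Fraction(0)
    for signs in product((-1, 1), repeat=n):
        total += Fraction(sum(signs) ** 4)
    return total / 2 ** n

# Verify the closed form 3n^2 - 2n, which is <= C (n + n^{4/2 ... }),
# i.e. <= C (sum E|Y_i|^4 + (sum E|Y_i|^2)^2) with C = 3, as Rosenthal's
# inequality for p = 4 predicts.
for n in range(1, 9):
    m4 = fourth_moment(n)
    assert m4 == 3 * n * n - 2 * n
    print(n, m4)
```

The dominant \(n^{2}=n^{p/2}\) term is exactly the behavior used in the proof to conclude \(\mathbb{E} [ \vert \sum_{i=1}^{n}\widetilde{\eta }_{i} \vert ^{p} ]\lesssim n^{p/2}\).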

• An upper bound for \(T_{2}\). It is easy to see from the definition of \(\widehat{\mu }_{n}\) that

$$\begin{aligned} T_{2}=\mathbb{E} \biggl[ \biggl\vert \frac{1}{\mu }-\frac{1}{\widehat{\mu } _{n}} \biggr\vert ^{p} \biggr]= \biggl(\frac{1}{n} \biggr)^{p}\mathbb{E} \Biggl[ \Biggl\vert \sum_{i=1}^{n} \biggl( \frac{1}{\omega (Y_{i})}-\frac{1}{ \mu } \biggr) \Biggr\vert ^{p} \Biggr]. \end{aligned}$$
(14)

Defining \(\xi _{i}:=\frac{1}{\omega (Y_{i})}-\frac{1}{\mu }\), we obtain that \(\mathbb{E}[\xi _{i}]=0\) and \(\mathbb{E}[|\xi _{i}|^{p}]\lesssim 1\) by Lemma 2.1 and \(\omega (y)\sim 1\). In addition, by the monotonicity of \(\omega (y)\) and Lemma 1.1 we know that \(\xi _{1}, \xi _{2}, \ldots , \xi _{n}\) are also negatively associated. Then using Rosenthal’s inequality, we get

$$ \mathbb{E} \Biggl[ \Biggl\vert \sum_{i=1}^{n} \xi _{i} \Biggr\vert ^{p} \Biggr] \lesssim \textstyle\begin{cases} n+n^{p/2}, & \text{$p>2$;} \\ n^{p/2}, & \text{$1\leq p\leq 2$.} \end{cases} $$

Hence

$$\begin{aligned} T_{2}= \biggl(\frac{1}{n} \biggr)^{p}\mathbb{E} \Biggl[ \Biggl\vert \sum _{i=1}^{n}\xi _{i} \Biggr\vert ^{p} \Biggr] \lesssim n^{-\frac{p}{2}}. \end{aligned}$$
(15)

Finally, by (10), (13), and (15) we have

$$ \mathbb{E} \bigl[ \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert ^{p} \bigr]\lesssim n^{-\frac{p}{2}}. $$

This ends the proof. □

Lemma 2.3

Let \(f\in B_{r,q}^{s}\) \((1\leq r,q<+\infty , s>1/r)\) and \(\widehat{\beta }_{j,k}\) be defined by (7). If \(\omega (y)\) is a nonincreasing function, \(\omega (y)\sim 1\), and \(2^{j}\leq \frac{n}{ \ln n}\), then for each \(\lambda >0\), there exists a constant \(\kappa >1\) such that

$$ \mathbb{P} \bigl\{ \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \geq \kappa t_{n} \bigr\} \lesssim 2^{-\lambda j}. $$

Proof

By the same arguments as for (10) we can obtain that

$$\begin{aligned} \mathbb{P} \bigl\{ \vert \widehat{\beta }_{j,k}- \beta _{j,k} \vert \geq \kappa t_{n} \bigr\} \leq{}& \mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n} \sum _{i=1}^{n} \biggl(\frac{1}{\omega (Y_{i})}- \frac{1}{\mu } \biggr) \Biggr\vert \geq \frac{\kappa t_{n}}{2} \Biggr\} \\ &{}+\mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n}\sum _{i=1}^{n} \biggl(\frac{ \mu \psi _{j,k}(Y_{i})}{\omega (Y_{i})}-\beta _{j,k} \biggr) \Biggr\vert \geq \frac{\kappa t_{n}}{2} \Biggr\} . \end{aligned}$$
(16)

To estimate \(\mathbb{P} \{ \vert \frac{1}{n}\sum_{i=1} ^{n} (\frac{1}{\omega (Y_{i})}-\frac{1}{\mu } ) \vert \geq \frac{\kappa t_{n}}{2} \}\), we also define \(\xi _{i}:=\frac{1}{ \omega (Y_{i})}-\frac{1}{\mu }\). Then Lemma 2.1 implies that \(\mathbb{E}[\xi _{i}]=0\). Moreover, \(|\xi _{i}|\lesssim 1\) and \(\mathbb{E}[|\xi _{i}|^{2}]\lesssim 1\) thanks to \(\omega (y)\sim 1\). On the other hand, because of the monotonicity of \(\omega (y)\) and Lemma 1.1, \(\xi _{1}, \xi _{2}, \ldots , \xi _{n}\) are also negatively associated.

Recall Bernstein’s inequality [12]: If \(Y_{1}, Y_{2}, \ldots , Y_{n}\) are negatively associated random variables such that \(\mathbb{E}[Y_{i}]=0\), \(|Y_{i}|\leq M<\infty \), and \(\mathbb{E}[|Y _{i}|^{2}]=\sigma ^{2}\), then for each \(\varepsilon >0\),

$$ \mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n}\sum _{i=1}^{n}Y_{i} \Biggr\vert \geq \varepsilon \Biggr\} \lesssim \exp \biggl(-\frac{n\varepsilon ^{2}}{2(\sigma ^{2}+\varepsilon M/3)} \biggr). $$

Therefore, by the previous arguments for \(\xi _{i}\) and \(t_{n}=\sqrt{\frac{ \ln n}{n}}\), we derive

$$ \mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n}\sum _{i=1}^{n} \biggl(\frac{1}{ \omega (Y_{i})}- \frac{1}{\mu } \biggr) \Biggr\vert \geq \frac{\kappa t_{n}}{2} \Biggr\} \lesssim \exp \biggl(-\frac{(\ln n) \kappa ^{2}/4}{2(\sigma ^{2}+\kappa /6)} \biggr). $$

Then there exists \(\kappa >1\) such that \(\exp (-\frac{(\ln n) \kappa ^{2}/4}{2(\sigma ^{2}+\kappa /6)} )\lesssim 2^{-\lambda j}\) with fixed \(\lambda >0\). Hence

$$\begin{aligned} \mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n}\sum _{i=1}^{n} \biggl(\frac{1}{ \omega (Y_{i})}- \frac{1}{\mu } \biggr) \Biggr\vert \geq \frac{\kappa t_{n}}{2} \Biggr\} \lesssim 2^{-\lambda j}. \end{aligned}$$
(17)
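The Bernstein-type inequality recalled above can also be checked numerically in a simple iid special case (an illustration only; the uniform distribution, ε, and n below are assumed example values):

```python
import math
import random

# Y_i uniform on [-1, 1]: mean zero, |Y_i| <= M = 1, variance sigma^2 = 1/3.
n, eps, trials = 200, 0.15, 20_000
M, sigma2 = 1.0, 1.0 / 3.0
rng = random.Random(2)

# Monte-Carlo estimate of P(|mean of Y_i| >= eps).
hits = 0
for _ in range(trials):
    s = sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n
    if abs(s) >= eps:
        hits += 1
emp = hits / trials

# Bernstein bound exp(-n eps^2 / (2 (sigma^2 + eps M / 3))).
bound = math.exp(-n * eps * eps / (2.0 * (sigma2 + eps * M / 3.0)))
print(emp, bound)
```

The empirical tail probability falls well below the Bernstein bound, which is the exponential decay the proof exploits when turning \(\exp (-\frac{(\ln n)\kappa ^{2}/4}{2(\sigma ^{2}+\kappa /6)} )\) into \(2^{-\lambda j}\).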

Next, we estimate \(\mathbb{P} \{ \vert \frac{1}{n}\sum_{i=1}^{n} (\frac{\mu \psi _{j,k}(Y_{i})}{\omega (Y_{i})}-\beta _{j,k} ) \vert \geq \frac{\kappa t_{n}}{2} \}\). By the same arguments as for (11) we get

$$\begin{aligned} \mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n}\sum _{i=1}^{n}\eta _{i} \Biggr\vert \geq \frac{\kappa t_{n}}{2} \Biggr\} \leq \mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n} \sum_{i=1}^{n} \widetilde{\eta }_{i} \Biggr\vert \geq \frac{\kappa t _{n}}{4} \Biggr\} +\mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n}\sum _{i=1} ^{n}\overline{\eta }_{i} \Biggr\vert \geq \frac{\kappa t_{n}}{4} \Biggr\} . \end{aligned}$$
(18)

It is easy to see from the definition of \(\widetilde{\eta }_{i}\) and Lemma 2.1 that \(\mathbb{E}[\widetilde{\eta }_{i}]=0\). Moreover, \(\mathbb{E}[|\widetilde{\eta }_{i}|^{2}]\lesssim 1\) by (12) with \(p=2\). Using \(\omega (y)\sim 1\), we get \(\vert \frac{\mu \widetilde{\psi }_{j,k}(Y_{i})}{\omega (Y_{i})} \vert \lesssim 2^{j/2}\) and \(|\widetilde{\eta }_{i}|\leq \vert \frac{\mu \widetilde{\psi } _{j,k}(Y_{i})}{\omega (Y_{i})} \vert +\mathbb{E} [ \vert \frac{ \mu \widetilde{\psi }_{j,k}(Y_{i})}{\omega (Y_{i})} \vert ] \lesssim 2^{j/2}\). Then it follows from Bernstein’s inequality, \(2^{j}\leq \frac{n}{\ln n}\), and \(t_{n}=\sqrt{\frac{\ln n}{n}}\) that

$$ \mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n}\sum _{i=1}^{n} \widetilde{\eta }_{i} \Biggr\vert \geq \frac{\kappa t_{n}}{4} \Biggr\} \lesssim \exp \biggl(- \frac{n(\kappa t_{n}/4)^{2}}{2(\sigma ^{2}+ \kappa t_{n}2^{j/2}/12)} \biggr)\lesssim \exp \biggl(-\frac{(\ln n) \kappa ^{2}/16}{2(\sigma ^{2}+\kappa /12)} \biggr). $$

Clearly, we can take \(\kappa >1\) such that \(\mathbb{P} \{ \vert \frac{1}{n} \sum_{i=1}^{n}\widetilde{\eta }_{i} \vert \geq \frac{\kappa t _{n}}{4} \} \lesssim 2^{-\lambda j}\). Then similar arguments show that \(\mathbb{P} \{ \vert \frac{1}{n}\sum_{i=1}^{n}\overline{ \eta }_{i} \vert \geq \frac{\kappa t_{n}}{4} \} \lesssim 2^{- \lambda j}\). Combining those with (18), we obtain

$$\begin{aligned} \mathbb{P} \Biggl\{ \Biggl\vert \frac{1}{n}\sum _{i=1}^{n} \biggl(\frac{ \mu \psi _{j,k}(Y_{i})}{\omega (Y_{i})}- \beta _{j,k} \biggr) \Biggr\vert \geq \frac{\kappa t_{n}}{2} \Biggr\} \lesssim 2^{-\lambda j}. \end{aligned}$$
(19)

By (16), (17), and (19) we get

$$ \mathbb{P} \bigl\{ \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \geq \kappa t_{n} \bigr\} \lesssim 2^{-\lambda j}. $$

This ends the proof. □

Proof of theorem

In this section, we prove Theorem 1.

Proof of (8)

It is easy to see that

$$\begin{aligned} \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{lin}}(y)-f(y) \bigr\vert ^{p} \bigr] \lesssim \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{lin}}(y)-P _{j_{0}}f(y) \bigr\vert ^{p} \bigr]+ \bigl\vert P_{j_{0}}f(y)-f(y) \bigr\vert ^{p}. \end{aligned}$$
(20)

Then we need to estimate \(\mathbb{E} [ |\widehat{f}_{n}^{ \mathrm{lin}}(y)-P_{j_{0}}f(y) |^{p} ]\) and \(|P_{j_{0}}f(y)-f(y) |^{p}\).

By (2) and (3) we get that

$$ \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{lin}}(y)-P_{j_{0}}f(y) \bigr\vert ^{p} \bigr]=\mathbb{E} \biggl[ \biggl\vert \sum _{k\in \varLambda } (\widehat{\alpha }_{j_{0}, k}-\alpha _{j_{0}, k} )\varphi _{j _{0}, k}(y) \biggr\vert ^{p} \biggr]. $$

Using the Hölder inequality \((1/p+1/p'=1)\), we see that

$$ \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{lin}}(y)-P_{j_{0}}f(y) \bigr\vert ^{p} \bigr]\leq \mathbb{E} \biggl[ \biggl(\sum _{k\in \varLambda } \vert \widehat{\alpha }_{j_{0}, k}-\alpha _{j_{0}, k} \vert ^{p} \bigl\vert \varphi _{j_{0}, k}(y) \bigr\vert \biggr) \biggl(\sum _{k\in \varLambda } \bigl\vert \varphi _{j_{0}, k}(y) \bigr\vert \biggr)^{\frac{p}{p'}} \biggr]. $$

Then it follows from Condition θ and Lemma 2.2 that

$$\begin{aligned} \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{lin}}(y)-P_{j_{0}}f(y) \bigr\vert ^{p} \bigr] \lesssim \sum_{k\in \varLambda } \mathbb{E} \bigl[ \vert \widehat{\alpha }_{j_{0}, k}-\alpha _{j_{0}, k} \vert ^{p} \bigr] \bigl\vert \varphi _{j_{0}, k}(y) \bigr\vert 2^{\frac{j_{0}p}{2p'}}\lesssim \biggl( \frac{2^{j _{0}}}{n} \biggr)^{\frac{p}{2}}. \end{aligned}$$
(21)

This, together with \(2^{j_{0}}\sim n^{\frac{1}{2(s-1/r)+1}}\), shows that

$$\begin{aligned} \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{lin}}(y)-P_{j_{0}}f(y) \bigr\vert ^{p} \bigr] \lesssim n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(22)

Note that \(B_{r, q}^{s}(\mathbb{R})\subseteq B_{\infty , \infty }^{s-1/r}( \mathbb{R})\) in the case \(s>1/r\) and that \(B_{\infty , \infty }^{s-1/r}(\mathbb{R})\) is a Hölder space. Then by Lemma 1.2, \(f\in B_{r, q}^{s}(\mathbb{R})\), and \(2^{j_{0}} \sim n^{\frac{1}{2(s-1/r)+1}}\) we obtain that

$$\begin{aligned} \bigl\vert P_{j_{0}}f(y)-f(y) \bigr\vert ^{p}\lesssim 2^{-j_{0}(s-1/r)p} \lesssim n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(23)

Combining this with (20) and (22), we get

$$ \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{lin}}(y)-f(y) \bigr\vert ^{p} \bigr] \lesssim n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. $$

 □

Proof of (9)

Using the definitions of \(\widehat{f} _{n}^{\mathrm{lin}}(y)\) and \(\widehat{f}_{n}^{\mathrm{non}}(y)\), we get that

$$\begin{aligned} \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{non}}(y)-f(y) \bigr\vert ^{p} \bigr] \lesssim W_{1}+W_{2}+G, \end{aligned}$$
(24)

where \(W_{1}:=\mathbb{E} [ |\widehat{f}_{n}^{\mathrm{lin}}(y)-P _{j_{0}}f(y) |^{p} ]\), \(W_{2}:= |P_{j_{1}+1}f(y)-f(y) |^{p}\), and

$$ G:=\mathbb{E} \Biggl[ \Biggl\vert \sum_{j=j_{0}}^{j_{1}} \sum_{k\in \varLambda _{j}} (\widehat{\beta }_{j,k}I_{\{ \vert \widehat{\beta }_{j,k} \vert \geq \kappa t_{n}\}}- \beta _{j,k} )\psi _{j,k}(y) \Biggr\vert ^{p} \Biggr]. $$

It follows from (21), \(2^{j_{0}}\sim n^{\frac{1}{2m+1}}\) \((m>s)\), and \(s>1/r\) that

$$\begin{aligned} W_{1}\lesssim \biggl(\frac{2^{j_{0}}}{n} \biggr)^{\frac{p}{2}}\sim n ^{-\frac{mp}{2m+1}}< n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(25)

On the other hand, by the same arguments as for (23), we can obtain that \(W_{2}\lesssim 2^{-j_{1}(s-1/r)p}\). This with the choice of \(2^{j_{1}}\sim \frac{n}{\ln n}\) shows

$$\begin{aligned} W_{2}\lesssim 2^{-j_{1}(s-1/r)p}\sim \biggl( \frac{\ln n}{n} \biggr) ^{(s-1/r)p}< \biggl(\frac{\ln n}{n} \biggr)^{ \frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(26)

Then the remaining task is to estimate G.

Using the classical technique in [6], we get that

$$\begin{aligned} G\lesssim (\ln n)^{p-1}(G_{1}+G_{2}+G_{3}), \end{aligned}$$
(27)

where

$$\begin{aligned} &G_{1}:=\mathbb{E} \Biggl[\sum_{j=j_{0}}^{j_{1}} \biggl(\sum_{k\in \varLambda _{j}} \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert I _{\{ \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p} \Biggr], \\ &G_{2}:=\mathbb{E} \Biggl[\sum_{j=j_{0}}^{j_{1}} \biggl(\sum_{k\in \varLambda _{j}} \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert I _{\{ \vert \beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p} \Biggr], \\ &G_{3}:=\sum_{j=j_{0}}^{j_{1}} \biggl(\sum_{k\in \varLambda _{j}} \vert \beta _{j,k} \vert I_{\{ \vert \beta _{j,k} \vert \leq 2\kappa t_{n}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p}. \end{aligned}$$

• An upper bound for \(G_{1}\). By the definition of \(\widehat{\beta }_{j,k}\), \(\omega (y)\sim 1\), and Lemma 2.1, \(|\widehat{\beta }_{j,k}|\lesssim 2^{j/2}\) and \(|\widehat{\beta }_{j,k}- \beta _{j,k}|\lesssim 2^{j/2}\). Furthermore, we obtain that

$$ G_{1}\lesssim \mathbb{E} \Biggl[\sum_{j=j_{0}}^{j_{1}} \biggl(\sum_{k\in \varLambda _{j}} 2^{j/2}I_{\{ \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr) ^{p} \Biggr]. $$

On the other hand, it follows from the Hölder inequality and Condition θ that

$$\begin{aligned} &\biggl(\sum_{k\in \varLambda _{j}} I_{\{ \vert \widehat{\beta }_{j,k}- \beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr) ^{p} \\ &\quad \lesssim \biggl( \sum_{k\in \varLambda _{j}} I_{\{ \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr) \biggl(\sum _{k\in \varLambda _{j}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{\frac{p}{p'}} \\ &\quad \lesssim \biggl(\sum_{k\in \varLambda _{j}} I_{\{ \vert \widehat{\beta } _{j,k}-\beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)2^{\frac{jp}{2p'}}. \end{aligned}$$

Then using Condition θ and Lemma 2.3, we derive that

$$\begin{aligned} G_{1} &\lesssim \mathbb{E} \Biggl[\sum _{j=j_{0}}^{j_{1}}2^{jp/2} \biggl(\sum _{k\in \varLambda _{j}} I_{\{ \vert \widehat{\beta }_{j,k}- \beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)2^{ \frac{jp}{2p'}} \Biggr] \\ &\lesssim \sum_{j=j_{0}}^{j_{1}}2^{\frac{j}{2}(p+\frac{p}{p'})} \sum_{k\in \varLambda _{j}} \bigl\vert \psi _{j,k}(y) \bigr\vert \mathbb{E} [I _{\{ \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} ] \lesssim \sum _{j=j_{0}}^{j_{1}}2^{j(p-\lambda )}. \end{aligned}$$
(28)

Clearly, by Lemma 2.3 we can fix \(\lambda >p+mp\) and choose a corresponding \(\kappa >1\). Then \(G_{1}\lesssim \sum_{j=j_{0}}^{j_{1}}2^{j(p-\lambda )}\lesssim \sum_{j=j_{0}}^{j_{1}}2^{-jmp}\). This with the choice of \(2^{j_{0}}\sim n^{\frac{1}{2m+1}}\) \((m>s)\) shows that

$$\begin{aligned} G_{1}\lesssim \sum_{j=j_{0}}^{j_{1}}2^{-jmp} \lesssim 2^{-j_{0}mp} \sim n^{-\frac{mp}{2m+1}}< n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(29)

• An upper bound for \(G_{2}\). Taking \(2^{j_{*}}\sim n^{ \frac{1}{2(s-1/r)+1}}\), we get that \(2^{j_{0}}<2^{j_{*}}<2^{j_{1}}\) for sufficiently large n. It is easy to see that

$$\begin{aligned} G_{21} &:=\mathbb{E} \Biggl[\sum_{j=j_{0}}^{j_{*}} \biggl(\sum_{k\in \varLambda _{j}} \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert I _{\{ \vert \beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p} \Biggr] \\ &\lesssim \sum_{j=j_{0}}^{j_{*}}\mathbb{E} \biggl[ \biggl(\sum_{k\in \varLambda _{j}} \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p} \biggr]. \end{aligned}$$

Similarly to the arguments of (21), we get

$$\begin{aligned} G_{21}\lesssim \sum_{j=j_{0}}^{j_{*}} \biggl(\frac{2^{j}}{n} \biggr) ^{\frac{p}{2}} \lesssim \biggl( \frac{2^{j_{*}}}{n} \biggr)^{ \frac{p}{2}}\sim n^{-\frac{(s-1/r)p}{2(s-1/r)+1}} \end{aligned}$$
(30)

by Lemma 2.2 and \(2^{j_{*}}\sim n^{\frac{1}{2(s-1/r)+1}}\).

On the other hand,

$$\begin{aligned} G_{22} &:=\mathbb{E} \Biggl[\sum_{j=j_{*}+1}^{j_{1}} \biggl(\sum_{k\in \varLambda _{j}} \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert I _{\{ \vert \beta _{j,k} \vert \geq \frac{\kappa t_{n}}{2}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p} \Biggr] \\ &\lesssim \mathbb{E} \Biggl[\sum_{j=j_{*}+1}^{j_{1}} \biggl(\sum_{k\in \varLambda _{j}} \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert \biggl\vert \frac{\beta _{j,k}}{\kappa t_{n}} \biggr\vert \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p} \Biggr]. \end{aligned}$$

Using the Hölder inequality and Lemma 2.2, we have

$$\begin{aligned} G_{22} &\lesssim \mathbb{E} \Biggl[\sum _{j=j_{*}+1}^{j_{1}} \biggl(\frac{1}{t_{n}} \biggr)^{p} \biggl(\sum_{k\in \varLambda _{j}} \vert \widehat{\beta }_{j,k}-\beta _{j,k} \vert ^{p} \bigl\vert \beta _{j,k}\psi _{j,k}(y) \bigr\vert \biggr) \biggl(\sum_{k\in \varLambda _{j}} \bigl\vert \beta _{j,k}\psi _{j,k}(y) \bigr\vert \biggr) ^{\frac{p}{p'}} \Biggr] \\ &\lesssim \sum_{j=j_{*}+1}^{j_{1}} \biggl( \frac{1}{t_{n}} \biggr) ^{p}n^{-\frac{p}{2}} \biggl(\sum _{k\in \varLambda _{j}} \bigl\vert \beta _{j,k}\psi _{j,k}(y) \bigr\vert \biggr)^{p}. \end{aligned}$$

When \(s>1/r\), we have \(B_{r,q}^{s}(\mathbb{R})\subseteq B_{\infty , \infty } ^{s-1/r}(\mathbb{R})\), and \(B_{\infty , \infty }^{s-1/r}( \mathbb{R})\) is a Hölder space. Then, as in [11], we can derive that \(\sum_{k\in \varLambda _{j}} |\beta _{j,k}\psi _{j,k}(y) | \lesssim 2^{-j(s-1/r)}\). Hence it follows from the choice of \(2^{j_{*}}\) that

$$\begin{aligned} G_{22}\lesssim \sum_{j=j_{*}+1}^{j_{1}}2^{-j(s-1/r)p} \lesssim 2^{-j_{*}(s-1/r)p}\sim n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(31)

Therefore we have

$$\begin{aligned} G_{2}=G_{21}+G_{22} \lesssim n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(32)
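For completeness, the coefficient bound \(\sum_{k\in \varLambda _{j}} |\beta _{j,k}\psi _{j,k}(y) |\lesssim 2^{-j(s-1/r)}\) used for \(G_{22}\) can be sketched as follows (constants suppressed; ψ compactly supported as in the Daubechies setting [5]):

$$\begin{aligned} f\in B_{\infty ,\infty }^{s-1/r}(\mathbb{R}) \quad \Longrightarrow \quad \vert \beta _{j,k} \vert = \bigl\vert \langle f,\psi _{j,k}\rangle \bigr\vert \lesssim 2^{-j(s-1/r+1/2)}. \end{aligned}$$

Since ψ has compact support, for each fixed y only finitely many indices k (uniformly in j) satisfy \(\psi _{j,k}(y)\neq 0\), and \(|\psi _{j,k}(y)|\lesssim 2^{j/2}\). Hence

$$\begin{aligned} \sum_{k\in \varLambda _{j}} \vert \beta _{j,k}\psi _{j,k}(y) \vert \lesssim 2^{-j(s-1/r+1/2)}\cdot 2^{j/2}=2^{-j(s-1/r)}. \end{aligned}$$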

• An upper bound for \(G_{3}\). Clearly, we can obtain that

$$\begin{aligned} G_{31} &:=\sum_{j=j_{0}}^{j_{*}} \biggl(\sum_{k\in \varLambda _{j}} \vert \beta _{j,k} \vert I_{\{ \vert \beta _{j,k} \vert \leq 2\kappa t_{n}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p}\lesssim \sum_{j=j_{0}}^{j_{*}} \biggl(\sum_{k\in \varLambda _{j}} t _{n} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p} \\ &\lesssim \sum_{j=j_{0}}^{j_{*}} \biggl( \frac{\ln n}{n} \biggr) ^{\frac{p}{2}}2^{jp/2}\lesssim \biggl( \frac{\ln n}{n} \biggr)^{ \frac{p}{2}}2^{j_{*}p/2}\lesssim (\ln n)^{p/2}n^{- \frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(33)

In addition, it follows from the Hölder inequality \((1/r+1/r'=1)\), Condition θ, and Lemma 1.2 that

$$\begin{aligned} G_{32} &:=\sum_{j=j_{*}+1}^{j_{1}} \biggl(\sum_{k\in \varLambda _{j}} \vert \beta _{j,k} \vert I_{\{ \vert \beta _{j,k} \vert \leq 2\kappa t_{n}\}} \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p}\lesssim \sum_{j=j_{*}+1}^{j_{1}} \biggl(\sum_{k\in \varLambda _{j}} \vert \beta _{j,k} \vert \bigl\vert \psi _{j,k}(y) \bigr\vert \biggr)^{p} \\ &\lesssim \sum_{j=j_{*}+1}^{j_{1}} \biggl(\sum _{k\in \varLambda _{j}} \vert \beta _{j,k} \vert ^{r} \biggr)^{ \frac{p}{r}} \biggl(\sum _{k\in \varLambda _{j}} \bigl\vert \psi _{j,k}(y) \bigr\vert ^{r'} \biggr)^{\frac{p}{r'}}\lesssim \sum _{j=j_{*}+1}^{j _{1}}2^{-j(s-1/r)p}. \end{aligned}$$

This with \(2^{j_{*}}\sim n^{\frac{1}{2(s-1/r)+1}}\) shows that

$$\begin{aligned} G_{32}\lesssim \sum_{j=j_{*}+1}^{j_{1}}2^{-j(s-1/r)p} \lesssim 2^{-j_{*}(s-1/r)p}\sim n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(34)

Therefore

$$\begin{aligned} G_{3}=G_{31}+G_{32} \lesssim (\ln n)^{p/2}n^{- \frac{(s-1/r)p}{2(s-1/r)+1}}. \end{aligned}$$
(35)
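The last step in the display bounding \(G_{32}\) combines the Hölder inequality with the Besov norm of f; with constants suppressed, it reads

$$\begin{aligned} \biggl(\sum_{k\in \varLambda _{j}} \vert \beta _{j,k} \vert ^{r} \biggr)^{\frac{1}{r}}\lesssim 2^{-j(s+1/2-1/r)}, \qquad \biggl(\sum_{k\in \varLambda _{j}} \bigl\vert \psi _{j,k}(y) \bigr\vert ^{r'} \biggr)^{\frac{1}{r'}}\lesssim 2^{\frac{j}{2}}, \end{aligned}$$

the first by \(f\in B_{r,q}^{s}(\mathbb{R})\) and the second by the compact support of ψ, so their product is \(\lesssim 2^{-j(s-1/r)}\), and raising it to the power p gives the bound \(2^{-j(s-1/r)p}\).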

By (27), (29), (32), and (35) we have \(G\lesssim (\ln n)^{\frac{3p}{2}}n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}\). Then it is easy to see from (24), (25), and (26) that

$$ \mathbb{E} \bigl[ \bigl\vert \widehat{f}_{n}^{\mathrm{non}}(y)-f(y) \bigr\vert ^{p} \bigr] \lesssim (\ln n)^{\frac{3p}{2}}n^{-\frac{(s-1/r)p}{2(s-1/r)+1}}. $$

This ends the proof. □

References

  1. Alam, K., Saxena, K.M.L.: Positive dependence in multivariate distributions. Commun. Stat., Theory Methods 10, 1183–1196 (1981)
  2. Cai, T.T.: Rates of convergence and adaptation over Besov spaces under pointwise risk. Stat. Sin. 13, 881–902 (2003)
  3. Chesneau, C.: Wavelet block thresholding for density estimation in the presence of bias. J. Korean Stat. Soc. 39, 43–53 (2010)
  4. Cox, D.R.: Some Sampling Problems in Technology. Wiley, New York (1969)
  5. Daubechies, I.: Ten Lectures on Wavelets. SIAM, Philadelphia (1992)
  6. Donoho, D.L., Johnstone, I.M., Kerkyacharian, G., Picard, D.: Density estimation by wavelet thresholding. Ann. Stat. 24, 508–539 (1996)
  7. Härdle, W., Kerkyacharian, G., Picard, D., Tsybakov, A.: Wavelets, Approximation and Statistical Applications. Springer, New York (1997)
  8. Heckman, J.: Selection Bias and Self-Selection. Macmillan Press, New York (1985)
  9. Joag-Dev, K., Proschan, F.: Negative association of random variables with applications. Ann. Stat. 11, 286–295 (1983)
  10. Kou, J.K., Guo, H.J.: Wavelet density estimation for mixing and size-biased data. J. Inequal. Appl. 2018, 189 (2018)
  11. Liu, Y.M., Wu, C.: Point-wise estimation for anisotropic densities. J. Multivar. Anal. 171, 112–125 (2019)
  12. Liu, Y.M., Xu, J.L.: Wavelet density estimation for negatively associated stratified size-biased sample. J. Nonparametr. Stat. 26, 537–554 (2014)
  13. Ramírez, P., Vidakovic, B.: Wavelet density estimation for stratified size-biased sample. J. Stat. Plan. Inference 140, 419–432 (2010)
  14. Rebelles, G.: Pointwise adaptive estimation of a multivariate density under independence hypothesis. Bernoulli 21, 1984–2023 (2015)
  15. Shirazi, E., Doosti, H.: Multivariate wavelet-based density estimation with size-biased data. Stat. Methodol. 27, 12–19 (2015)


Acknowledgements

The authors would like to thank the referees and editor for their important comments and suggestions.

Funding

This paper is supported by Guangxi Natural Science Foundation (No. 2018GXNSFBA281076), Guangxi Science and Technology Project (Nos. Guike AD18281058 and Guike AD18281019), the Guangxi Young Teachers Basic Ability Improvement Project (Nos. 2018KY0212 and 2019KY0218), and Guangxi Colleges and Universities Key Laboratory of Data Analysis and Computation.

Author information

Both authors contributed equally to the writing of this paper. Both authors read and approved the final manuscript.

Correspondence to Junke Kou.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


Keywords

  • Pointwise density estimation
  • \(l^{p}\) risk
  • Negatively associated
  • Wavelets