# Wavelet estimations for densities and their derivatives with Fourier oscillating noises

## Abstract

By developing the classical kernel method, Delaigle and Meister provided a nice estimation of a density function with certain Fourier-oscillating noises over a Sobolev ball ${W}_{2}^{s}\left(L\right)$ and ${L}^{2}$ risk (Delaigle and Meister in Stat. Sin. 21:1065-1092, 2011). The current paper extends their theorem to a Besov ball ${B}_{r,q}^{s}\left(L\right)$ and ${L}^{p}$ risk with $p,q,r\in \left[1,\mathrm{\infty }\right]$ by using wavelet methods. Motivated by the work of Delaigle and Meister, we first establish a linear wavelet estimation for densities in ${B}_{r,q}^{s}\left(L\right)$ over ${L}^{p}$ risk; our result reduces to their theorem when $p=q=r=2$. Because the linear wavelet estimator is not adaptive, a nonlinear wavelet estimator is then provided, whose convergence rate turns out to be better than the linear one for $r\le p$. In addition, our conclusions contain estimations for density derivatives as well.

## 1 Introduction and preliminary

One of the fundamental deconvolution problems is to estimate a density function ${f}_{X}$ of a random variable X, when the available data ${W}_{1},{W}_{2},\dots ,{W}_{n}$ are independent and identically distributed (i.i.d.) with

${W}_{j}={X}_{j}+{\delta }_{j}\phantom{\rule{1em}{0ex}}\left(j=1,2,\dots ,n\right).$

We assume that all ${X}_{j}$ and ${\delta }_{j}$ are independent and the density function ${f}_{\delta }$ of the noise δ is known.
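As a concrete illustration of this additive measurement-error model, here is a minimal simulation sketch. The Beta(2,2) density for X and the uniform noise are purely illustrative choices of ours (uniform noise is a Fourier-oscillating example discussed below), not specified by the paper:

```python
import random

def simulate_sample(n, seed=0):
    """Simulate the model W_j = X_j + delta_j; only the W_j are observed."""
    rng = random.Random(seed)
    xs = [rng.betavariate(2, 2) for _ in range(n)]        # X_j, supported on [0, 1]
    deltas = [rng.uniform(-0.5, 0.5) for _ in range(n)]   # delta_j ~ U[-1/2, 1/2]
    return [x + d for x, d in zip(xs, deltas)]            # observed data W_j

sample = simulate_sample(1000)
```

The deconvolution task is to recover the density of the X_j from the sample of W_j alone, using knowledge of the noise density.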

Let the Fourier transform ${f}^{ft}$ of $f\in L\left(\mathbb{R}\right)$ be defined by ${f}^{ft}\left(t\right)={\int }_{\mathbb{R}}f\left(x\right){e}^{itx}\phantom{\rule{0.2em}{0ex}}dx$ in this paper. When ${f}_{\delta }^{ft}$ satisfies

$|{f}_{\delta }^{ft}\left(t\right)|\ge c{\left(1+|t|\right)}^{-\alpha }$
(1)

with $c>0$ and $\alpha >0$, many optimal estimators of ${f}_{X}$ exist. However, many noise densities ${f}_{\delta }$ have zeros in the Fourier transform domain, i.e., the inequality (1) does not hold. For example, Sun et al. described an experiment where data on the velocity of halo stars in the Milky Way are collected and the measurement errors are assumed to be uniformly distributed. The classical kernel method provides a slower convergence rate in that case. Delaigle and Meister developed a new method for a density ${f}_{\delta }$ with

$|{f}_{\delta }^{ft}\left(t\right)|\ge c{|sin\left(\frac{\pi t}{\lambda }\right)|}^{v}{\left(1+|t|\right)}^{-\alpha }.$
(2)

Here $c,\lambda ,\alpha >0$ and $v\in \mathbb{N}$ (the set of non-negative integers). Such noises are called Fourier-oscillating. Clearly, (2) allows ${f}_{\delta }^{ft}$ to have zeros for $v\ge 1$. When $v=0$, (2) reduces to (1) (the non-zero case).
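A standard example satisfying (2) but not (1) is uniform noise. For $\delta \sim U\left[-1/2,1/2\right]$ one has ${f}_{\delta }^{ft}\left(t\right)=2sin\left(t/2\right)/t$, which vanishes at $t=2\pi m$, $m\ne 0$. The following sketch checks numerically that the oscillating lower bound holds with $\lambda =2\pi$, $v=1$, $\alpha =1$, $c=1$ (these particular constants are our own elementary computation, not taken from the paper):

```python
import math

def ft_uniform(t):
    """Fourier transform of the U[-1/2, 1/2] density: 2*sin(t/2)/t, = 1 at t = 0."""
    return 1.0 if t == 0 else 2.0 * math.sin(t / 2.0) / t

def oscillating_lower_bound(t, c=1.0, lam=2.0 * math.pi, v=1, alpha=1.0):
    """Right-hand side of condition (2): c * |sin(pi t / lambda)|^v * (1+|t|)^{-alpha}."""
    return c * abs(math.sin(math.pi * t / lam)) ** v * (1.0 + abs(t)) ** (-alpha)

# Condition (2) holds on a grid, even though f_delta^ft has zeros at t = 2*pi*m.
ok = all(abs(ft_uniform(t)) + 1e-12 >= oscillating_lower_bound(t)
         for t in (k / 10.0 for k in range(-500, 501)))
```

The bound holds because $2/|t|\ge {\left(1+|t|\right)}^{-1}$ for all t, while $sin\left(\pi t/2\pi \right)=sin\left(t/2\right)$ reproduces exactly the zeros of ${f}_{\delta }^{ft}$.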

Delaigle and Meister defined a kernel estimator ${\stackrel{ˆ}{f}}_{n}$ for a density ${f}_{X}$ in a Sobolev space and proved that, with EX denoting the expectation of a random variable X,

$\underset{{f}_{X}\in {W}_{2}^{s}\left(L\right)}{sup}E{\int }_{a}^{b}{|{\stackrel{ˆ}{f}}_{n}\left(x\right)-{f}_{X}\left(x\right)|}^{2}\phantom{\rule{0.2em}{0ex}}dx=O\left({n}^{-\frac{2s}{2s+2\alpha +1}}\right)$
(3)

under the assumption (2) (Theorem 4.1 in ). Here, ${W}_{2}^{s}\left(L\right)$ stands for the Sobolev ball with radius L. This convergence rate coincides with the one in the non-zero case [14, 6]. In particular, it does not depend on the parameter v.

Many papers deal with ${L}^{2}$ estimations; however, ${L}^{p}$ estimations ($1\le p\le +\mathrm{\infty }$) are important as well [5, 12]. On the other hand, Besov spaces contain many classical spaces (e.g., ${L}^{2}$ Sobolev spaces and Hölder spaces) as special cases. The current paper extends (3) from ${W}_{2}^{s}\left(L\right)$ to the Besov ball ${B}_{r,q}^{s}\left(L\right)$, and from ${L}^{2}$ to ${L}^{p}$ risk estimations. In addition, our results contain estimations for the d th derivatives ${f}_{X}^{\left(d\right)}$ of ${f}_{X}$. The next section provides a linear wavelet estimation for ${f}_{X}^{\left(d\right)}$ over a Besov ball ${B}_{r,q}^{s}\left(L\right)$ and ${L}^{p}$ risk ($p,q,r\ge 1$); it reduces to (3) when $d=0$ and $p=r=q=2$. Moreover, in the last section we show a nonlinear wavelet estimation which improves the linear one for $r\le p$.

### 1.1 Wavelet basis

The fundamental method to construct a wavelet basis comes from the concept of multiresolution analysis (MRA). It is defined as a sequence of closed subspaces $\left\{{V}_{j}\right\}$ of the square integrable function space ${L}^{2}\left(\mathbb{R}\right)$ satisfying the following properties:

1. (i)

${V}_{j}\subset {V}_{j+1}$, $j\in \mathbb{Z}$ (the integer set);

2. (ii)

$f\left(x\right)\in {V}_{j}$ if and only if $f\left(2x\right)\in {V}_{j+1}$ for each $j\in \mathbb{Z}$;

3. (iii)

$\overline{{\bigcup }_{j\in \mathbb{Z}}{V}_{j}}={L}^{2}\left(\mathbb{R}\right)$ (the space ${\bigcup }_{j\in \mathbb{Z}}{V}_{j}$ is dense in ${L}^{2}\left(\mathbb{R}\right)$);

4. (iv)

There exists $\phi \left(x\right)\in {L}^{2}\left(\mathbb{R}\right)$ (scaling function) such that ${\left\{\phi \left(x-k\right)\right\}}_{k\in \mathbb{Z}}$ forms an orthonormal basis of ${V}_{0}=\overline{span}{\left\{\phi \left(x-k\right)\right\}}_{k\in \mathbb{Z}}$.

With the standard notation ${h}_{j,k}\left(x\right):={2}^{\frac{j}{2}}h\left({2}^{j}x-k\right)$ in wavelet analysis, we can find a corresponding wavelet function ψ such that, for a fixed $j\in \mathbb{Z}$, ${\left\{{\psi }_{j,k}\right\}}_{k\in \mathbb{Z}}$ constitutes an orthonormal basis of the orthogonal complement ${W}_{j}$ of ${V}_{j}$ in ${V}_{j+1}$ . Then each $f\in {L}^{2}\left(\mathbb{R}\right)$ has an expansion in the ${L}^{2}\left(\mathbb{R}\right)$ sense,

$f=\sum _{k\in \mathbb{Z}}{\alpha }_{j,k}{\phi }_{j,k}+\sum _{l=j}^{\mathrm{\infty }}\sum _{k\in \mathbb{Z}}{\beta }_{l,k}{\psi }_{l,k},$

where ${\alpha }_{j,k}=〈f,{\phi }_{j,k}〉$, ${\beta }_{l,k}=〈f,{\psi }_{l,k}〉$.

An important family of examples is the Daubechies wavelets ${D}_{2N}\left(x\right)$, which are compactly supported in the time domain . They can be made arbitrarily smooth, at the price of increasing supports, as N gets large, although ${D}_{2N}$ has no analytic formula except for $N=1$ (the Haar case).

As usual, let ${P}_{j}$ and ${Q}_{j}$ be the orthogonal projections from ${L}^{2}\left(\mathbb{R}\right)$ to ${V}_{j}$ and ${W}_{j}$, respectively,

${P}_{j}f=\sum _{k\in \mathbb{Z}}{\alpha }_{j,k}{\phi }_{j,k},\phantom{\rule{2em}{0ex}}{Q}_{j}f=\sum _{k\in \mathbb{Z}}{\beta }_{j,k}{\psi }_{j,k}=\left({P}_{j+1}-{P}_{j}\right)f.$
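The projection identity ${Q}_{j}=\left({P}_{j+1}-{P}_{j}\right)$ can be checked concretely in the Haar case ($\phi ={1}_{\left[0,1\right)}$, $\psi ={1}_{\left[0,1/2\right)}-{1}_{\left[1/2,1\right)}$), where all inner products of a dyadic-piecewise-constant function reduce to finite sums. The levels and sample function below are our own illustrative choices:

```python
# f is represented by samples on a dyadic grid (piecewise constant at level J),
# and we verify the MRA identity P_{j+1} f = P_j f + Q_j f pointwise.
J = 6
N = 2 ** J
f = [((i / N) - 0.3) ** 2 for i in range(N)]          # samples of f on [0, 1)

def proj_V(f, j):
    """P_j f: the average of f over each dyadic block of length 2^{-j}."""
    size = 2 ** (J - j)
    out = []
    for k in range(2 ** j):
        block = f[k * size:(k + 1) * size]
        out.extend([sum(block) / size] * size)
    return out

def proj_W(f, j):
    """Q_j f = sum_k beta_{j,k} psi_{j,k}, built from Haar wavelet coefficients."""
    size = 2 ** (J - j)
    out = []
    for k in range(2 ** j):
        left = f[k * size:k * size + size // 2]
        right = f[k * size + size // 2:(k + 1) * size]
        b = (sum(left) - sum(right)) / size            # value of Q_j f on the left half
        out.extend([b] * (size // 2) + [-b] * (size // 2))
    return out

p_fine = proj_V(f, 3)                                  # P_3 f
p_split = [a + b for a, b in zip(proj_V(f, 2), proj_W(f, 2))]   # P_2 f + Q_2 f
```

Up to floating-point error, `p_fine` and `p_split` agree sample by sample.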

The following simple lemma is fundamental in our discussions. We use ${\parallel f\parallel }_{p}$ to denote the ${L}^{p}\left(\mathbb{R}\right)$ norm of $f\in {L}^{p}\left(\mathbb{R}\right)$ and ${\parallel \lambda \parallel }_{p}$ the ${l}_{p}\left(\mathbb{Z}\right)$ norm of $\lambda \in {l}_{p}\left(\mathbb{Z}\right)$, where

${l}_{p}\left(\mathbb{Z}\right):=\left\{\begin{array}{ll}\left\{\lambda =\left\{{\lambda }_{k}\right\},{\sum }_{k\in \mathbb{Z}}{|{\lambda }_{k}|}^{p}<+\mathrm{\infty }\right\},& 1\le p<\mathrm{\infty };\\ \left\{\lambda =\left\{{\lambda }_{k}\right\},{sup}_{k\in \mathbb{Z}}|{\lambda }_{k}|<+\mathrm{\infty }\right\},& p=\mathrm{\infty }.\end{array}$

By using Proposition 8.3 in , we have the following conclusion.

Lemma 1.1 Let h be a Daubechies scaling function or the corresponding wavelet. Then there exists ${c}_{2}\ge {c}_{1}>0$ such that, for $\lambda =\left\{{\lambda }_{k}\right\}\in {l}_{p}\left(\mathbb{Z}\right)$ and $1\le p\le \mathrm{\infty }$,

${c}_{1}{2}^{j\left(\frac{1}{2}-\frac{1}{p}\right)}{\parallel \lambda \parallel }_{p}\le {\parallel \sum _{k\in \mathbb{Z}}{\lambda }_{k}{h}_{j,k}\parallel }_{p}\le {c}_{2}{2}^{j\left(\frac{1}{2}-\frac{1}{p}\right)}{\parallel \lambda \parallel }_{p}.$
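In the Haar case the translates ${\phi }_{j,k}$ have disjoint supports, so the norm equivalence of Lemma 1.1 holds with equality (${c}_{1}={c}_{2}=1$); the sketch below checks this elementary fact numerically (the level j, exponent p, and random coefficients are our own choices):

```python
import math
import random

# phi_{j,k} = 2^{j/2} on [k 2^{-j}, (k+1) 2^{-j}) with disjoint supports, so
# ||sum_k lambda_k phi_{j,k}||_p^p = sum_k |lambda_k|^p * 2^{jp/2} * 2^{-j}.
rng = random.Random(1)
j, p = 4, 3.0
lam = [rng.uniform(-1, 1) for _ in range(2 ** j)]
lhs = sum((abs(l) * 2 ** (j / 2)) ** p * 2 ** (-j) for l in lam) ** (1 / p)
rhs = 2 ** (j * (0.5 - 1 / p)) * sum(abs(l) ** p for l in lam) ** (1 / p)
```

For general Daubechies functions the supports overlap and only the two-sided equivalence with constants ${c}_{1},{c}_{2}$ survives.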

### 1.2 Besov spaces

One of the advantages of wavelet bases is that they can characterize Besov spaces. To introduce those spaces (see ), we need the Sobolev spaces with integer order ${W}_{p}^{n}\left(\mathbb{R}\right):=\left\{f\in {L}^{p}\left(\mathbb{R}\right),{f}^{\left(n\right)}\in {L}^{p}\left(\mathbb{R}\right)\right\}$ and ${\parallel f\parallel }_{{W}_{p}^{n}}:={\parallel f\parallel }_{p}+{\parallel {f}^{\left(n\right)}\parallel }_{p}$. Then ${L}^{p}\left(\mathbb{R}\right)$ can be considered as ${W}_{p}^{0}\left(\mathbb{R}\right)$.

For $1\le p,q\le \mathrm{\infty }$, $s=n+\alpha$ with $n\in \mathbb{N}$ and $\alpha \in \left(0,1\right]$, the Besov spaces are defined by

${B}_{p,q}^{s}\left(\mathbb{R}\right):=\left\{f\in {W}_{p}^{n}\left(\mathbb{R}\right),{\left\{{2}^{j\alpha }{\omega }_{p}^{2}\left({f}^{\left(n\right)},{2}^{-j}\right)\right\}}_{j\in \mathbb{Z}}\in {l}_{q}\left(\mathbb{Z}\right)\right\},$

with the associated norm ${\parallel f\parallel }_{{B}_{p,q}^{s}}:={\parallel f\parallel }_{{W}_{p}^{n}}+{\parallel {\left\{{2}^{j\alpha }{\omega }_{p}^{2}\left({f}^{\left(n\right)},{2}^{-j}\right)\right\}}_{j\in \mathbb{Z}}\parallel }_{{l}_{q}}$, where

${\omega }_{p}^{2}\left(f,t\right):=\underset{|h|\le t}{sup}{\parallel f\left(\cdot +2h\right)-2f\left(\cdot +h\right)+f\left(\cdot \right)\parallel }_{p}$

stands for the smoothness modulus of f. It should be pointed out that ${B}_{2,2}^{s}\left(\mathbb{R}\right)={W}_{2}^{s}\left(\mathbb{R}\right)$.

According to Theorem 9.6 in , the following result holds.

Lemma 1.2 Let $\phi ={D}_{2N}$ be a Daubechies scaling function with large N and ψ be the corresponding wavelet. If $f\in {L}^{p}\left(\mathbb{R}\right)$, $1\le p,q\le \mathrm{\infty }$, $s>0$, ${\alpha }_{0,k}=〈f,{\phi }_{0,k}〉$, and ${\beta }_{j,k}=〈f,{\psi }_{j,k}〉$, then the following assertions are equivalent:

1. (i)

$f\in {B}_{p,q}^{s}\left(\mathbb{R}\right)$;

2. (ii)

${\parallel {\alpha }_{0,\cdot }\parallel }_{{l}_{p}}+{\parallel {\left\{{2}^{j\left(s+\frac{1}{2}-\frac{1}{p}\right)}{\parallel {\beta }_{j,\cdot }\parallel }_{{l}_{p}}\right\}}_{j\ge 0}\parallel }_{{l}_{q}}<+\mathrm{\infty }$;

3. (iii)

${\left\{{2}^{js}{\parallel {P}_{j}f-f\parallel }_{p}\right\}}_{j\ge 0}\in {l}_{q}$, where ${P}_{j}$ is the projection operator to ${V}_{j}$.

In each case,

${\parallel f\parallel }_{{B}_{p,q}^{s}}\sim {\parallel {\alpha }_{0,\cdot }\parallel }_{{l}_{p}}+{\parallel {\left\{{2}^{j\left(s+\frac{1}{2}-\frac{1}{p}\right)}{\parallel {\beta }_{j,\cdot }\parallel }_{{l}_{p}}\right\}}_{j\ge 0}\parallel }_{{l}_{q}}\sim {\parallel {P}_{0}f\parallel }_{p}+{\parallel {\left\{{2}^{js}{\parallel {P}_{j}f-f\parallel }_{p}\right\}}_{j\ge 0}\parallel }_{{l}_{q}}.$

Here and throughout, $A\lesssim B$ denotes $A\le CB$ for some constant $C>0$; $A\gtrsim B$ means $B\lesssim A$; we use $A\sim B$ standing for both $A\lesssim B$ and $B\lesssim A$.

Note that ${l}_{{p}_{1}}$ is continuously embedded into ${l}_{{p}_{2}}$ for ${p}_{1}\le {p}_{2}$. Then the above lemma implies that ${B}_{r,q}^{s}\left(\mathbb{R}\right)$ is continuously embedded into ${B}_{p,q}^{s-\frac{1}{r}+\frac{1}{p}}\left(\mathbb{R}\right)$ for $r\le p$ and $s-\frac{1}{r}+\frac{1}{p}>0$.

## 2 Linear wavelet estimation

We shall provide a linear wavelet estimation for a compactly supported density function ${f}_{X}$ and its derivatives ${f}_{X}^{\left(d\right)}$ under Fourier-oscillating noises in this section, motivated by the work of Delaigle and Meister. It turns out that our result generalizes their theorem.

As in , we define

$p\left(x\right)=\sum _{m=0}^{v}\left(\begin{array}{c}v\\ m\end{array}\right){\left(-1\right)}^{v-m}{f}_{X}\left(x-\frac{2\pi m}{\lambda }\right).$
(4)

Then $p\in L\left(\mathbb{R}\right)$ and ${p}^{ft}\left(t\right)={\left({e}^{i\frac{2\pi t}{\lambda }}-1\right)}^{v}{f}_{X}^{ft}\left(t\right)$. Delaigle and Meister found that

${f}_{X}\left(x\right)=\sum _{m=0}^{J}{\eta }_{m}p\left(x-\frac{2\pi m}{\lambda }\right),$
(5)

where J and ${\eta }_{m}$ depend only on v and the support length of ${f}_{X}$.
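The identities (4)-(5) can be made completely explicit in the simplest oscillating case. Take $v=1$ and $\lambda =2\pi$, so the shift $2\pi m/\lambda =m$, and suppose $supp{f}_{X}\subset \left[0,1\right)$. Then $p\left(x\right)={f}_{X}\left(x-1\right)-{f}_{X}\left(x\right)=-{f}_{X}\left(x\right)$ on $\left[0,1\right)$, so (5) holds with $J=0$ and ${\eta }_{0}=-1$; these particular values are our own elementary computation for this special case, while the general construction of J and ${\eta }_{m}$ is in Delaigle and Meister. A numerical check with a triangle density:

```python
from math import comb

def f_X(x):
    """Triangle density on [0, 1)."""
    return 4 * min(x, 1 - x) if 0 <= x < 1 else 0.0

def p(x, v=1):
    """Definition (4) with lambda = 2*pi, so f_X is shifted by integers."""
    return sum(comb(v, m) * (-1) ** (v - m) * f_X(x - m) for m in range(v + 1))

# Identity (5) with J = 0, eta_0 = -1: f_X(x) = -p(x) on [0, 1).
recovered = [-p(x / 100) for x in range(100)]
```
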

Let $\phi ={D}_{2N}$ be the Daubechies scaling function with N large enough. Since both ${f}_{X}$ and φ have compact supports, the set ${K}_{j}:=\left\{k\in \mathbb{Z}:〈{f}_{X},{\phi }_{j,k}〉\ne 0\right\}$ is finite and the cardinality $|{K}_{j}|\sim {2}^{j}$. Then with ${\alpha }_{j,k}=〈{f}_{X}^{\left(d\right)},{\phi }_{j,k}〉$,

${P}_{j}{f}_{X}^{\left(d\right)}=\sum _{k\in {K}_{j}}{\alpha }_{j,k}{\phi }_{j,k}.$

It is easy to see ${\alpha }_{j,k}={\left(-1\right)}^{d}〈{f}_{X},{\left({\phi }_{j,k}\right)}^{\left(d\right)}〉$. This with (5) and the Plancherel formula leads to

${\alpha }_{j,k}=\frac{{\left(-1\right)}^{d}}{2\pi }〈\sum _{m=0}^{J}{\eta }_{m}{e}^{i\frac{2\pi mt}{\lambda }}{p}^{ft}\left(t\right),{\left[{\left({\phi }_{j,k}\right)}^{\left(d\right)}\right]}^{ft}\left(t\right)〉.$
(6)

Note that ${p}^{ft}\left(t\right)={\left({e}^{\frac{2\pi it}{\lambda }}-1\right)}^{v}\frac{{f}_{W}^{ft}\left(t\right)}{{f}_{\delta }^{ft}\left(t\right)}$ and ${\left[{\left({\phi }_{j,k}\right)}^{\left(d\right)}\right]}^{ft}\left(t\right)={2}^{-\frac{j}{2}+dj}{e}^{ik{2}^{-j}t}{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)$. Then the identity (6) reduces to

${\alpha }_{j,k}=\frac{{\left(-1\right)}^{d}}{2\pi }\int {2}^{-\frac{j}{2}+dj}\xi \left(t\right)\frac{\overline{{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)}}{{f}_{\delta }^{ft}\left(t\right)}{e}^{-ik{2}^{-j}t}{f}_{W}^{ft}\left(t\right)\phantom{\rule{0.2em}{0ex}}dt$
(7)

where $\xi \left(t\right)={\sum }_{m=0}^{J}{\eta }_{m}{e}^{i\frac{2\pi mt}{\lambda }}{\left({e}^{i\frac{2\pi t}{\lambda }}-1\right)}^{v}$. Since the empirical estimator for ${f}_{W}^{ft}$ is $\frac{1}{n}{\sum }_{l=1}^{n}{e}^{i{W}_{l}t}$, it is natural to define a linear wavelet estimator

${\stackrel{ˆ}{f}}_{n,d}^{lin}\left(x\right)=\sum _{k\in {K}_{j}}{\stackrel{ˆ}{\alpha }}_{j,k}{\phi }_{j,k}\left(x\right),$
(8)

with

${\stackrel{ˆ}{\alpha }}_{j,k}=\frac{1}{n}\sum _{l=1}^{n}\frac{{\left(-1\right)}^{d}}{2\pi }\int {2}^{-\frac{j}{2}+dj}\xi \left(t\right)\frac{\overline{{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)}}{{f}_{\delta }^{ft}\left(t\right)}{e}^{-ik{2}^{-j}t}{e}^{i{W}_{l}t}\phantom{\rule{0.2em}{0ex}}dt.$
(9)

When $v=d=0$, $p\left(x\right)={f}_{X}\left(x\right)$ and ${\eta }_{m}={\delta }_{m,0}$. Then our estimator ${\stackrel{ˆ}{f}}_{n,d}^{lin}$ reduces to the classical linear estimator for the case ${f}_{\delta }^{ft}$ having no zeros (see e.g., ).

Let ψ be the Daubechies wavelet function corresponding to the scaling function ${D}_{2N}$ and ${\beta }_{j,k}=〈{f}^{\left(d\right)},{\psi }_{j,k}〉$. Similar to (9), we define

${\stackrel{ˆ}{\beta }}_{j,k}=\frac{1}{n}\sum _{l=1}^{n}\frac{{\left(-1\right)}^{d}}{2\pi }\int {2}^{-\frac{j}{2}+dj}\xi \left(t\right)\frac{\overline{{\left[{\psi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)}}{{f}_{\delta }^{ft}\left(t\right)}{e}^{-ik{2}^{-j}t}{e}^{i{W}_{l}t}\phantom{\rule{0.2em}{0ex}}dt.$
(10)

Then it can easily be seen that $E{\stackrel{ˆ}{\alpha }}_{j,k}={\alpha }_{j,k}$, $E{\stackrel{ˆ}{\beta }}_{j,k}={\beta }_{j,k}$ and $E{\stackrel{ˆ}{f}}_{n,d}^{lin}={P}_{j}{f}_{X}^{\left(d\right)}$.

For $a<b$ and $L>0$, we consider the subset of ${B}_{p,q}^{s}\left(\mathbb{R}\right)$,

${B}_{p,q}^{s}\left(a,b,L\right):=\left\{f\in {B}_{p,q}^{s}\left(\mathbb{R}\right):{\parallel f\parallel }_{{B}_{p,q}^{s}}\le L,f\left(x\right)\ge 0,{\int }_{\mathbb{R}}f\left(x\right)\phantom{\rule{0.2em}{0ex}}dx=1,suppf\subset \left[a,b\right]\right\}$

in this paper.

Lemma 2.1 Let $\phi ={D}_{2N}$ (N large enough), ψ be the corresponding wavelet and ${f}_{\delta }^{ft}$ satisfy (2). If ${f}_{X}\in {B}_{r,q}^{s+d}\left(a,b,L\right)$, $r,q\in \left[1,+\mathrm{\infty }\right]$, $s>\frac{1}{r}$ and ${2}^{j}\le n$, then, for $p\in \left[1,+\mathrm{\infty }\right)$,

$E{|{\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}|}^{p}\lesssim {n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)},\phantom{\rule{2em}{0ex}}E{|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|}^{p}\lesssim {n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}.$

Proof One shows only the first inequality; the second one is similar. Define

${Z}_{l,k}:=\frac{{\left(-1\right)}^{d}}{2\pi }\int {2}^{-\frac{j}{2}+dj}\xi \left(t\right)\frac{\overline{{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)}}{{f}_{\delta }^{ft}\left(t\right)}{e}^{-ik{2}^{-j}t}{e}^{i{W}_{l}t}\phantom{\rule{0.2em}{0ex}}dt.$

Then ${\stackrel{ˆ}{\alpha }}_{j,k}=\frac{1}{n}{\sum }_{l=1}^{n}{Z}_{l,k}$ and

${\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}=\frac{1}{n}\sum _{l=1}^{n}\left({Z}_{l,k}-E{Z}_{l,k}\right):=\frac{1}{n}\sum _{l=1}^{n}{Y}_{l,k}.$
(11)

Clearly, $E{Y}_{l,k}=0$. One estimates $|{Y}_{l,k}|$ and $E{|{Y}_{l,k}|}^{2}$ in order to use the Rosenthal inequality. By the assumption (2),

$|{Z}_{l,k}|\lesssim \int {2}^{-\frac{j}{2}+dj}{|{e}^{i\frac{2\pi t}{\lambda }}-1|}^{v}\frac{|{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)|}{|{f}_{\delta }^{ft}\left(t\right)|}\phantom{\rule{0.2em}{0ex}}dt\lesssim \int {2}^{-\frac{j}{2}+dj}{\left(1+|t|\right)}^{\alpha }|{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)|\phantom{\rule{0.2em}{0ex}}dt=\int {2}^{\frac{j}{2}+dj}{\left(1+|{2}^{j}t|\right)}^{\alpha }|{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left(t\right)|\phantom{\rule{0.2em}{0ex}}dt\lesssim {2}^{j\left(\alpha +d+\frac{1}{2}\right)}\int {\left(1+|t|\right)}^{\alpha }|{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left(t\right)|\phantom{\rule{0.2em}{0ex}}dt.$

Because $\phi ={D}_{2N}$, the last integral is finite for large N. Hence,

$|{Y}_{l,k}|\le 2|{Z}_{l,k}|\lesssim {2}^{j\left(\alpha +d+\frac{1}{2}\right)}.$
(12)

Since ${f}_{X}\in {B}_{r,q}^{s+d}\left(\mathbb{R}\right)\subset {B}_{\mathrm{\infty },q}^{s+d-\frac{1}{r}}\left(\mathbb{R}\right)$ ($s>\frac{1}{r}$), ${\parallel {f}_{X}\parallel }_{\mathrm{\infty }}<+\mathrm{\infty }$ and ${\parallel {f}_{W}\parallel }_{\mathrm{\infty }}={\parallel {f}_{X}\ast {f}_{\delta }\parallel }_{\mathrm{\infty }}\le {\parallel {f}_{X}\parallel }_{\mathrm{\infty }}{\parallel {f}_{\delta }\parallel }_{1}<+\mathrm{\infty }$. This with the Parseval identity shows

$\begin{array}{rcl}E{Z}_{l,k}^{2}& =& \int {|\frac{{\left(-1\right)}^{d}}{2\pi }\int {2}^{-\frac{j}{2}+dj}\xi \left(t\right)\frac{\overline{{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)}}{{f}_{\delta }^{ft}\left(t\right)}{e}^{-ik{2}^{-j}t}{e}^{ity}\phantom{\rule{0.2em}{0ex}}dt|}^{2}{f}_{W}\left(y\right)\phantom{\rule{0.2em}{0ex}}dy\\ \lesssim & {\parallel {f}_{W}\parallel }_{\mathrm{\infty }}\int {|{2}^{-\frac{j}{2}+dj}\xi \left(t\right)\frac{\overline{{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)}}{{f}_{\delta }^{ft}\left(t\right)}{e}^{-ik{2}^{-j}t}|}^{2}\phantom{\rule{0.2em}{0ex}}dt.\end{array}$

Furthermore, one obtains $E{Z}_{l,k}^{2}\lesssim \int {|{2}^{-\frac{j}{2}+dj}{\left(1+|t|\right)}^{\alpha }{\left[{\phi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)|}^{2}\phantom{\rule{0.2em}{0ex}}dt\lesssim {2}^{2j\left(\alpha +d\right)}$ thanks to (2). Hence,

$E{Y}_{l,k}^{2}=E{|{Z}_{l,k}-E{Z}_{l,k}|}^{2}\le E{Z}_{l,k}^{2}\lesssim {2}^{2j\left(\alpha +d\right)}.$
(13)

According to (11) and the Rosenthal inequality,

$E{|{\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}|}^{p}\lesssim \left\{\begin{array}{ll}{n}^{-p}{\left({\sum }_{l=1}^{n}E{|{Y}_{l,k}|}^{2}\right)}^{\frac{p}{2}},& p\in \left[1,2\right),\\ {n}^{-p}{\sum }_{l=1}^{n}E{|{Y}_{l,k}|}^{p}+{n}^{-p}{\left({\sum }_{l=1}^{n}E{|{Y}_{l,k}|}^{2}\right)}^{\frac{p}{2}},& p\in \left[2,+\mathrm{\infty }\right).\end{array}$

Combining this with (13), one obtains $E{|{\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}|}^{p}\lesssim {n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}$ for $1\le p<2$, which is the desired conclusion. When $p\ge 2$,

$E{|{Y}_{l,k}|}^{p}\lesssim {\parallel {Y}_{l,k}\parallel }_{\mathrm{\infty }}^{p-2}E{|{Y}_{l,k}|}^{2}\lesssim {2}^{j\left(\alpha +d+\frac{1}{2}\right)\left(p-2\right)}{2}^{2j\left(\alpha +d\right)}={2}^{j\left(\frac{p}{2}-1+\alpha p+dp\right)}$

due to (12) and (13). Moreover, $E{|{\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}|}^{p}\lesssim {n}^{1-p}{2}^{j\left(\frac{p}{2}-1+\alpha p+dp\right)}+{n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}$. Since ${2}^{j}\le n$,

${n}^{1-p}{2}^{j\left(\frac{p}{2}-1+\alpha p+dp\right)}={n}^{-\frac{p}{2}}{\left(\frac{{2}^{j}}{n}\right)}^{\frac{p}{2}-1}{2}^{jp\left(\alpha +d\right)}\le {n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}.$

Finally, $E{|{\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}|}^{p}\lesssim {n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}$. This completes the proof of the first part of Lemma 2.1. □

Now, we are in a position to state our first theorem.

Theorem 2.1 Let ${\stackrel{ˆ}{f}}_{n,d}^{lin}$ be defined by (8)-(9), $r,q\in \left[1,+\mathrm{\infty }\right]$, $p\in \left[1,+\mathrm{\infty }\right)$ and $s>\frac{1}{r}$. Then with ${s}^{\prime }:=s-{\left(\frac{1}{r}-\frac{1}{p}\right)}_{+}$ and ${x}_{+}=max\left\{0,x\right\}$,

$\underset{{f}_{X}\in {B}_{r,q}^{s+d}\left(a,b,L\right)}{sup}E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {n}^{-\frac{{s}^{\prime }p}{2{s}^{\prime }+2\alpha +2d+1}}.$

Proof It is easy to see that ${\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{p}\lesssim {\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{r}$ for $r\ge p$, because ${L}_{r}\left(\left[a,b\right]\right)$ is continuously embedded into ${L}_{p}\left(\left[a,b\right]\right)$. Moreover, $E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{r}^{p}\le {\left(E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{r}^{r}\right)}^{\frac{p}{r}}$ thanks to Jensen’s inequality.

When $r\le p$, ${s}^{\prime }-\frac{1}{p}=s-\frac{1}{r}$ and ${B}_{r,q}^{s+d}\left(a,b,L\right)\subset {B}_{p,q}^{{s}^{\prime }+d}\left(a,b,L\right)$. Then

$\underset{{f}_{X}\in {B}_{r,q}^{s+d}\left(a,b,L\right)}{sup}E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\le \underset{{f}_{X}\in {B}_{p,q}^{{s}^{\prime }+d}\left(a,b,L\right)}{sup}E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}.$

Therefore, it suffices to prove the theorem for $r=p$, namely

$\underset{{f}_{X}\in {B}_{p,q}^{s+d}\left(a,b,L\right)}{sup}E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {n}^{-\frac{sp}{2s+2\alpha +2d+1}}.$
(14)

If ${f}_{X}\in {B}_{p,q}^{s+d}\left(\mathbb{R}\right)$, then ${f}_{X}^{\left(d\right)}\in {B}_{p,q}^{s}\left(\mathbb{R}\right)$ and

${\parallel {P}_{j}{f}_{X}^{\left(d\right)}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {2}^{-jsp}$
(15)

due to Lemma 1.2. On the other hand, ${\stackrel{ˆ}{f}}_{n,d}^{lin}-{P}_{j}{f}_{X}^{\left(d\right)}={\sum }_{k\in {K}_{j}}\left({\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}\right){\phi }_{j,k}$ and

$E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{P}_{j}{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {2}^{j\left(\frac{p}{2}-1\right)}\sum _{k\in {K}_{j}}E{|{\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}|}^{p}\lesssim {2}^{j\frac{p}{2}}\underset{k\in {K}_{j}}{sup}E{|{\stackrel{ˆ}{\alpha }}_{j,k}-{\alpha }_{j,k}|}^{p}$

because of Lemma 1.1 and $|{K}_{j}|\sim {2}^{j}$. This with Lemma 2.1 leads to $E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{P}_{j}{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {\left(\frac{{2}^{j}}{n}\right)}^{\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}$. Combining this with (15), one obtains

$E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{lin}-{P}_{j}{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}+{\parallel {P}_{j}{f}_{X}^{\left(d\right)}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {\left(\frac{{2}^{j}}{n}\right)}^{\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}+{2}^{-jsp}.$

Take ${2}^{j}\sim {n}^{\frac{1}{2s+2\alpha +2d+1}}$. Then the inequality (14) follows, and the proof of Theorem 2.1 is finished. □
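The level choice ${2}^{j}\sim {n}^{\frac{1}{2s+2\alpha +2d+1}}$ balances the stochastic term ${\left({2}^{j}/n\right)}^{\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}$ against the bias term ${2}^{-jsp}$. A sketch checking that a grid search over j recovers the same level (up to one unit); the particular values of n, s, α, d, p are illustrative:

```python
import math

def best_level(n, s, alpha, d, p):
    """Minimize the upper bound (2^j/n)^{p/2} * 2^{j p (alpha+d)} + 2^{-j s p} over j."""
    def bound(j):
        return (2 ** j / n) ** (p / 2) * 2 ** (j * p * (alpha + d)) + 2 ** (-j * s * p)
    return min(range(1, 40), key=bound)

n, s, alpha, d, p = 10 ** 6, 2.0, 1.0, 0, 2.0
j_grid = best_level(n, s, alpha, d, p)
j_formula = math.log2(n) / (2 * s + 2 * alpha + 2 * d + 1)   # 2^j = n^{1/(2s+2a+2d+1)}
```

Equating the two terms analytically gives ${2}^{j\left(s+\alpha +d+\frac{1}{2}\right)p}={n}^{\frac{p}{2}}$, i.e., exactly the exponent $\frac{1}{2s+2\alpha +2d+1}$.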

Remark 2.1 If $p=q=r=2$ and $d=0$, then ${s}^{\prime }=s$, ${B}_{r,q}^{s+d}\left(a,b,L\right)={W}_{2}^{s}\left(a,b,L\right)$, and Theorem 2.1 reduces to Theorem 4.1 of Delaigle and Meister.

Remark 2.2 From the choice ${2}^{j}\sim {n}^{\frac{1}{2s+2\alpha +2d+1}}$ in the proof of Theorem 2.1, we find that our estimator is not adaptive, because it depends on the parameter s of ${B}_{r,q}^{s}\left(\mathbb{R}\right)$. In order to avoid that shortcoming, we study a nonlinear estimation in the next part.

## 3 Nonlinear estimation

This section is devoted to an adaptive nonlinear estimation, which also improves the convergence rate of the linear one in some cases. The idea of the proof comes from . Choose ${r}_{0}>s$,

${2}^{{j}_{0}}\sim {n}^{\frac{1}{2{r}_{0}+2\alpha +2d+1}}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{2}^{{j}_{1}}\sim {n}^{\frac{1}{2\alpha +2d+1}}.$
(16)

Let ${\stackrel{ˆ}{\alpha }}_{j,k}$ and ${\stackrel{ˆ}{\beta }}_{j,k}$ be defined by (9) and (10), respectively, and

${\stackrel{˜}{\beta }}_{j,k}=\left\{\begin{array}{ll}{\stackrel{ˆ}{\beta }}_{j,k},& |{\stackrel{ˆ}{\beta }}_{j,k}|>T{2}^{j\left(\alpha +d\right)}\sqrt{j/n},\\ 0,& \text{otherwise},\end{array}$

where the constant T will be determined in the proof of Theorem 3.1. Then we define a nonlinear wavelet estimator

${\stackrel{ˆ}{f}}_{n,d}^{non}\left(x\right):=\sum _{k\in {K}_{{j}_{0}}}{\stackrel{ˆ}{\alpha }}_{{j}_{0},k}{\phi }_{{j}_{0},k}\left(x\right)+\sum _{j={j}_{0}}^{{j}_{1}}\sum _{k\in {I}_{j}}{\stackrel{˜}{\beta }}_{j,k}{\psi }_{j,k}\left(x\right),$

where ${K}_{{j}_{0}}:=\left\{k\in \mathbb{Z}:〈{f}_{X},{\phi }_{{j}_{0},k}〉\ne 0\right\}$, and ${I}_{j}:=\left\{k\in \mathbb{Z}:〈{f}_{X},{\psi }_{j,k}〉\ne 0\right\}$. Clearly, the cardinality $|{I}_{j}|\sim {2}^{j}$ since both ${f}_{X}$ and ψ have compact supports.
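The level-dependent hard-thresholding rule defining ${\stackrel{˜}{\beta }}_{j,k}$ can be sketched as follows; the numeric values of T, α, d, and the coefficients are illustrative only, and T must in any case be taken large enough as in Theorem 3.1:

```python
import math

def hard_threshold(beta_hat, j, n, T, alpha, d):
    """Keep beta_hat_{j,k} only when |beta_hat_{j,k}| > T * 2^{j(alpha+d)} * sqrt(j/n)."""
    thresh = T * 2 ** (j * (alpha + d)) * math.sqrt(j / n)
    return [b if abs(b) > thresh else 0.0 for b in beta_hat]

# threshold = 1 * 2^{2*0.5} * sqrt(2/10^4) ~ 0.028: small coefficients are zeroed.
kept = hard_threshold([0.5, 0.01, -0.8, 0.02], j=2, n=10 ** 4, T=1.0, alpha=0.5, d=0)
```

Note that the threshold grows with the resolution level j through the factor ${2}^{j\left(\alpha +d\right)}$, reflecting the larger variance of high-frequency coefficients under deconvolution.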

Lemma 3.1 If $j{2}^{j}\le n$, then there exist ${c}_{0}>0$ and ${T}_{0}>0$ such that, for each $T\ge {T}_{0}$,

$P\left\{|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|>\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{j/n}\right\}\lesssim {2}^{-{c}_{0}Tj}.$

Proof By the definitions of ${\beta }_{j,k}$ and ${\stackrel{ˆ}{\beta }}_{j,k}$, ${\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}=\frac{1}{n}{\sum }_{l=1}^{n}\left({Z}_{l,k}-E{Z}_{l,k}\right):=\frac{1}{n}{\sum }_{l=1}^{n}{Y}_{l,k}$, where

${Z}_{l,k}:=\frac{{\left(-1\right)}^{d}}{2\pi }\int {2}^{-\frac{j}{2}+dj}\sum _{m=0}^{J}{\eta }_{m}{e}^{it\frac{2\pi m}{\lambda }}{\left({e}^{\frac{2\pi it}{\lambda }}-1\right)}^{v}\frac{\overline{{\left[{\psi }^{\left(d\right)}\right]}^{ft}\left({2}^{-j}t\right)}}{{f}_{\delta }^{ft}\left(t\right)}{e}^{-i{2}^{-j}kt}{e}^{i{W}_{l}t}\phantom{\rule{0.2em}{0ex}}dt.$

Then $E{Y}_{l,k}=0$ and with $\lambda =\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}$,

$P\left\{|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|>\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}\right\}\le 2exp\left(-\frac{n{\lambda }^{2}}{2\left(E{Y}_{l,k}^{2}+{\parallel {Y}_{l,k}\parallel }_{\mathrm{\infty }}\lambda /3\right)}\right)$
(17)

thanks to the classical Bernstein inequality in . On the other hand, $E{Y}_{l,k}^{2}+{\parallel {Y}_{l,k}\parallel }_{\mathrm{\infty }}\lambda /3\lesssim {2}^{2j\left(\alpha +d\right)}+{2}^{\left(\alpha +d+\frac{1}{2}\right)j}\frac{T}{6}{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}\le CT{2}^{2j\left(\alpha +d\right)}$ because of (12), (13), and $j{2}^{j}\le n$. Hence, $\frac{n{\lambda }^{2}}{2\left(E{Y}_{l,k}^{2}+{\parallel {Y}_{l,k}\parallel }_{\mathrm{\infty }}\lambda /3\right)}\ge \frac{n\frac{{T}^{2}}{4}\frac{j}{n}{2}^{2j\left(\alpha +d\right)}}{2CT{2}^{2j\left(\alpha +d\right)}}=\frac{T}{8C}j$, and (17) reduces to

$P\left\{|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|>\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{j/n}\right\}\lesssim 2{e}^{-\frac{T}{8C}j}={2}^{-{c}_{0}Tj}$

with ${c}_{0}=\frac{1}{8C}{log}_{2}e$. This completes the proof of Lemma 3.1. □

Theorem 3.1 Under the assumptions of Theorem  2.1, there exist $\theta >0$ and ${T}_{0}>0$ such that, for $T\ge {T}_{0}$,

$\begin{array}{r}\underset{{f}_{X}\in {B}_{r,q}^{s+d}\left(a,b,L\right)}{sup}E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\\ \phantom{\rule{1em}{0ex}}\lesssim \left\{\begin{array}{ll}{\left(lnn\right)}^{\theta }{n}^{-\frac{sp}{2s+2\alpha +2d+1}},& r\in \left(\frac{\left(2\alpha +2d+1\right)p}{2s+2\alpha +2d+1},p\right];\\ {\left(lnn\right)}^{\theta }{n}^{-\frac{{s}^{\prime }p}{2\left(s-1/r\right)+2\alpha +2d+1}},& r\in \left[1,\frac{\left(2\alpha +2d+1\right)p}{2s+2\alpha +2d+1}\right].\end{array}\end{array}$
(18)

Proof Similar to , one defines

$\mu :=min\left\{\frac{s}{2s+2\alpha +2d+1},\frac{{s}^{\prime }}{2\left(s-1/r\right)+2\alpha +2d+1}\right\},$

and

$\omega :=-sr+\left(\alpha +d+\frac{1}{2}\right)\left(p-r\right).$
(19)

It is easy to check that $\omega <0$ if and only if $r>\frac{\left(2\alpha +2d+1\right)p}{2s+2\alpha +2d+1}$, in which case $\mu =\frac{s}{2s+2\alpha +2d+1}$; similarly, $\omega \ge 0$ if and only if $r\le \frac{\left(2\alpha +2d+1\right)p}{2s+2\alpha +2d+1}$, in which case $\mu =\frac{{s}^{\prime }}{2\left(s-1/r\right)+2\alpha +2d+1}$. Then the conclusion of Theorem 3.1 can be rewritten as
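The case split can be sanity-checked numerically: rearranging $\omega =-sr+\left(\alpha +d+\frac{1}{2}\right)\left(p-r\right)<0$ gives $r\left(s+\alpha +d+\frac{1}{2}\right)>\left(\alpha +d+\frac{1}{2}\right)p$, i.e., exactly the threshold in the theorem. A sketch over random admissible parameters (ranges are our own choices):

```python
import random

rng = random.Random(2)
ok = True
for _ in range(1000):
    s = rng.uniform(0.5, 5); alpha = rng.uniform(0.1, 3); d = rng.randint(0, 2)
    p = rng.uniform(1, 10); r = rng.uniform(1, p)
    omega = -s * r + (alpha + d + 0.5) * (p - r)                 # definition (19)
    threshold = (2 * alpha + 2 * d + 1) * p / (2 * s + 2 * alpha + 2 * d + 1)
    ok = ok and ((omega < 0) == (r > threshold))                 # claimed equivalence
```
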

$\underset{{f}_{X}\in {B}_{r,q}^{s+d}\left(a,b,L\right)}{sup}E{\parallel {\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}.$
(20)

Choose ${j}_{0}\left(s,r,q\right)$ and ${j}_{1}\left(s,r,q\right)$ such that

${2}^{{j}_{0}\left(s,r,q\right)}\backsimeq {n}^{\frac{1-2\mu }{2\alpha +2d+1}}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{2}^{{j}_{1}\left(s,r,q\right)}\backsimeq {n}^{\frac{\mu }{{s}^{\prime }}}.$

Then it can easily be shown by (16) that ${2}^{{j}_{0}}\lesssim {2}^{{j}_{0}\left(s,r,q\right)}\lesssim {2}^{{j}_{1}\left(s,r,q\right)}\lesssim {2}^{{j}_{1}}$. Clearly,

${\parallel {\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {\parallel {P}_{{j}_{0}}\left({\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\right)\parallel }_{p}^{p}+{\parallel {D}_{{j}_{0},{j}_{1}}\left({\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\right)\parallel }_{p}^{p}+{\parallel {P}_{{j}_{1}}{f}_{X}^{\left(d\right)}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p},$
(21)

where

${D}_{{j}_{0},{j}_{1}}f=\sum _{j={j}_{0}}^{{j}_{1}}\sum _{k\in \mathbb{Z}}{\beta }_{j,k}{\psi }_{j,k}.$

By the assumption $r\le p$, ${s}^{\prime }:=s-\left(\frac{1}{r}-\frac{1}{p}\right)=s-\frac{1}{r}+\frac{1}{p}$ and ${B}_{r,q}^{s+d}\left(a,b,L\right)$ is continuously embedded into ${B}_{p,q}^{{s}^{\prime }+d}\left(a,b,L\right)$. Since ${f}_{X}\in {B}_{p,q}^{{s}^{\prime }+d}\left(\mathbb{R}\right)$, ${f}_{X}^{\left(d\right)}\in {B}_{p,q}^{{s}^{\prime }}\left(\mathbb{R}\right)$ and ${\parallel {P}_{{j}_{1}}{f}_{X}^{\left(d\right)}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {2}^{-{j}_{1}{s}^{\prime }p}$ thanks to Lemma 1.2. This with ${2}^{{j}_{1}\left(s,r,q\right)}\lesssim {2}^{{j}_{1}}$ and the definition of μ leads to

${\parallel {P}_{{j}_{1}}{f}_{X}^{\left(d\right)}-{f}_{X}^{\left(d\right)}\parallel }_{p}^{p}\lesssim {2}^{-{j}_{1}\left(s,r,q\right){s}^{\prime }p}\lesssim {n}^{-\mu p}.$
(22)

Note that ${P}_{{j}_{0}}\left({\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\right)={\sum }_{k\in {K}_{{j}_{0}}}\left({\stackrel{ˆ}{\alpha }}_{{j}_{0},k}-{\alpha }_{{j}_{0},k}\right){\phi }_{{j}_{0},k}$. Then $E{\parallel {P}_{{j}_{0}}\left({\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\right)\parallel }_{p}^{p}\lesssim {2}^{{j}_{0}\left(\frac{p}{2}-1\right)}{\sum }_{k\in {K}_{{j}_{0}}}E{|{\stackrel{ˆ}{\alpha }}_{{j}_{0},k}-{\alpha }_{{j}_{0},k}|}^{p}\lesssim {2}^{{j}_{0}\frac{p}{2}}{n}^{-\frac{p}{2}}{2}^{{j}_{0}p\left(\alpha +d\right)}$ due to Lemma 1.1, $|{K}_{{j}_{0}}|\sim {2}^{{j}_{0}}$ and Lemma 2.1. By ${2}^{{j}_{0}}\lesssim {2}^{{j}_{0}\left(s,r,q\right)}$ and the choice ${2}^{{j}_{0}\left(s,r,q\right)}\backsimeq {n}^{\frac{1-2\mu }{2\alpha +2d+1}}$,

$E{\parallel {P}_{{j}_{0}}\left({\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\right)\parallel }_{p}^{p}\lesssim {\left(\frac{{2}^{{j}_{0}\left(s,r,q\right)}}{n}\right)}^{p/2}{2}^{{j}_{0}\left(s,r,q\right)p\left(\alpha +d\right)}\lesssim {n}^{-\mu p}.$
(23)
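The last step of (23) is pure exponent arithmetic; written out with the choice ${2}^{{j}_{0}\left(s,r,q\right)}\simeq {n}^{\frac{1-2\mu }{2\alpha +2d+1}}$, it reads:

```latex
\left(\frac{2^{j_0(s,r,q)}}{n}\right)^{p/2} 2^{j_0(s,r,q)\,p(\alpha+d)}
  \simeq n^{\frac{1-2\mu}{2\alpha+2d+1}\left[\frac{p}{2}+p(\alpha+d)\right]-\frac{p}{2}}
  = n^{\frac{(1-2\mu)p}{2}-\frac{p}{2}}
  = n^{-\mu p},
```

since $\frac{p}{2}+p(\alpha+d)=\frac{p}{2}(2\alpha+2d+1)$ cancels the denominator $2\alpha+2d+1$.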

According to (21)-(23), it is sufficient to prove $E{\parallel {D}_{{j}_{0},{j}_{1}}\left({\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\right)\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}$: Define

$\begin{array}{c}{\stackrel{ˆ}{B}}_{j}:=\left\{k:|{\stackrel{ˆ}{\beta }}_{j,k}|>T{2}^{j\left(\alpha +d\right)}\sqrt{j/n}\right\},\phantom{\rule{2em}{0ex}}{\stackrel{ˆ}{S}}_{j}={\stackrel{ˆ}{B}}_{j}^{c};\hfill \\ {B}_{j}:=\left\{k:|{\beta }_{j,k}|>\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{j/n}\right\},\phantom{\rule{2em}{0ex}}{S}_{j}={B}_{j}^{c};\hfill \\ {B}_{j}^{\prime }:=\left\{k:|{\beta }_{j,k}|>2T{2}^{j\left(\alpha +d\right)}\sqrt{j/n}\right\},\phantom{\rule{2em}{0ex}}{S}_{j}^{\prime }={B}_{j}^{\prime c}.\hfill \end{array}$

Then ${D}_{{j}_{0},{j}_{1}}\left({\stackrel{ˆ}{f}}_{n,d}^{non}-{f}_{X}^{\left(d\right)}\right)={\sum }_{j={j}_{0}}^{{j}_{1}}{\sum }_{k\in {I}_{j}}\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)\left[I\left\{k\in {\stackrel{ˆ}{B}}_{j}\cap {S}_{j}\right\}+I\left\{k\in {\stackrel{ˆ}{B}}_{j}\cap {B}_{j}\right\}\right]{\psi }_{j,k}-{\sum }_{j={j}_{0}}^{{j}_{1}}{\sum }_{k\in {I}_{j}}{\beta }_{j,k}\left[I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}+I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {B}_{j}^{\prime }\right\}\right]{\psi }_{j,k}={e}_{bs}+{e}_{bb}-{e}_{ss}-{e}_{sb}$, where

$\begin{array}{c}{e}_{bs}:=\sum _{j={j}_{0}}^{{j}_{1}}\sum _{k\in {I}_{j}}\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)I\left\{k\in {\stackrel{ˆ}{B}}_{j}\cap {S}_{j}\right\}{\psi }_{j,k},\phantom{\rule{2em}{0ex}}{e}_{sb}:=\sum _{j={j}_{0}}^{{j}_{1}}\sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {B}_{j}^{\prime }\right\}{\psi }_{j,k},\hfill \\ {e}_{bb}:=\sum _{j={j}_{0}}^{{j}_{1}}\sum _{k\in {I}_{j}}\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)I\left\{k\in {\stackrel{ˆ}{B}}_{j}\cap {B}_{j}\right\}{\psi }_{j,k},\phantom{\rule{2em}{0ex}}{e}_{ss}:=\sum _{j={j}_{0}}^{{j}_{1}}\sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}.\hfill \end{array}$

In order to conclude Theorem 3.1, one need only show that

$E{\parallel {e}_{bs}\parallel }_{p}^{p}+E{\parallel {e}_{sb}\parallel }_{p}^{p}+E{\parallel {e}_{bb}\parallel }_{p}^{p}+E{\parallel {e}_{ss}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}.$
(24)

By (16), ${j}_{1}-{j}_{0}\sim lnn$ and ${\parallel {\sum }_{j={j}_{0}}^{{j}_{1}}{g}_{j}\parallel }_{p}^{p}\lesssim {\left({j}_{1}-{j}_{0}+1\right)}^{p-1}{\sum }_{j={j}_{0}}^{{j}_{1}}{\parallel {g}_{j}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{p-1}{\sum }_{j={j}_{0}}^{{j}_{1}}{\parallel {g}_{j}\parallel }_{p}^{p}$. This with Lemma 1.1 shows that, for $\stackrel{ˆ}{f}\left(x\right)={\sum }_{j={j}_{0}}^{{j}_{1}}{\sum }_{k\in {I}_{j}}{\stackrel{ˆ}{f}}_{j,k}{\psi }_{j,k}\left(x\right)$, there exists $\theta >0$ such that

$E{\parallel \stackrel{ˆ}{f}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }\underset{{j}_{0}\le j\le {j}_{1}}{sup}{2}^{j\left(\frac{p}{2}-1\right)}\sum _{k\in {I}_{j}}E{|{\stackrel{ˆ}{f}}_{j,k}|}^{p}.$
(25)

To estimate $E{\parallel {e}_{bs}\parallel }_{p}^{p}$, one takes ${\stackrel{ˆ}{f}}_{j,k}:=\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)I\left\{k\in {\stackrel{ˆ}{B}}_{j}\cap {S}_{j}\right\}$. Then, for each $k\in {\stackrel{ˆ}{B}}_{j}\cap {S}_{j}$, $|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|\ge |{\stackrel{ˆ}{\beta }}_{j,k}|-|{\beta }_{j,k}|>\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}$ and ${\stackrel{ˆ}{B}}_{j}\cap {S}_{j}\subset {\stackrel{ˆ}{D}}_{j}:=\left\{k:|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|>\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}\right\}$. This with the Hölder inequality shows

$E{|{\stackrel{ˆ}{f}}_{j,k}|}^{p}\le E{|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|}^{p}I\left\{k\in {\stackrel{ˆ}{D}}_{j}\right\}\le {\left(E{|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|}^{2p}\right)}^{\frac{1}{2}}{\left(EI\left\{k\in {\stackrel{ˆ}{D}}_{j}\right\}\right)}^{\frac{1}{2}}.$

Clearly, $EI\left\{k\in {\stackrel{ˆ}{D}}_{j}\right\}=P\left\{|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|>\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}\right\}$. Furthermore, using $|{I}_{j}|\lesssim {2}^{j}$, Lemma 2.1, and Lemma 3.1, one obtains ${\sum }_{k\in {I}_{j}}E{|{\stackrel{ˆ}{f}}_{j,k}|}^{p}\lesssim {2}^{j}{n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}{2}^{-\frac{{c}_{0}T}{2}j}$. Moreover, by (25),

$E{\parallel {e}_{bs}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }\underset{{j}_{0}\le j\le {j}_{1}}{sup}{n}^{-\frac{p}{2}}{2}^{j\left(\frac{p}{2}+\alpha p+dp-\frac{{c}_{0}T}{2}\right)}.$

Choose $T\ge \frac{p+2\alpha p+2dp}{{c}_{0}}$. Then $\frac{p}{2}+\alpha p+dp-\frac{{c}_{0}T}{2}\le 0$, and ${sup}_{{j}_{0}\le j\le {j}_{1}}{n}^{-\frac{p}{2}}{2}^{j\left(\frac{p}{2}+\alpha p+dp-\frac{{c}_{0}T}{2}\right)}\lesssim {n}^{-\frac{p}{2}}{2}^{{j}_{0}\left(\frac{p}{2}+\alpha p+dp-\frac{{c}_{0}T}{2}\right)}\lesssim {n}^{-\frac{p}{2}}{2}^{{j}_{0}\frac{p}{2}}{2}^{{j}_{0}p\left(\alpha +d\right)}$. Similar to (23), one has

$E{\parallel {e}_{bs}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}.$
(26)

In the proof of (26), one needs to choose ${T}_{0}\ge {c}_{0}^{-1}\left(p+2\alpha p+2dp\right)$.

Now, one considers $E{\parallel {e}_{sb}\parallel }_{p}^{p}$: For $k\in {\stackrel{ˆ}{S}}_{j}\cap {B}_{j}^{\prime }$, $|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|\ge |{\beta }_{j,k}|-|{\stackrel{ˆ}{\beta }}_{j,k}|>T{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}$ and ${\stackrel{ˆ}{S}}_{j}\cap {B}_{j}^{\prime }\subset {\stackrel{ˆ}{D}}_{j}:=\left\{k:|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|>\frac{T}{2}{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}\right\}$. By Lemma 3.1, $EI\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {B}_{j}^{\prime }\right\}\le EI\left\{k\in {\stackrel{ˆ}{D}}_{j}\right\}\lesssim {2}^{-{c}_{0}Tj}$. Since ${f}_{X}^{\left(d\right)}\in {B}_{r,q}^{s}\left(\mathbb{R}\right)\subset {B}_{p,q}^{{s}^{\prime }}\left(\mathbb{R}\right)$, ${\parallel {\beta }_{j,\cdot }\parallel }_{p}:={\parallel 〈{f}_{X}^{\left(d\right)},{\psi }_{j,\cdot }〉\parallel }_{p}\lesssim {\parallel {f}_{X}^{\left(d\right)}\parallel }_{{B}_{p,q}^{{s}^{\prime }}}{2}^{-j\left({s}^{\prime }+1/2-1/p\right)}$ and

$\sum _{k\in {I}_{j}}{|{\beta }_{j,k}|}^{p}EI\left\{k\in \stackrel{ˆ}{{S}_{j}}\cap {B}_{j}^{\prime }\right\}\lesssim {\parallel {\beta }_{j,\cdot }\parallel }_{p}^{p}{2}^{-{c}_{0}Tj}\lesssim {\parallel {f}_{X}^{\left(d\right)}\parallel }_{{B}_{p,q}^{{s}^{\prime }}}^{p}{2}^{-j\left({s}^{\prime }p+\frac{p}{2}-1+{c}_{0}T\right)}.$

Moreover, it follows from the definition of ${e}_{sb}$ and (25) that

$E{\parallel {e}_{sb}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }\underset{{j}_{0}\le j\le {j}_{1}}{sup}{2}^{-j\left({s}^{\prime }p+{c}_{0}T\right)}\le {\left(lnn\right)}^{\theta }{2}^{-{j}_{0}\left({s}^{\prime }p+{c}_{0}T\right)}.$

By (16), one can choose $T\ge \frac{{j}_{1}-{j}_{0}}{{c}_{0}{j}_{0}}{s}^{\prime }p$ (independent of n) so that ${j}_{1}{s}^{\prime }p\le {j}_{0}\left({s}^{\prime }p+{c}_{0}T\right)$. Hence, the above inequality reduces to $E{\parallel {e}_{sb}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{2}^{-{j}_{1}{s}^{\prime }p}$. Arguments similar to those for (22) lead to

$E{\parallel {e}_{sb}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}.$
(27)

In the above proof, one needs to choose ${T}_{0}\ge {\left({c}_{0}{j}_{0}\right)}^{-1}\left({j}_{1}-{j}_{0}\right){s}^{\prime }p$.

For $E{\parallel {e}_{bb}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}$, one uses $|{I}_{j}|\lesssim {2}^{j}$, (25), and Lemma 2.1 to find

$\mathrm{I}:=E{\parallel \sum _{j={j}_{0}}^{{j}_{0}\left(s,r,q\right)}\sum _{k\in {I}_{j}}\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)I\left\{k\in {\stackrel{ˆ}{B}}_{j}\cap {B}_{j}\right\}{\psi }_{j,k}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }\underset{{j}_{0}\le j\le {j}_{0}\left(s,r,q\right)}{sup}{n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d+\frac{1}{2}\right)}.$

Recall that ${j}_{0}\le {j}_{0}\left(s,r,q\right)$. Then $\mathrm{I}\lesssim {\left(lnn\right)}^{\theta }{\left({n}^{-1}{2}^{{j}_{0}\left(s,r,q\right)}\right)}^{\frac{p}{2}}{2}^{{j}_{0}\left(s,r,q\right)p\left(\alpha +d\right)}$, which reduces to $\mathrm{I}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}$ by arguments similar to those for (23). It remains to show

$\mathrm{II}:=E{\parallel \sum _{j={j}_{0}\left(s,r,q\right)}^{{j}_{1}}\sum _{k\in {I}_{j}}\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)I\left\{k\in \stackrel{ˆ}{{B}_{j}}\cap {B}_{j}\right\}{\psi }_{j,k}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}.$
(28)

By Lemma 2.1 and the definition of ${B}_{j}$, ${\sum }_{k\in {I}_{j}}E{|\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)I\left\{k\in {\stackrel{ˆ}{B}}_{j}\cap {B}_{j}\right\}|}^{p}\le {\sum }_{k\in {I}_{j}}E{|{\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}|}^{p}I\left\{k\in {B}_{j}\right\}\lesssim {n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}{\sum }_{k\in {B}_{j}}{|2{\beta }_{j,k}{T}^{-1}{2}^{-j\left(\alpha +d\right)}\sqrt{n/j}|}^{r}$. According to Lemma 1.2, ${\parallel {\beta }_{j,\cdot }\parallel }_{r}^{r}\lesssim {\parallel {f}_{X}^{\left(d\right)}\parallel }_{{B}_{r,q}^{s}}^{r}{2}^{-j\left(sr+r/2-1\right)}$. Hence,

$\sum _{k\in {I}_{j}}E{|\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)I\left\{k\in \stackrel{ˆ}{{B}_{j}}\cap {B}_{j}\right\}|}^{p}\lesssim {n}^{-\frac{p-r}{2}}{\left(lnn\right)}^{\theta }{2}^{-j\left(sr+\frac{r}{2}-1-\alpha p-dp+\alpha r+dr\right)}.$

Combining this above inequality with (25), one obtains

$\mathrm{II}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p-r}{2}}\underset{{j}_{0}\left(s,r,q\right)\le j\le {j}_{1}}{sup}{2}^{j\omega }:={A}_{n},$

where $\omega :=-sr+\left(\alpha +d+\frac{1}{2}\right)\left(p-r\right)=-sr-\frac{r}{2}+\frac{p}{2}+\alpha p+dp-\alpha r-dr$ as defined in (19).

When $\omega \le 0$, ${A}_{n}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p-r}{2}}{2}^{{j}_{0}\left(s,r,q\right)\omega }$. By the choice ${2}^{{j}_{0}\left(s,r,q\right)}\sim {n}^{\frac{1-2\mu }{2\alpha +2d+1}}$,

${A}_{n}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p-r}{2}}{n}^{\frac{1-2\mu }{2\alpha +2d+1}\left[-sr+\left(\alpha +d+1/2\right)\left(p-r\right)\right]}={\left(lnn\right)}^{\theta }{n}^{-\mu p}{n}^{r\left(\mu -\frac{1-2\mu }{2\alpha +2d+1}s\right)}.$

Since $\mu =\frac{s}{2s+2\alpha +2d+1}$ for $\omega \le 0$, $\mu -\frac{1-2\mu }{2\alpha +2d+1}s=0$ and ${A}_{n}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}$. Then one obtains the desired inequality (28).
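The cancellation used in the last step can be checked in one line: for $\omega \le 0$,

```latex
\mu=\frac{s}{2s+2\alpha+2d+1}
\;\Longrightarrow\;
1-2\mu=\frac{2\alpha+2d+1}{2s+2\alpha+2d+1}
\;\Longrightarrow\;
\frac{1-2\mu}{2\alpha+2d+1}\,s=\frac{s}{2s+2\alpha+2d+1}=\mu,
```

so the factor ${n}^{r\left(\mu -\frac{1-2\mu }{2\alpha +2d+1}s\right)}$ equals ${n}^{0}=1$.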

When $\omega >0$, $r<\frac{\left(2\alpha +2d+1\right)p}{2s+2\alpha +2d+1}$ and $\mu =\frac{{s}^{\prime }}{2\left(s-1/r\right)+2\alpha +2d+1}$. Take

${p}_{1}=\frac{2\alpha p+2dp+p-2}{2\left(s-\frac{1}{r}\right)+2\alpha +2d+1}.$

Then $\frac{p-{p}_{1}}{2}=\mu p$ and $r<{p}_{1}$ in that case. With ${\stackrel{ˆ}{f}}_{j,k}=\left({\stackrel{ˆ}{\beta }}_{j,k}-{\beta }_{j,k}\right)I\left\{k\in \stackrel{ˆ}{{B}_{j}}\cap {B}_{j}\right\}$, one knows from Lemma 2.1 and the definition of ${B}_{j}$ that
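The identity $\frac{p-{p}_{1}}{2}=\mu p$ follows directly from the definitions of ${p}_{1}$ and μ:

```latex
\frac{p-p_1}{2}
  =\frac{p\left[2(s-\frac{1}{r})+2\alpha+2d+1\right]-(2\alpha p+2dp+p-2)}
         {2\left[2(s-\frac{1}{r})+2\alpha+2d+1\right]}
  =\frac{p(s-\frac{1}{r})+1}{2(s-\frac{1}{r})+2\alpha+2d+1}
  =\mu p,
```

because $p(s-\frac{1}{r})+1=p(s-\frac{1}{r}+\frac{1}{p})=p{s}^{\prime }$ and $\mu =\frac{{s}^{\prime }}{2(s-1/r)+2\alpha +2d+1}$.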

$\sum _{k\in {I}_{j}}E{|{\stackrel{ˆ}{f}}_{j,k}|}^{p}\le {n}^{-\frac{p}{2}}{2}^{jp\left(\alpha +d\right)}\sum _{k\in {B}_{j}}{|\frac{2{\beta }_{j,k}}{T}{2}^{-j\left(\alpha +d\right)}\sqrt{\frac{n}{j}}|}^{{p}_{1}}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p-{p}_{1}}{2}}{2}^{j\left(p-{p}_{1}\right)\left(\alpha +d\right)}{\parallel {\beta }_{j,\cdot }\parallel }_{{p}_{1}}^{{p}_{1}}.$

Since $r<{p}_{1}$, ${\parallel {\beta }_{j,\cdot }\parallel }_{{p}_{1}}\le {\parallel {\beta }_{j,\cdot }\parallel }_{r}$. This with (25) leads to

$E{\parallel {e}_{bb}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p-{p}_{1}}{2}}\underset{{j}_{0}\le j\le {j}_{1}}{sup}{\left[{2}^{j\left(\frac{p-2+2\alpha p+2dp}{2{p}_{1}}\right)}{2}^{-j\left(\alpha +d\right)}{\parallel {\beta }_{j,\cdot }\parallel }_{r}\right]}^{{p}_{1}}.$

Note that ${2}^{j\frac{p-2+2\alpha p+2dp}{2{p}_{1}}}{2}^{-j\left(\alpha +d\right)}={2}^{j\left(s-\frac{1}{r}+\frac{1}{2}\right)}$ due to the definition of ${p}_{1}$; moreover, ${f}_{X}^{\left(d\right)}\in {B}_{r,q}^{s}\left(a,b,L\right)$ implies ${2}^{j\left(s-\frac{1}{r}+\frac{1}{2}\right)}{\parallel {\beta }_{j,\cdot }\parallel }_{r}\lesssim {\parallel {f}_{X}^{\left(d\right)}\parallel }_{{B}_{r,q}^{s}}$. Then

$E{\parallel {e}_{bb}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p-{p}_{1}}{2}}={\left(lnn\right)}^{\theta }{n}^{-\mu p}.$
(29)

Finally, one estimates $E{\parallel {e}_{ss}\parallel }_{p}^{p}$: Define ${\stackrel{ˆ}{f}}_{j,k}:={\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}$. Then

$\sum _{k\in {I}_{j}}{|{\stackrel{ˆ}{f}}_{j,k}|}^{p}\le \sum _{k\in {S}_{j}^{\prime }}{|{\beta }_{j,k}|}^{p-r}{|{\beta }_{j,k}|}^{r}\le {\left(2T{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}\right)}^{p-r}{2}^{-j\left(s+\frac{1}{2}-\frac{1}{r}\right)r}{\parallel {f}_{X}^{\left(d\right)}\parallel }_{{B}_{r,q}^{s}}^{r}$

due to $r\le p$ and the definition of ${S}_{j}^{\prime }$. Using (25) and $\omega :=-sr+\left(\alpha +d+\frac{1}{2}\right)\left(p-r\right)$ in (19), one obtains

$E{\parallel {e}_{ss}\parallel }_{p}^{p}=E{\parallel \sum _{j={j}_{0}}^{{j}_{1}}\sum _{k\in {I}_{j}}{\stackrel{ˆ}{f}}_{j,k}{\psi }_{j,k}\parallel }_{p}^{p}\lesssim \left\{\begin{array}{ll}{\left(lnn\right)}^{\theta }{2}^{{j}_{0}\omega }{n}^{-\frac{p-r}{2}},& \omega \le 0,\\ {\left(lnn\right)}^{\theta }{2}^{{j}_{1}\omega }{n}^{-\frac{p-r}{2}},& \omega >0.\end{array}$
(30)

When $\omega \le 0$, $E{\parallel {\sum }_{j={j}_{0}\left(s,r,q\right)}^{{j}_{1}}{\sum }_{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p-r}{2}}{2}^{{j}_{0}\left(s,r,q\right)\omega }$. Recall that, for $\omega \le 0$, $\mu =\frac{s}{2s+2\alpha +2d+1}$ and ${2}^{{j}_{0}\left(s,r,q\right)}\sim {n}^{\frac{1-2\mu }{2\alpha +2d+1}}$. Then it can be checked that ${2}^{{j}_{0}\left(s,r,q\right)\omega }{n}^{-\frac{p-r}{2}}={n}^{-\mu p}$. Hence,
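The identity ${2}^{{j}_{0}\left(s,r,q\right)\omega }{n}^{-\frac{p-r}{2}}={n}^{-\mu p}$ can be verified directly: with $D:=2s+2\alpha +2d+1$ and $1-2\mu =\frac{2\alpha +2d+1}{D}$,

```latex
2^{j_0(s,r,q)\,\omega}\, n^{-\frac{p-r}{2}}
  \sim n^{\frac{-sr+(\alpha+d+\frac{1}{2})(p-r)}{D}-\frac{p-r}{2}}
  = n^{\frac{-sr-s(p-r)}{D}}
  = n^{-\frac{sp}{D}}
  = n^{-\mu p},
```

where the middle step uses $\frac{(\alpha +d+\frac{1}{2})(p-r)}{D}-\frac{p-r}{2}=\frac{(p-r)(2\alpha +2d+1-D)}{2D}=-\frac{s(p-r)}{D}$.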

$E{\parallel \sum _{j={j}_{0}\left(s,r,q\right)}^{{j}_{1}}\sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}.$
(31)

On the other hand, Lemma 1.1 shows that

$E{\parallel \sum _{j={j}_{0}}^{{j}_{0}\left(s,r,q\right)}\sum _{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }}{\beta }_{j,k}{\psi }_{j,k}\parallel }_{p}^{p}\lesssim \sum _{j={j}_{0}}^{{j}_{0}\left(s,r,q\right)}{2}^{j\left(\frac{p}{2}-1\right)}\sum _{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }}E{|{\beta }_{j,k}|}^{p}.$

Since $k\in {S}_{j}^{\prime }$, $|{\beta }_{j,k}|\le 2T{2}^{j\left(\alpha +d\right)}\sqrt{\frac{j}{n}}$ and ${\sum }_{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }}{|{\beta }_{j,k}|}^{p}\lesssim {\sum }_{k\in {I}_{j}}{2}^{jp\left(\alpha +d\right)}{\left(lnn\right)}^{\theta }{n}^{-\frac{p}{2}}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p}{2}}{2}^{j}{2}^{jp\left(\alpha +d\right)}$. Moreover,

$\begin{array}{r}E{\parallel \sum _{j={j}_{0}}^{{j}_{0}\left(s,r,q\right)}\sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}^{p}\\ \phantom{\rule{1em}{0ex}}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p}{2}}\underset{{j}_{0}\le j\le {j}_{0}\left(s,r,q\right)}{sup}{2}^{j\left(\frac{p}{2}-1\right)}{2}^{j}{2}^{jp\left(\alpha +d\right)}\\ \phantom{\rule{1em}{0ex}}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\frac{p}{2}}{2}^{{j}_{0}\left(s,r,q\right)\left(\alpha +d+\frac{1}{2}\right)p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p},\end{array}$
(32)

where the last inequality comes from the choice ${2}^{{j}_{0}\left(s,r,q\right)}\sim {n}^{\frac{1-2\mu }{2\alpha +2d+1}}$. Combining this with (31), one has, for $\omega \le 0$,

$E{\parallel {e}_{ss}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}.$
(33)

It remains to show $E{\parallel {e}_{ss}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}$ for $\omega >0$. By Lemma 1.1,

$\begin{array}{rcl}{\parallel \sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}& \sim & {2}^{j\left(\frac{1}{2}-\frac{1}{p}\right)}{\left(\sum _{k\in {I}_{j}}{|{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}|}^{p}\right)}^{1/p}\\ & \le & {2}^{j\left(\frac{1}{2}-\frac{1}{p}\right)}{\parallel {\beta }_{j,\cdot }\parallel }_{{l}_{p}}.\end{array}$

This with Lemma 1.2 shows that

$\begin{array}{r}{\left[\sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{\left({2}^{j{s}^{\prime }}{\parallel \sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}\right)}^{q}\right]}^{1/q}\\ \phantom{\rule{1em}{0ex}}\le {\left[\sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{\left({2}^{j\left({s}^{\prime }+\frac{1}{2}-\frac{1}{p}\right)}{\parallel {\beta }_{j,\cdot }\parallel }_{{l}_{p}}\right)}^{q}\right]}^{1/q}\le {\parallel {f}_{X}^{\left(d\right)}\parallel }_{{B}_{p,q}^{{s}^{\prime }}}.\end{array}$
(34)

When $q=1$,

$\begin{array}{r}\sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{\parallel \sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}\\ \phantom{\rule{1em}{0ex}}\lesssim \sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{2}^{j{s}^{\prime }}{\parallel \sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}{2}^{-{j}_{1}\left(s,r,q\right){s}^{\prime }}\lesssim {2}^{-{j}_{1}\left(s,r,q\right){s}^{\prime }}.\end{array}$

When $q=+\mathrm{\infty }$,

$\begin{array}{r}\sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{\parallel \sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}\\ \phantom{\rule{1em}{0ex}}\lesssim \sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{2}^{-j{s}^{\prime }}{\parallel {f}_{X}^{\left(d\right)}\parallel }_{{B}_{p,q}^{{s}^{\prime }}}\lesssim {\left(lnn\right)}^{\theta }{2}^{-{j}_{1}\left(s,r,q\right){s}^{\prime }}.\end{array}$

For $1<q<\mathrm{\infty }$, by the Hölder inequality,

$\begin{array}{r}\sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{\parallel \sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}\\ \phantom{\rule{1em}{0ex}}\lesssim {\left[\sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{2}^{-j{s}^{\prime }{q}^{\prime }}\right]}^{\frac{1}{{q}^{\prime }}}{\left[\sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}{\left({2}^{j{s}^{\prime }}{\parallel \sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}\right)}^{q}\right]}^{\frac{1}{q}}\\ \phantom{\rule{1em}{0ex}}\lesssim {2}^{-{j}_{1}\left(s,r,q\right){s}^{\prime }}.\end{array}$

Using (34) and the choice ${2}^{{j}_{1}\left(s,r,q\right)}\sim {n}^{\frac{\mu }{{s}^{\prime }}}$, one obtains

$E{\parallel \sum _{j={j}_{1}\left(s,r,q\right)}^{{j}_{1}}\sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}.$

On the other hand,

$E{\parallel \sum _{j={j}_{0}}^{{j}_{1}\left(s,r,q\right)}\sum _{k\in {I}_{j}}{\beta }_{j,k}I\left\{k\in {\stackrel{ˆ}{S}}_{j}\cap {S}_{j}^{\prime }\right\}{\psi }_{j,k}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{2}^{{j}_{1}\left(s,r,q\right)\omega }{n}^{-\frac{p-r}{2}}$

thanks to (30). According to the choice of ${2}^{{j}_{1}\left(s,r,q\right)}\sim {n}^{\frac{\mu }{{s}^{\prime }}}$ and

$\mu =\frac{s-1/r+1/p}{2\left(s-1/r\right)+2\alpha +2d+1}=\frac{{s}^{\prime }}{2\left(s-1/r\right)+2\alpha +2d+1}$

for $\omega >0$, one finds ${2}^{{j}_{1}\left(s,r,q\right)\omega }{n}^{-\frac{p-r}{2}}\lesssim {n}^{-\mu p}$ by direct computations. Hence, $E{\parallel {e}_{ss}\parallel }_{p}^{p}\lesssim {\left(lnn\right)}^{\theta }{n}^{-\mu p}$ in each case, i.e., (33) holds for $\omega >0$ as well. Now, (24) follows from (26), (27), (29), and (33). The proof is done. □
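The direct computation behind this final step is the following: with ${D}^{\prime }:=2(s-\frac{1}{r})+2\alpha +2d+1$, one has $\frac{\mu }{{s}^{\prime }}=\frac{1}{{D}^{\prime }}$ and

```latex
2^{j_1(s,r,q)\,\omega}\, n^{-\frac{p-r}{2}}
  \sim n^{\frac{-sr+(\alpha+d+\frac{1}{2})(p-r)}{D'}-\frac{p-r}{2}}
  = n^{\frac{-sr-(s-\frac{1}{r})(p-r)}{D'}}
  = n^{-\frac{p s'}{D'}}
  = n^{-\mu p},
```

since $sr+(s-\frac{1}{r})(p-r)=sp-\frac{p}{r}+1=p\left(s-\frac{1}{r}+\frac{1}{p}\right)=p{s}^{\prime }$.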

Remark 3.1 It follows from Theorem 2.1 and Theorem 3.1 that the nonlinear wavelet estimator converges faster than the linear one for $r\le p$. Moreover, the nonlinear estimator is adaptive, while the linear one is not.

Remark 3.2 This paper studies wavelet estimations of a density and its derivatives with Fourier-oscillating noises. The remaining problems include the optimality of the above estimations, numerical experiments, and the corresponding regression problems. We shall investigate these problems in the future.

## References

1. Walter GG: Density estimation in the presence of noise. Stat. Probab. Lett. 1999, 41: 237–246. 10.1016/S0167-7152(98)00160-6

2. Pensky M, Vidakovic B: Adaptive wavelet estimator for nonparametric density deconvolution. Ann. Stat. 1999, 27: 2033–2053. 10.1214/aos/1017939249

3. Fan J: On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Stat. 1991, 19: 1257–1272. 10.1214/aos/1176348248

4. Fan J, Koo J: Wavelet deconvolution. IEEE Trans. Inf. Theory 2002, 48: 734–747. 10.1109/18.986021

5. Lounici K, Nickl R: Global uniform risk bounds for wavelet deconvolution estimators. Ann. Stat. 2011, 39: 201–231. 10.1214/10-AOS836

6. Li R, Liu Y: Wavelet optimal estimations for a density with some additive noises. Appl. Comput. Harmon. Anal. 2014, 36: 416–433. 10.1016/j.acha.2013.07.002

7. Sun, J, Morrison, H, Harding, P, Woodroofe, M: Density and mixture estimation from data with measurement errors. Technical report (2002)

8. Devroye L: Consistent deconvolution in density estimation. Can. J. Stat. 1989, 17: 235–239. 10.2307/3314852

9. Hall P, Meister A: A ridge-parameter approach to deconvolution. Ann. Stat. 2007, 35: 1535–1558. 10.1214/009053607000000028

10. Meister A: Deconvolution from Fourier-oscillating error densities under decay and smoothness restrictions. Inverse Probl. 2008, 24: 1–14.

11. Delaigle A, Meister A: Nonparametric function estimation under Fourier-oscillating noise. Stat. Sin. 2011, 21: 1065–1092. 10.5705/ss.2009.082

12. Donoho DL, Johnstone IM, Kerkyacharian G, Picard D: Density estimation by wavelet thresholding. Ann. Stat. 1996, 24: 508–539.

13. Hernández E, Weiss G: A First Course on Wavelets. CRC Press, Boca Raton; 1996.

14. Daubechies I: Ten Lectures on Wavelets. SIAM, Philadelphia; 1992.

15. Härdle W, Kerkyacharian G, Picard D, Tsybakov A: Wavelets, Approximation, and Statistical Applications. Lecture Notes in Statistics. Springer, New York; 1998.

## Acknowledgements

This paper is supported by the National Natural Science Foundation of China (No. 11271038).

## Author information


### Corresponding author

Correspondence to Youming Liu.

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

All authors contributed equally and significantly in writing this article. All authors read and approved the final manuscript.
