In this section, we introduce the concepts needed for the general framework of the paper. Section 2.1 presents preliminary assumptions and notation for copula functions. The main definitions and facts about the wavelets used in the paper are presented in Sect. 2.2. At the end of the section we introduce the estimators used in the main result.
2.1 Copula function
Copulas were first introduced in [22] and have since been developed rapidly in many fields. Here we give a brief definition of a copula function in the two-dimensional case. Any extension of the results to higher dimensions is straightforward. Consider a random vector \((T_{1},T_{2})\) with cumulative distribution function \(F(t_{1},t_{2})= P(T_{1}\leq t_{1},T_{2}\leq t_{2})\) and marginal distribution functions \(F_{1}\) and \(F_{2}\). The dependence between these variables is of interest, and it is captured by the copula function C, which satisfies
$$\begin{aligned} F(t_{1},t_{2})= C\bigl(F_{1}(t_{1}),F_{2}(t_{2}) \bigr). \end{aligned}$$
The copula C has uniform marginals on (0,1), and if \(F_{1}\) and \(F_{2}\) are continuous, then C is unique and coincides with the distribution function of the pair \((U,V)=(F_{1}(T_{1}),F_{2}(T_{2}))\). In practice, F is unknown. The advantage of using a copula is that the joint distribution function \(F(t_{1},t_{2})\) can be constructed from the marginals \(F_{1}\) and \(F_{2}\) even when they come from different classes of distributions. Let \((T_{11},T_{21}),\ldots,(T_{1n},T_{2n})\) be a random sample from the unknown distribution F. Denote by \(F_{1n}\) and \(F_{2n}\) the empirical distribution functions associated with \(F_{1}\) and \(F_{2}\). A first step in selecting an appropriate class of copulas consists of plotting the pairs \((\frac{R_{i}}{n}, \frac{S_{i}}{n}) = (F_{1n}(T_{1i}),F_{2n}(T_{2i}))\), \(i=1,\ldots,n\).
Here \(R_{i}\) is the rank of \(T_{1i}\) among \(T_{11},\ldots,T_{1n}\), and \(S_{i}\) is the rank of \(T_{2i}\) among \(T_{21},\ldots,T_{2n}\). The motivation behind this approach is that the pseudo-observations \((R_{i}/n,S_{i}/n)\) are close substitutes for the unobservable pairs \((U_{i},V_{i})= (F_{1} (T_{1i}),F_{2} (T_{2i}))\), which form a random sample from C. We denote by \(c(u,v)\) the density of \(C(u,v)\),
$$\begin{aligned} c(u,v)= \frac{\partial ^{2} C(u,v)}{\partial u \partial v}, \quad u,v\in (0,1). \end{aligned}$$
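The rank-based pseudo-observations \((R_{i}/n,S_{i}/n)\) described above can be sketched in a few lines. This is a minimal illustration that assumes uncensored data without ties; the function name `pseudo_observations` and the toy sample are hypothetical and not part of the paper.

```python
import numpy as np

def pseudo_observations(t1, t2):
    """Return the pairs (R_i / n, S_i / n), assuming no ties in either sample."""
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    n = t1.size
    # argsort of argsort gives 0-based ranks; add 1 to obtain ranks in 1..n
    r = np.argsort(np.argsort(t1)) + 1
    s = np.argsort(np.argsort(t2)) + 1
    return r / n, s / n

# Toy sample (hypothetical values, for illustration only)
u, v = pseudo_observations([2.3, 0.7, 5.1, 1.9], [10.0, 30.0, 20.0, 40.0])
```

Plotting `u` against `v` gives the scatter plot of pseudo-observations mentioned in the text.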
In the analysis of real data, one of the variables (or even both of them) may be subject to censoring, in which case one observes only the minimum of the variable and another (censoring) random variable, denoted by \(C_{j}\), so that \(Y_{j} = \min (T_{j},C_{j})\) and \(\delta _{j} = I_{T_{j} \leq C_{j}}\) for \(j=1,2\). The i.i.d. replications \((Y_{1i},Y _{2i},\delta _{1i},\delta _{2i})_{1\leq i\leq n}\) form the random sample of \((Y_{1},Y_{2},\delta _{1},\delta _{2})\). Many different estimators of a distribution function can be considered under censoring. The one used in this paper has the form
$$\begin{aligned} \hat{F}_{n}(t_{1},t_{2}) = \frac{1}{n} \sum _{i=1}^{n} W_{in}^{(\cdot )} I (Y _{1i}\leq t_{1},T_{2i} \leq t_{2}), \end{aligned}$$
where \(W_{in}\) are random weights designed to compensate asymptotically for the bias caused by censoring. They can take three different forms; see [15]. The weight considered in the present paper is
$$\begin{aligned} W_{in} = \frac{\delta _{1i}}{1-\hat{G}(Y_{1i}^{-})} . \end{aligned}$$
In this form only \(T_{1}\) is assumed to be censored, so that \(Y_{2} = T_{2}\) and \(\delta _{2} = 1\) a.s., and \(C_{1}\) is independent of \(T_{1}\). Also Ĝ, the Kaplan–Meier estimator of the distribution function of the censoring variable, is defined as \(\hat{G}(t) = 1- \prod_{i:Y_{1i}\leq t} (1- \frac{1}{ \sum_{j=1}^{n}I(Y_{1j}\geq Y_{1i})})^{1-\delta _{1i}}\). Introducing \(G(t)= P(C_{1}\leq t)\), the weight \(W_{in}\) can be seen as an approximation of \(W_{i} = \frac{\delta _{1i}}{1-G(Y_{1i}^{-})} \). The other weights are discussed in [15], where the authors take \(W_{in}= \delta _{1i} \hat{g}(Y_{1i})\). Here ĝ is a consistent estimator, computed from the data, of a limit function g that satisfies the condition \(E[\delta _{1} g(Y_{1}) \phi (Y_{1},T_{2})] = E[ \phi (T_{1},T_{2})]\) for all \(\phi \in L^{1}\).
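To make the weight \(W_{in}\) concrete, the following sketch computes \(\hat{G}\) at the left limits \(Y_{1i}^{-}\) and then the weights \(\delta _{1i}/(1-\hat{G}(Y_{1i}^{-}))\). It assumes there are no ties among the \(Y_{1i}\); the function name `km_censoring_weights` and the toy data are hypothetical.

```python
import numpy as np

def km_censoring_weights(y, delta):
    """W_in = delta_i / (1 - G_hat(Y_i^-)), assuming no ties among the Y_i."""
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = y.size
    order = np.argsort(y)
    d_sorted = delta[order]
    # number of observations with Y_j >= Y_(i), for the sorted sample
    at_risk = n - np.arange(n)
    # factor (1 - 1/at_risk)^(1 - delta): only censored points contribute
    factors = np.where(d_sorted == 0, 1.0 - 1.0 / at_risk, 1.0)
    surv = np.cumprod(factors)                      # 1 - G_hat(Y_(i))
    surv_left = np.concatenate(([1.0], surv[:-1]))  # 1 - G_hat(Y_(i)^-)
    w_sorted = np.zeros(n)
    uncensored = d_sorted == 1
    w_sorted[uncensored] = 1.0 / surv_left[uncensored]
    w = np.empty(n)
    w[order] = w_sorted                             # back to the original order
    return w

# Toy data: the second observation is censored
w = km_censoring_weights([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1])
```

In this toy example the censored observation receives weight zero and its mass is redistributed to the larger uncensored observations, so the weights still sum to n.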
2.2 Wavelets
Wavelets and their applications are still an important subject in statistics. The term wavelet is used to refer to a set of orthonormal basis functions generated by dilation and translation of a compactly supported scaling function (father wavelet) ϕ and a mother wavelet ψ associated with an r-regular (\(r > 0\)) multiresolution analysis of \(L_{2}(\mathbb{R})\), the space of square-integrable functions on the line. Define \(\phi _{j,k}(x)= 2^{j/2} \phi (2^{j}x-k)\) and \(\psi _{j,k}(x)= 2^{j/2} \psi (2^{j}x-k)\) for \(j\in \mathbb{N}\) and \(k=(k_{1},k_{2})\in {\mathbb{Z}}^{2}\). It is assumed that \(j\geq j _{0}\) for some coarse scale \(j_{0}\in \mathbb{N}\), which we take as l. We suppose that ϕ and ψ are bounded and compactly supported. For more on wavelets, see [8] and [18]. The wavelet expansion for \(f(x,y)\) can be written as
$$\begin{aligned} f(x,y)=f_{j_{0}}(x,y)+D_{j_{0}}f(x,y), \quad x,y \in \mathbb{R}, \end{aligned}$$
(1)
where \(f_{j_{0}}(x,y)= \sum_{k \in \mathbb{Z}^{2}}\alpha _{j_{0}k} \phi _{j_{0}k}(x,y)\) is the trend (coarse approximation) of f at scale \(j_{0}\), and the detail part sums over all finer levels \(j\geq j_{0}\):
$$\begin{aligned} D_{j_{0}}f(x,y)= \sum_{j=j_{0}}^{\infty } \biggl(\sum_{k\in \mathbb{Z}^{2}} \beta _{jk}^{(1)} \psi _{jk}^{(1)}(x,y)+\sum_{k\in \mathbb{Z} ^{2}} \beta _{jk}^{(2)}\psi _{jk}^{(2)}(x,y)+ \sum _{k\in \mathbb{Z}^{2}}\beta _{jk}^{(3)}\psi _{jk}^{(3)}(x,y)\biggr). \end{aligned}$$
For more details about \(D_{j_{0}} f(x,y)\) and the functions \(\psi _{j_{0}k}\), see [14]. The coefficients \(\alpha _{j _{0} k}\) and \(\beta _{j_{0} k}^{(1)}\), \(\beta _{j_{0} k}^{(2)}\), \(\beta _{j_{0} k}^{(3)}\) are unique for every \(j_{0} \in \mathbb{N} \). The function \(\phi _{j_{0} k}\) is defined as \(\phi _{j_{0} k_{1} k_{2}} (x,y) = \phi _{j_{0} k_{1}} (x)\phi _{j_{0} k_{2}} (y)\).
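As an illustration of the tensor-product construction \(\phi _{j_{0} k_{1} k_{2}} (x,y) = \phi _{j_{0} k_{1}} (x)\phi _{j_{0} k_{2}} (y)\), the sketch below uses the Haar scaling function (the indicator of \([0,1)\)), which is the simplest of the wavelet families. The function names are hypothetical, and the grid check is only a numerical sanity test of orthonormality.

```python
import numpy as np

def haar_phi(x):
    """Haar scaling function: indicator of the interval [0, 1)."""
    x = np.asarray(x, dtype=float)
    return ((x >= 0.0) & (x < 1.0)).astype(float)

def phi_jk(x, j, k):
    """One-dimensional phi_{j,k}(x) = 2^{j/2} phi(2^j x - k)."""
    return 2.0 ** (j / 2.0) * haar_phi(2.0 ** j * np.asarray(x, dtype=float) - k)

def phi_jk2(x, y, j, k1, k2):
    """Two-dimensional tensor product phi_{j,k1}(x) * phi_{j,k2}(y)."""
    return phi_jk(x, j, k1) * phi_jk(y, j, k2)

# Numerical check on a midpoint grid of [0, 1): unit norm and orthogonality
grid = (np.arange(1024) + 0.5) / 1024
norm_sq = np.mean(phi_jk(grid, 2, 1) ** 2)                  # approximates the squared L2 norm
inner = np.mean(phi_jk(grid, 2, 1) * phi_jk(grid, 2, 2))    # disjoint supports
value = float(phi_jk2(0.3, 0.6, 2, 1, 2))                   # 2^{j} on the support
```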
Some important cases of wavelets are the Haar, Daubechies, Shannon, Meyer, and Morlet wavelets (see [8]). We used the Haar and Daubechies wavelets in our simulation studies. Accordingly, by equation (1) the copula density c can be expanded with
$$\begin{aligned} \alpha _{j_{0}k} = \int _{{(0,1)}^{2}} c(u,v) \phi _{j_{0}k}(u,v) \,du \,dv, \quad k \in \mathbb{Z}^{2}. \end{aligned}$$
By the change of variables \(u=F_{1}(t_{1})\) and \(v=F_{2}(t_{2})\) we get
$$\begin{aligned} \alpha _{j_{0}k} = \int \phi _{j_{0}k}\bigl(F_{1}(t_{1}),F_{2}(t_{2}) \bigr) f(t _{1},t_{2}) \,dt_{1} \,dt_{2} = E_{f} \bigl(\phi _{j_{0}k}\bigl(F_{1}(T_{1}),F_{2}(T _{2})\bigr)\bigr). \end{aligned}$$
(2)
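Equation (2) expresses \(\alpha _{j_{0}k}\) as an expectation, which suggests a simple Monte Carlo check. The sketch below assumes, purely for illustration, independent exponential margins (so that \(c \equiv 1\)) and the Haar scaling function; in that case \(\alpha _{j_{0}k} = 2^{-j_{0}}\) for every k whose support lies in the unit square. All names and distributional choices here are assumptions, not part of the paper.

```python
import numpy as np

def haar_phi2(u, v, j, k1, k2):
    """Tensor-product Haar scaling function phi_{j,k}(u, v)."""
    m = 2 ** j
    in_u = (u * m >= k1) & (u * m < k1 + 1)
    in_v = (v * m >= k2) & (v * m < k2 + 1)
    return m * (in_u & in_v).astype(float)   # 2^{j/2} * 2^{j/2} = 2^j

rng = np.random.default_rng(0)
n = 200_000
t1 = rng.exponential(scale=1.0, size=n)   # T_1 ~ Exp with mean 1
t2 = rng.exponential(scale=2.0, size=n)   # T_2 ~ Exp with mean 2, independent of T_1
u = 1.0 - np.exp(-t1)                     # F_1(T_1)
v = 1.0 - np.exp(-t2 / 2.0)               # F_2(T_2)

# Sample mean of phi_{j0,k}(F_1(T_1), F_2(T_2)) with j0 = 1, k = (0, 1)
alpha_mc = haar_phi2(u, v, 1, 0, 1).mean()
alpha_exact = 2.0 ** (-1)                 # integral of phi_{j0,k} when c = 1
```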
Assuming that \(I=I(Y_{1i}\leq T,T_{2i}\leq T)\), when \(F_{1}\) and \(F_{2}\) (the marginal distributions) are known, a moment-based estimator of \(\alpha _{j_{0}k}\) based on censored data is then given by
$$\begin{aligned} \hat{\alpha }_{j_{0} k} =& \int I \phi _{j_{0}k} \bigl(F_{1}(y_{1}),F_{2}(t _{2})\bigr) \hat{F}(dy_{1},\,dt_{2}) \\ =& \frac{1}{n} \sum_{i=1}^{n} W_{in}^{(\cdot )} I(Y_{1i}\leq t_{1},T _{2i}\leq t_{2}) \phi _{j_{0}k} \bigl(F_{1}(Y_{1i}),F_{2}(T_{2i}) \bigr). \end{aligned}$$
(3)
Then the wavelet-based estimator of c is given by
$$ \hat{c}_{j_{0}}(u,v) = \sum_{k\in \mathbb{Z}^{2}} \hat{\alpha }_{j _{0}k} \phi _{j_{0}k}(u,v),\quad u,v\in (0,1). $$
When \(F_{1}\) and \(F_{2}\) are unknown, their empirical distribution functions \(F_{1n}\) and \(F_{2n}\) are used. So the rank-based estimator is as follows:
$$\begin{aligned} \tilde{\alpha }_{j_{0}k} =& \int I \phi _{j_{0}k} \bigl(F_{1n}(y_{1}),F _{2n}(t_{2})\bigr) \hat{F}(dy_{1}, \,dt_{2}) \\ =& \frac{1}{n} \sum_{i=1}^{n} W_{in}^{(\cdot )} I(Y_{1i}\leq t_{1},T_{2i} \leq t_{2}) \phi _{j_{0}k} \bigl(F_{1n}(Y_{1i}),F_{2n}(T_{2i}) \bigr). \end{aligned}$$
(4)
Now we can introduce the linear wavelet-based estimator of c based on the ranks as
$$\begin{aligned} \tilde{c}_{j_{0}} (u,v) = \sum _{k\in \mathbb{Z}^{2}} \tilde{\alpha } _{j_{0}k} \phi _{j_{0}k}(u,v),\quad u,v\in (0,1). \end{aligned}$$
(5)
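For the Haar basis, the linear estimator (5) reduces to a weighted histogram on a \(2^{j_{0}}\times 2^{j_{0}}\) grid, which the following sketch makes explicit. It rests on several assumptions that are not taken from the paper: the indicator in (4) is set to one (no truncation), the weights are passed in as an array (all ones in the uncensored case), and pseudo-observations equal to 1 are assigned to the last cell. The function name `haar_copula_density` is hypothetical.

```python
import numpy as np

def haar_copula_density(u_obs, v_obs, w, j0):
    """Values of the linear Haar estimate on the 2^j0 x 2^j0 grid of cells."""
    u_obs = np.asarray(u_obs, dtype=float)
    v_obs = np.asarray(v_obs, dtype=float)
    w = np.asarray(w, dtype=float)
    n = u_obs.size
    m = 2 ** j0
    # cell index k with u in [k/m, (k+1)/m); u = 1 is put in the last cell
    k1 = np.minimum((u_obs * m).astype(int), m - 1)
    k2 = np.minimum((v_obs * m).astype(int), m - 1)
    alpha = np.zeros((m, m))
    # alpha[k1, k2] = (1/n) * sum_i W_i * 2^j0 * 1{(U_i, V_i) in cell (k1, k2)}
    np.add.at(alpha, (k1, k2), w * m / n)
    # on cell (k1, k2) the estimate equals alpha[k1, k2] * 2^j0
    return alpha * m

# Uncensored toy example: ranks of an independent pair, unit weights
rng = np.random.default_rng(0)
n = 200
u_obs = np.arange(1, n + 1) / n
v_obs = rng.permutation(n) / n + 1.0 / n
c_hat = haar_copula_density(u_obs, v_obs, np.ones(n), j0=2)
```

Because the cells have equal area, the mean of `c_hat` over the grid equals the integral of the estimate over the unit square, which is \(\frac{1}{n}\sum_{i} W_{in}\) and hence equals one when the weights sum to n.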
When there is no censoring, the definition reduces to that of the copula estimator introduced in [9]. We further denote by K any constant that may change from one line to another and does not depend on j, k, n, but depends on the wavelet basis and on \(\Vert c \Vert _{\infty } = \sup_{(u,v)\in (0,1)^{2}} \vert c(u,v) \vert \) and \(\Vert c \Vert _{2}^{2}= \int c(u,v)^{2} \,du\,dv\).
Since we deal with the wavelet method, it is very common to consider Besov spaces as function spaces because they are characterized in terms of wavelet coefficients as follows. Besov spaces depend on three parameters \(s > 0\), \(1< p<\infty \), and \(1< q<\infty \) and are denoted by \(B^{s}_{pq}\). Let \(f\in L_{2}({\mathbb{R}}^{d})\) (that is, \(L_{2}( {\mathbb{R}}^{2})\) in our paper), and let \(s< r\) (wavelet regularity). Define the sequence norm of the wavelet coefficients of a function \(f\in B^{s}_{pq}\) by
$$\begin{aligned} \Vert f \Vert _{B^{s}_{pq}} = {\biggl(\sum_{k\in \mathbb{Z}^{2}} \vert \alpha _{j_{0}k} \vert ^{p}\biggr)} ^{1/p}+ { \biggl(\sum_{j\geq j_{0}}\biggl[2^{j(s+d(1/2-1/p))} \biggl( \sum _{k\in \mathbb{Z}^{2}} \vert \beta _{j,k} \vert ^{p}\biggr)^{1/p}\biggr]^{q}\biggr)}^{1/q}, \end{aligned}$$
where \((\sum_{k\in \mathbb{Z}^{2}} \vert \beta _{j,k} \vert ^{p})^{1/p} = (\sum_{k\in \mathbb{Z}^{2}} \sum_{\epsilon \in S_{2}} \vert \beta _{j,k}^{\epsilon } \vert ^{p})^{1/p}\), with \(\epsilon \in S_{2}\) indexing the detail directions. We assume that the copula density c belongs to a Besov space.
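The sequence norm can be evaluated directly once the coefficients are available. The sketch below is a hedged illustration: it truncates the sum over levels at the finest level supplied, takes the level weight \(2^{j(s+d(1/2-1/p))}\) at each level j (the usual convention), and pools the detail directions at each level into one array. The function name `besov_seq_norm` and its argument layout are hypothetical.

```python
import numpy as np

def besov_seq_norm(alpha, betas, j0, s, p, q, d=2):
    """Truncated Besov sequence norm.

    alpha : scaling coefficients at level j0
    betas : list of arrays; betas[i] holds all detail coefficients
            (all k and all directions) at level j0 + i
    """
    alpha = np.asarray(alpha, dtype=float)
    coarse = np.sum(np.abs(alpha) ** p) ** (1.0 / p)
    terms = []
    for i, b in enumerate(betas):
        j = j0 + i
        b = np.asarray(b, dtype=float)
        lp = np.sum(np.abs(b) ** p) ** (1.0 / p)     # l_p norm at level j
        weight = 2.0 ** (j * (s + d * (0.5 - 1.0 / p)))
        terms.append((weight * lp) ** q)
    detail = np.sum(terms) ** (1.0 / q)
    return coarse + detail

# Toy coefficients: coarse part has l_2 norm 5, one nonzero detail at level 0
norm = besov_seq_norm([3.0, 4.0], [[1.0], [0.0]], j0=0, s=1.0, p=2, q=2)
```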