A difference-based approach in the partially linear model with dependent errors

We study the asymptotic properties of estimators of the parametric and nonparametric components in a partially linear model with dependent errors. Using a difference-based ordinary least squares (DOLS) method, we give an estimator of the unknown parametric component and establish its asymptotic normality. The estimator of the nonparametric component is derived by the wavelet method, and its asymptotic normality and weak convergence rate are discussed. Finally, the performance of the proposed estimators is evaluated in a simulation study.


Introduction
Consider the partially linear model (PLM)
$$y_i = x_i^{T}\beta + f(t_i) + e_i, \quad i = 1, \ldots, n, \tag{1}$$
where the superscript $T$ denotes the transpose, $y_i$ are scalar response variables, $x_i = (x_{i1}, \ldots, x_{id})^{T}$ are explanatory variables, $\beta$ is a $d$-dimensional column vector of unknown parameters, $f(\cdot)$ is an unknown function, $t_i$ are deterministic with $0 \le t_1 \le \cdots \le t_n \le 1$, and $e_i$ are random errors. The PLM was first considered by Engle et al. [1] and is now one of the most widely used statistical models. It has been applied in many fields, such as engineering, economics, medical sciences and ecology. Many authors (see [2][3][4][5][6][7][8]) have studied estimation methods for the unknown parametric and nonparametric components of the partially linear model, and deep results such as the asymptotic normality of the estimators have been obtained.
In this paper, we take a difference-based approach and use ordinary least squares and wavelet methods to investigate model (1). Differencing procedures provide a convenient means of introducing nonparametric techniques to practitioners in a way that parallels their knowledge of parametric techniques, and they can easily be combined with other procedures. For example, Wang et al. [9] developed a difference-based approach to the semiparametric partially linear model. Tabakan et al. [10] studied a difference-based ridge estimator in the partially linear model. Duran et al. [11] investigated difference-based ridge and Liu type estimators in semiparametric regression models. Hu et al. [12] used a difference-based Huber-Dutter (DHD) estimator to estimate the root variance σ and the parameter β of the partially linear model. Wu [13] constructed the restricted difference-based Liu estimator for the parametric component of the partially linear model. However, most of the previous work assumes that the errors are independent. The asymptotic behavior of difference-based estimators in the partially linear model with dependent errors is important in practice. In this paper, we use a difference-based ordinary least squares method to study the partially linear model with dependent errors.
For the dependent errors e i we confine ourselves to negatively superadditive dependent (NSD) random variables. There are many applications of NSD random variables in multivariate statistical analysis; see [14][15][16][17][18][19][20][21][22][23]. Hence, it is meaningful to study the properties of NSD random variables. The formal definition of NSD random variables is the following.
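Definition 1 A function $\phi: \mathbb{R}^n \to \mathbb{R}$ is called superadditive if $\phi(x \vee y) + \phi(x \wedge y) \ge \phi(x) + \phi(y)$ for all $x, y \in \mathbb{R}^n$, where $\vee$ stands for componentwise maximum and $\wedge$ for componentwise minimum.
Definition 2 (Hu [25]) A sequence $\{e_1, e_2, \ldots, e_n\}$ is said to be NSD if
$$E\phi(e_1, e_2, \ldots, e_n) \le E\phi(Y_1, Y_2, \ldots, Y_n), \tag{2}$$
where $Y_1, Y_2, \ldots, Y_n$ are independent with $e_i \overset{d}{=} Y_i$ for each $i$, and $\phi$ is a superadditive function such that the expectations in (2) exist. An infinite sequence $\{e_n, n \ge 1\}$ of random variables is said to be NSD if $\{e_1, e_2, \ldots, e_n\}$ is NSD for all $n \ge 1$.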
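As a short worked illustration of Definition 2 (a standard consequence, added here as a sketch rather than taken from the text), one can take the product function $\phi(x) = x_i x_j$ with $i \ne j$ in the defining inequality $E\phi(e_1, \ldots, e_n) \le E\phi(Y_1, \ldots, Y_n)$. Assuming finite second moments, this shows that NSD random variables are pairwise nonpositively correlated:

```latex
% Assumption: E e_i^2 < \infty for all i, so every expectation below exists.
% Step 1: \phi(x) = x_i x_j (i \ne j) is superadditive on \mathbb{R}^n.
% Write a = x_i, b = x_j, c = y_i, d = y_j and assume a \ge c without loss of generality.
\[
(a \vee c)(b \vee d) + (a \wedge c)(b \wedge d) - ab - cd =
\begin{cases}
0, & b \ge d,\\
(a - c)(d - b) \ge 0, & b < d.
\end{cases}
\]
% Step 2: apply the NSD inequality with independent Y_i \overset{d}{=} e_i.
\[
E(e_i e_j) \le E(Y_i Y_j) = E Y_i \, E Y_j = E e_i \, E e_j
\quad \Longrightarrow \quad
\mathrm{Cov}(e_i, e_j) \le 0 .
\]
```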
Throughout the paper we fix the following notation. $\beta_0$ is the true value of the unknown parameter $\beta$. $\mathbb{Z}$ is the set of integers, $\mathbb{N}$ is the set of natural numbers, and $\mathbb{R}$ is the set of real numbers. Denote $x^{+} = \max(x, 0)$ and $x^{-} = (-x)^{+}$. Let $C_1, C_2, C_3, C_4$ be positive constants. For a sequence of random variables $\eta_n$ and a positive sequence $d_n$, write $\eta_n = o(d_n)$ if $\eta_n/d_n$ converges to 0 and $\eta_n = O(d_n)$ if $\eta_n/d_n$ is bounded. The notations $o_P$ and $O_P$ are defined similarly for stochastic convergence and stochastic boundedness. Weak convergence of distributions is denoted by $H_n \xrightarrow{D} H$, and of random variables by $Y_n \xrightarrow{D} Y$. $\|x\|$ is the Euclidean norm of $x$, and $\lfloor x \rfloor = \max\{k \in \mathbb{Z} : k \le x\}$.
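Define the $(n-m) \times n$ differencing matrix
$$D = \begin{pmatrix}
d_0 & d_1 & \cdots & d_m & 0 & \cdots & 0\\
0 & d_0 & d_1 & \cdots & d_m & \cdots & 0\\
\vdots & & \ddots & \ddots & & \ddots & \vdots\\
0 & \cdots & 0 & d_0 & d_1 & \cdots & d_m
\end{pmatrix},$$
where the positive integer $m$ is the order of differencing and $d_0, d_1, \ldots, d_m$ are differencing weights satisfying
$$\sum_{q=0}^{m} d_q = 0, \qquad \sum_{q=0}^{m} d_q^2 = 1.$$
This differencing matrix is given by Yatchew [30]. Applying the differencing matrix to model (1), we have
$$DY = DX\beta + Df + De, \tag{4}$$
where $Y = (y_1, \ldots, y_n)^{T}$, $X = (x_1, \ldots, x_n)^{T}$, $f = (f(t_1), \ldots, f(t_n))^{T}$ and $e = (e_1, \ldots, e_n)^{T}$. By Yatchew [30], applying the differencing matrix $D$ to model (1) removes the nonparametric effect in large samples, so we ignore the presence of $Df$. Thus, we can rewrite (4) as
$$\tilde{Y} = \tilde{X}\beta + \tilde{e},$$
where $\tilde{Y} = (\tilde{y}_1, \ldots, \tilde{y}_{n-m})^{T}$, $\tilde{X} = (\tilde{x}_1, \ldots, \tilde{x}_{n-m})^{T}$, the matrix $\tilde{X}^{T}\tilde{X}$ is nonsingular for large $n$, $\tilde{e} = (\tilde{e}_1, \ldots, \tilde{e}_{n-m})^{T}$, and $\tilde{y}_i = \sum_{q=0}^{m} d_q y_{i+q}$, $\tilde{x}_i = \sum_{q=0}^{m} d_q x_{i+q}$, $\tilde{e}_i = \sum_{q=0}^{m} d_q e_{i+q}$ for $i = 1, \ldots, n-m$.
As in a usual regression model, the ordinary least squares estimator $\hat{\beta}_n$ of the unknown parameter $\beta$ is given as
$$\hat{\beta}_n = \arg\min_{\beta} \|\tilde{Y} - \tilde{X}\beta\|^2.$$
Then the estimator satisfies
$$\tilde{X}^{T}(\tilde{Y} - \tilde{X}\hat{\beta}_n) = 0,$$
and hence
$$\hat{\beta}_n = (\tilde{X}^{T}\tilde{X})^{-1}\tilde{X}^{T}\tilde{Y}.$$
In the following, we use wavelet techniques to estimate $f(\cdot)$ once $\hat{\beta}_n$ is known. Suppose that there exists a scaling function $\phi(\cdot)$ in the Schwartz space $S_l$ and a multiresolution analysis $\{V_m\}$ in the concomitant Hilbert space $L^2(\mathbb{R})$, with the reproducing kernel $E_m(t, s)$ given by
$$E_m(t, s) = 2^{m} E_0(2^{m}t, 2^{m}s) = 2^{m} \sum_{k \in \mathbb{Z}} \phi(2^{m}t - k)\,\phi(2^{m}s - k).$$
Let $A_i = [s_{i-1}, s_i]$ denote intervals that partition $[0, 1]$ with $t_i \in A_i$ for $1 \le i \le n$. Then the estimator of the nonparametric component $f(t)$ is given by
$$\hat{f}_n(t) = \sum_{i=1}^{n} (y_i - x_i^{T}\hat{\beta}_n) \int_{A_i} E_m(t, s)\,ds.$$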
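The differencing step and the least squares fit described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `differencing_matrix` and `dols_estimate` are hypothetical, and the demo data (a random design, independent Gaussian errors, first-order weights $(1/\sqrt{2}, -1/\sqrt{2})$) are assumptions chosen only to make the example run.

```python
import numpy as np

def differencing_matrix(n, weights):
    """Build the (n - m) x n matrix D whose i-th row carries d_0, ..., d_m
    starting at column i (0-based), so that (D y)_i = sum_q d_q * y_{i+q}."""
    d = np.asarray(weights, dtype=float)
    m = d.size - 1
    D = np.zeros((n - m, n))
    for i in range(n - m):
        D[i, i:i + m + 1] = d
    return D

def dols_estimate(y, X, weights):
    """Least squares fit of the differenced response on the differenced design."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(y.size, -1)
    D = differencing_matrix(y.size, weights)
    beta_hat, *_ = np.linalg.lstsq(D @ X, D @ y, rcond=None)
    return beta_hat

# Illustrative data; every choice below is an assumption made for the demo.
rng = np.random.default_rng(0)
n = 512
t = np.arange(1, n + 1) / n
x = rng.normal(size=n)            # random design, so differencing keeps signal in x
f = np.sin(2 * np.pi * t)         # smooth nonparametric component
e = 0.5 * rng.normal(size=n)      # independent errors, used here for simplicity only
y = 5.0 * x + f + e

d = np.array([1.0, -1.0]) / np.sqrt(2.0)   # weights sum to 0, squares sum to 1
beta_hat = dols_estimate(y, x, d)
print(beta_hat)
```

Because $f$ is smooth and the $t_i$ are closely spaced, the rows of `D @ f` are of order $1/n$, which is why the fit recovers a value close to the true coefficient without estimating $f$ first.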

Preliminary conditions and lemmas
In this section, we give the following conditions and lemmas which will be used to obtain the main results.
is a Lipschitz function of order 1 and has compact support, in addition to |φ(ξ ) -
Remark 3.1 Condition (C1) is standard and often imposed in the estimation of partially linear models; one can refer to Zhao et al. [31]. Conditions (C2)-(C5) are used by Hu et al. [29]. Therefore, our conditions are very mild and can easily be satisfied.
Lemma 3.1 (Hu [25]) Suppose that $\{e_1, e_2, \ldots, e_n\}$ is NSD.
be a sequence of NSD random variables with $Ee_n = 0$ and $E|e_n|^p < \infty$ for each $n \ge 1$. Then for all $n \ge 1$,
Lemma 3.3 Let $p > 1$. Let $\{e_n, n \ge 1\}$ be a sequence of NSD random variables with $Ee_n = 0$ and $E|e_n|^p < \infty$ for all $n \ge 1$, and let $\{c_q, 0 \le q \le m\}$ be a sequence of real constants. Then for all $n \ge 1$, $E|c_q e_{i+q}|^p$ for $1 < p \le 2$ (11) and, for $p > 2$,
In the case $1 < p \le 2$, it follows from Lemma 3.2 that
Noting that $|c_q|^p = |c_q^{+}|^p + |c_q^{-}|^p$, the desired result (11) follows from (13) immediately. In the same way, we also have (12). The proof is completed.
$E|c_q e_{i+q}|^p$ (14) and, for $p > 2$, provided the covariance on the right-hand side exists, where $\{a_i, 1 \le i \le n\}$ is an array of real numbers.
Proof For a pair of random variables $Z_1 = \sum_{i \in A} a_i X_i$, $Z_2 = \sum_{j \in B} a_j X_j$, we have
Denote by $F(z_1, z_2)$ the joint distribution function of $(Z_1, Z_2)$, and by $F_{Z_1}(z_1)$, $F_{Z_2}(z_2)$ the marginal distribution functions of $Z_1$, $Z_2$. One gets
$$\mathrm{Cov}(Z_1, Z_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \big[F(z_1, z_2) - F_{Z_1}(z_1)F_{Z_2}(z_2)\big]\,dz_1\,dz_2;$$
this relation was established in Lehmann [32] for any two random variables $Z_1$ and $Z_2$ for which $\mathrm{Cov}(Z_1, Z_2)$ exists. Let $f$, $g$ be complex-valued functions on $\mathbb{R}$ with bounded derivatives $f'$, $g'$; then we have
The proof is completed.
Proof Notice that the result is true for $n = 1$. $\mathrm{Cov}(e_{i_1+q_1}, e_{i_2+q_2})$.
Hence, the result is true for $n = 2$. Moreover, suppose that (16) holds for $n - 1$. By Lemma 3.4, we have, for $n$, $\mathrm{Cov}(e_{i_j+q_1}, e_{i_k+q_2})$, which completes the proof.

Main results and their proofs
provided that Var(d q e i+q ) ≤ (nm) -1 and hence we have $l \le C_3/\sqrt{\varepsilon_n}$. If the number of remainder terms is not zero when the construction ends, then we put all the remainder terms into a block denoted by $J_l$. By (7), we have
Then to prove (17), it is enough to prove that
Let $u$ be an arbitrary $d$-dimensional column vector with $\|u\| = 1$, and set $a_i = u^{T}\tau_\beta^{-1}\tilde{x}_i$. Then, by the Cramér-Wold device, to prove (21) it suffices to prove that
Moreover, note that $\max_{0 \le q \le m}|d_q| \le 1$ and $\max_{1 \le i \le n}|a_i| < \infty$ by Condition (C1); then applying Lemma 3.3 with $p = 2$ we have
E|d q e i+q | 2 I |d q e i+q | ≥ √ nmε n
from which it follows that $J \xrightarrow{P} 0$ as $n \to \infty$ by the Markov inequality. Therefore, to prove (22), it suffices to show that
On the one hand, by the definition of $\tau_\beta^2$, it is easy to show that
Therefore by the above formula and (23),
On the other hand, by Lemma 3.5 and (ii), we have
$\mathrm{Cov}(e_{i+q_1}, e_{j+q_2}) \to 0$ as $n \to \infty$,
which implies that the problem is now reduced to studying the asymptotic behavior of the independent and non-identically distributed random variables $\{\sum_{i \in I_j} a_i \tilde{e}_i\}$.
To complete the proof of (24), it is enough to show that the random variables $\{\sum_{i \in I_j} a_i \tilde{e}_i\}$ satisfy the condition of Lemma 3.7. Set
By the definition of $I_j$,
$E|d_q e_{i+q}|^{2+\delta}$
since $\tau_n \to 1$ and (i). Hence, by Lemma 3.7, (24) holds and the proof is completed.
provided that is a positive definite matrix.
Proof Since $\{e_n, n \ge 1\}$ is a sequence of independent random variables, we have $\mathrm{Cov}(e_i, e_j) = 0$ if $i \ne j$ and hence $\mathrm{Cov}(\tilde{e}_i, \tilde{e}_j) = 0$ if $|i - j| > m$. It follows that
From the conditions of Corollary 4.1, we see that $\tau_\beta^2$ is a positive definite matrix. Thus the result follows from (29).
where $M_n \to \infty$ at an arbitrarily slow rate, and τm = 2 -
Proof We can prove Theorem 4.2 by an argument similar to that of Theorem 3.2 of Hu et al. [12], so we omit the details.
where Em(t, s) ds
From the proof of Theorem 3.2 in Hu et al. [12], we get $I_1 = O_P(n^{-1/2})$, $I_2 = O_P(n^{-\gamma}) + O_P(\tau_m)$ and $I_3 = O_P(n^{-1/3}M_n)$, which implies that I 3 ) and
Then we should prove
Let $a_{ni} = \tau_t^{-1}\int_{A_i} E_m(t, s)\,ds$. Then, by Lemma 3.6 and (C5), $\max_{1 \le i \le n}|a_{ni}| \to 0$ and $\sum_{i=1}^{n} a_{ni}^2 = O(1)$, and condition (i) implies that $\{e_n, n \ge 1\}$ is a uniformly integrable family in $L^2$. Then, by Lemma 3.8 and (ii), we have
The proof is completed.

A simulation example
In this section, we perform a simulation example to verify the accuracy of Theorem 4.1 and Theorem 4.3. Consider the partially linear model
$$y_i = x_i\beta + f(t_i) + e_i, \quad i = 1, \ldots, n,$$
where $x_i = \cos(2\pi t_i)$, $f(t_i) = \sin(2\pi t_i)$, $\beta_0 = 5$, $t_i = i/n$, and $\{e_i\}$ is an NSD sequence generated as follows. Let $\{e_1, e_2, \ldots, e_n\}$ be a sequence of independent and identically distributed random variables with common probability mass function $P(e_1 = 0) = 2P(e_1 = 1) = P(e_1 = 2) = 0.4$. Then $\{e_1, e_2, \ldots, e_n\}$ given $S_n = n$ is NSD by Theorem 3.1 in Hu [25], where $S_n = \sum_{i=1}^{n} e_i$.
Figure 1 gives the QQ-plot of $M_{\hat{\beta}_n - \beta_0}$. It shows that the distribution of $M_{\hat{\beta}_n - \beta_0}$ approximates $N(0, 1)$ well even when the sample size is not large ($n = 64$). Comparison of Fig. 2 with Fig. 1 indicates that the approximation for the larger sample size is much more accurate than that for the smaller one.
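The conditional construction of the NSD errors described above can be sampled exactly rather than by rejection. The sketch below is only one possible way to do this and is not taken from the paper; the function name `sample_nsd_errors` is hypothetical. It relies on the observation that, given $S_n = n$ with values in $\{0, 1, 2\}$, the number of zeros must equal the number of twos, say $k$, so it suffices to draw $k$ from its conditional distribution and then shuffle.

```python
import math
import numpy as np

def sample_nsd_errors(n, rng):
    """Draw (e_1, ..., e_n) i.i.d. with P(0) = 0.4, P(1) = 0.2, P(2) = 0.4,
    conditioned on e_1 + ... + e_n = n."""
    ks = np.arange(n // 2 + 1)  # possible counts of zeros (= counts of twos)
    # log of the multinomial probability of k zeros, k twos and n - 2k ones
    logw = np.array([
        math.lgamma(n + 1) - 2 * math.lgamma(k + 1) - math.lgamma(n - 2 * k + 1)
        + 2 * k * math.log(0.4) + (n - 2 * k) * math.log(0.2)
        for k in ks
    ])
    w = np.exp(logw - logw.max())          # stabilised, unnormalised weights
    k = rng.choice(ks, p=w / w.sum())
    e = np.concatenate([np.zeros(k), np.ones(n - 2 * k), np.full(k, 2.0)])
    rng.shuffle(e)                         # exchangeable: every ordering equally likely
    return e

rng = np.random.default_rng(1)
errors = sample_nsd_errors(64, rng)
print(errors.sum())
```

Since the sum is fixed, exchangeability gives $\mathrm{Cov}(e_1, e_2) = -\mathrm{Var}(e_1)/(n-1)$ under this construction, which is consistent with the negative dependence that NSD sequences are meant to capture.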
Choose the Daubechies scaling function ${}_2\phi(t)$ as in Hu et al. [29]. Figures 3 and 4 show that the distribution of $M_{\hat{f}_n - f} = \tau_t^{-1}(\hat{f}_n(t) - f(t))$ gets closer to $N(0, 1)$ as the sample size increases.
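To make the wavelet step concrete, the following sketch evaluates an estimator of the usual form $\hat{f}_n(t) = \sum_{i=1}^{n}(y_i - x_i^{T}\hat{\beta}_n)\int_{A_i} E_m(t, s)\,ds$ with $E_m(t, s) = 2^{m}\sum_{k}\phi(2^{m}t - k)\phi(2^{m}s - k)$. It deliberately swaps in the Haar scaling function $\phi = 1_{[0,1)}$ instead of the Daubechies function used in the simulation, because the kernel integrals then have a closed form; Haar does not satisfy the smoothness conditions assumed in the paper, so this is an illustration of the mechanics only. The names `haar_kernel_weights` and `wavelet_estimate`, and the choice $A_i = [(i-1)/n, i/n]$, are assumptions.

```python
import numpy as np

def haar_kernel_weights(t, n, m):
    """Integrals of the Haar kernel E_m(t, s) over A_i = [(i-1)/n, i/n], i = 1..n.
    For Haar, E_m(t, s) = 2^m when s lies in the dyadic cell containing t, else 0."""
    scale = 2 ** m
    k = min(int(np.floor(scale * t)), scale - 1)   # t = 1 is assigned to the last cell
    lo, hi = k / scale, (k + 1) / scale
    s = np.arange(n + 1) / n                       # partition points s_0, ..., s_n
    overlap = np.clip(np.minimum(s[1:], hi) - np.maximum(s[:-1], lo), 0.0, None)
    return scale * overlap

def wavelet_estimate(t, y, X, beta_hat, m):
    """Weighted sum of the parametric residuals y_i - x_i^T beta_hat."""
    resid = y - X @ beta_hat
    return float(haar_kernel_weights(t, resid.size, m) @ resid)

# Noise-free demo with the coefficient treated as known (assumptions for illustration).
rng = np.random.default_rng(2)
n = 256
t_grid = np.arange(1, n + 1) / n
X = rng.normal(size=(n, 1))
y = 5.0 * X[:, 0] + np.sin(2 * np.pi * t_grid)
weights = haar_kernel_weights(0.25, n, 4)
f_hat = wavelet_estimate(0.25, y, X, np.array([5.0]), 4)
print(f_hat, np.sin(2 * np.pi * 0.25))
```

With this kernel the estimator reduces to a local average of the residuals over the dyadic cell of width $2^{-m}$ containing $t$, which makes the role of the resolution level $m$ as a bandwidth easy to see.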

Conclusions
In this paper, we use a difference-based ordinary least squares (DOLS) method to obtain the estimator of the unknown parametric component β of the partially linear model with dependent errors. In addition, we investigate the asymptotic normality of the DOLS estimator of β and of the wavelet estimator of f (·). Thus, we extend some results of Hu et al. [12] to the partially linear model with NSD errors. Furthermore, NSD random variables contain negatively associated random variables. Therefore, it is an interesting subject to investigate the limit properties of the difference-based estimator for a partially linear model with NSD errors in future studies.