M-test in linear models with negatively superadditive dependent errors

This paper is concerned with hypothesis testing for regression parameters in linear models whose errors are negatively superadditive dependent (NSD). A robust M-test based on the M-criterion is proposed. The asymptotic distribution of the test statistic is obtained, and consistent estimates of the redundancy parameters involved in the asymptotic distribution are established. Finally, Monte Carlo simulations are given to substantiate the stability of the parameter estimates and the power of the test for various choices of M-methods, explanatory variables and sample sizes.


Introduction
Consider the linear regression model: $y_t = x_t^\top \beta + e_t$, $t = 1, \dots, n$, where $\{y_t\}$ and $\{x_t = (x_{t1}, x_{t2}, \dots, x_{tp})^\top\}$ are real-valued responses and real-valued random vectors, respectively. The superscript $\top$ denotes the transpose throughout this paper, $\beta = (\beta_1, \dots, \beta_p)^\top$ is a $p$-vector of unknown parameters, and $\{e_t\}$ are random errors. It is well known that linear regression models have received much attention because of their wide application in areas such as engineering technology, economics and the social sciences. Unfortunately, the classical maximum likelihood estimator for these models is highly sensitive to outliers. To overcome this defect, Huber proposed the M-estimate, which is robust (see Huber []) and is obtained by minimizing $\sum_{t=1}^{n} \rho(y_t - x_t^\top \beta)$, where $\rho$ is a convex function. Many important estimates arise as special cases: the least squares (LS) estimate with $\rho(x) = x^2/2$, the least absolute deviation (LAD) estimate with $\rho(x) = |x|$, and the Huber estimate with $\rho(x) = (x^2 I(|x| \le k))/2 + (k|x| - k^2/2) I(|x| > k)$, $k > 0$, where $I(A)$ is the indicator function of $A$. Let $\hat\beta_n$ be a minimizer of (); then $\hat\beta_n$ is an M-estimate of $\beta$. Some excellent results regarding the asymptotic properties of $\hat\beta_n$ for various forms of $\rho$ have been reported in [-]. Most of these results rely on independent errors. As Huber claimed in [], the independence assumption on the errors is a serious restriction. It is therefore essential in practice to explore the case of dependent errors, which is theoretically challenging. Under dependence assumptions on the errors, Berlinet et al. [] proved the consistency of M-estimates for linear models with strong mixing errors. Cui et al. [] obtained the asymptotic distributions of M-estimates for linear models with spatially correlated errors. Wu [] investigated the weak and strong Bahadur representations of the M-estimates for linear models with stationary causal errors. Wu [] established the strong consistency of M-estimates for linear models with negatively associated (NA) random errors.
In the following we introduce a broad class of random sequences, the NSD random sequences, whose definition is based on superadditive functions.

Definition  (Hu []) A function $\phi: \mathbb{R}^n \to \mathbb{R}$ is called superadditive if $\phi(x \vee y) + \phi(x \wedge y) \ge \phi(x) + \phi(y)$ for all $x, y \in \mathbb{R}^n$, where '∨' denotes the componentwise maximum and '∧' denotes the componentwise minimum.
Definition  (Hu []) A random vector $(X_1, X_2, \dots, X_n)$ is said to be NSD if $E\phi(X_1, X_2, \dots, X_n) \le E\phi(X_1^*, X_2^*, \dots, X_n^*)$, where $\{X_t^*, t = 1, \dots, n\}$ are independent random variables such that $X_t^*$ has the same marginal distribution as $X_t$ for each $t$, and $\phi$ is a superadditive function such that the expectations in () exist.
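The superadditivity inequality can be checked numerically on a finite grid. The sketch below is only an illustration: the helper names, the grid and the two test functions are assumptions chosen for this example, and a grid check cannot prove superadditivity on all of $\mathbb{R}^n$.

```python
from itertools import product


def join(x, y):
    """Componentwise maximum, x v y."""
    return tuple(max(a, b) for a, b in zip(x, y))


def meet(x, y):
    """Componentwise minimum, x ^ y."""
    return tuple(min(a, b) for a, b in zip(x, y))


def superadditive_gap(phi, x, y):
    """phi(x v y) + phi(x ^ y) - phi(x) - phi(y); non-negative for superadditive phi."""
    return phi(join(x, y)) + phi(meet(x, y)) - phi(x) - phi(y)


def is_superadditive_on_grid(phi, values=(-1, 0, 1), dim=2):
    """Check the inequality for every pair of points in a small integer grid."""
    points = list(product(values, repeat=dim))
    return all(superadditive_gap(phi, x, y) >= 0 for x in points for y in points)


def squared_sum(x):
    """(x_1 + ... + x_n)^2, whose mixed second partial derivatives equal 2."""
    return sum(x) ** 2


def negative_product(x):
    """-x_1 * x_2, whose mixed second partial derivative equals -1."""
    return -x[0] * x[1]


print(is_superadditive_on_grid(squared_sum))       # inequality holds on the grid
print(is_superadditive_on_grid(negative_product))  # inequality fails for some pair
```

The second function fails, for instance, at $x = (1, 0)$ and $y = (0, 1)$, where the gap equals $-1$.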
The concept of NSD random variables, which generalizes the concept of NA, was proposed by Hu.
The purpose of this paper is to investigate the M-test problem for the regression parameters in the model () with NSD random errors. We consider a test for the following hypothesis: where $H$ is a known $p \times q$ matrix of rank $q$ ($0 < q \le p$) and $b$ is a known $p$-vector. A sequence of local alternatives is considered as follows: where $\omega_n$ is a known $p$-vector such that Actually, $\tilde\beta$ and $\hat\beta$ are the M-estimates in the restricted and unrestricted model (), respectively. To test the hypothesis (), we adopt the M-criterion, which uses $M_n$ to measure the level of departure from the null hypothesis. Several classical conclusions have been presented in [-] for independent errors; we generalize them to the case of NSD random errors.
Throughout this paper, let $C$ be a positive constant. Put $|\tau| = \max_{1 \le t \le p}\{|\tau_1|, |\tau_2|, \dots, |\tau_p|\}$ if $\tau$ is a $p$-vector. Let $x^+ = xI(x \ge 0)$ and $x^- = -xI(x < 0)$. A random sequence $\{X_n\}$ is said to be on the $L_q$-norm, q > , if $E|X_n|^q < \infty$. Denote $a_n = o_P(b_n)$ if $a_n/b_n$ converges to 0 in probability, and $a_n = O_P(b_n)$ if $a_n/b_n$ converges to a constant in probability.
The rest of the paper is organized as follows. In Section , the asymptotic distribution of $M_n$ is obtained for NSD random errors, and consistent estimates of the redundancy parameters $\lambda$ and $\sigma^2$ are constructed under the local hypothesis. Section  gives the theoretical proofs of the main results. Simulations showing the performance of the parameter estimates and the power of the M-test are presented in Section , and the conclusions are given in Section .

Main results
In this paper, let $\rho$ be a non-monotonic convex function on $\mathbb{R}$, and denote by $\psi_+$ and $\psi_-$ the right and left derivatives of $\rho$, respectively. The derivative function $\psi$ is chosen to satisfy $\psi_-(u) \le \psi(u) \le \psi_+(u)$ for all $u \in \mathbb{R}$. Now, several assumptions are listed as follows:
(A) The function $G(u) = E\psi(e_t + u)$ exists with $G(0) = E\psi(e_t) = 0$, and has a positive derivative $\lambda$ at $u = 0$.
(A) $0 < E\psi^2(e_1) = \sigma^2 < \infty$, and $\lim_{u \to 0} E|\psi(e_1 + u) - \psi(e_1)|^2 = 0$.
(A) There exists a positive constant such that for h ∈ (, ), the function
assume that $S_n > 0$ for sufficiently large $n$, and
Remark  (A)-(A) are often applied in the asymptotic theory of M-estimates in regression models (see [-]). (A) is reasonable because it is equivalent to the boundedness of $\max_{1 \le t \le n} |x_t x_t^\top|$, and it is a particular case of the condition $d_n = O(n^{-\delta})$ for some  < δ ≤ , which was used in Wang et al. []. The functions $\rho$ mentioned in (), whose 'derivative' functions $\psi$ correspond to the least squares (LS) estimate with $\psi(x) = x$, the least absolute deviation (LAD) estimate with $\psi(x) = \mathrm{sign}(x)$ and the Huber estimate with $\psi(x) = -kI(x < -k) + xI(|x| \le k) + kI(x > k)$, satisfy the above conditions.
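As a small illustration of the three 'derivative' functions listed in the remark, the sketch below implements $\psi$ for the LS, LAD and Huber cases and compares the Huber $\psi$ with a numerical derivative of the Huber $\rho$ away from the kinks. The function names, the default $k = 1$ and the step size are assumptions made for this example only.

```python
def psi_ls(x):
    """LS score: psi(x) = x."""
    return x


def psi_lad(x):
    """LAD score: psi(x) = sign(x); the value 0 at x = 0 lies between psi_-(0) and psi_+(0)."""
    return (x > 0) - (x < 0)


def psi_huber(x, k=1.0):
    """Huber score: -k for x < -k, x for |x| <= k, k for x > k."""
    return max(-k, min(k, x))


def rho_huber(x, k=1.0):
    """Huber loss, repeated here so that the block is self-contained."""
    if abs(x) <= k:
        return x * x / 2.0
    return k * abs(x) - k * k / 2.0


def central_difference(f, x, h=1e-6):
    """Numerical derivative (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


for point in (-2.0, 0.3, 2.0):
    print(point, psi_huber(point), central_difference(rho_huber, point))
```

The bounded range of the LAD and Huber scores is what limits the influence of a single large residual, in contrast to the unbounded LS score.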
Theorem  In the model (), assume that $\{e_t, 1 \le t \le n\}$ is a sequence of identically distributed NSD random variables that forms a uniformly integrable family on the L-norm, and that (A)-(A) hold. Then $2\lambda\sigma^{-2}M_n$ has an asymptotic non-central chi-squared distribution with $p$ degrees of freedom and non-centrality parameter $v(n)$, namely, where $v(n) = \lambda^2\sigma^{-2}\|\omega(n)\|^2$, $\omega(n) = H_n^\top S_n^{1/2}\omega_n$, $H_n = S_n^{-1/2}H(H^\top S_n^{-1}H)^{-1/2}$. In particular, when the local alternatives satisfy $S_n^{1/2}\omega_n \to 0$, which means that the true parameters deviate only slightly from the null hypothesis, $2\lambda\sigma^{-2}M_n$ has an asymptotic central chi-squared distribution with $p$ degrees of freedom.
For a given significance level $\alpha$, we can determine the rejection region as follows: where $h = h_n > 0$, and $h_n$ is a sequence chosen to satisfy
Under the conditions of Theorem , we have Under the assumption $S_n^{1/2}\omega_n \to 0$, replacing $\lambda$, $\sigma^2$ by their consistent estimates $\hat\lambda_n$ and $\hat\sigma_n^2$, then
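The way such a chi-squared limit is typically turned into a decision can be sketched as follows. This is a hedged illustration, not the paper's rejection region: it assumes that the statistic has the form $2\hat\lambda_n\hat\sigma_n^{-2}M_n$, that the test rejects in the upper tail of the central chi-squared distribution, and that the degrees of freedom `df` are supplied by the user. The function names and the numerical inputs are hypothetical.

```python
import math


def chi2_sf(x, df):
    """P(chi2_df > x) for integer df >= 1, using the recurrence
    Q(x; v + 2) = Q(x; v) + (x/2)^(v/2) * exp(-x/2) / Gamma(v/2 + 1)."""
    if x <= 0:
        return 1.0
    if df % 2 == 1:
        q, v = math.erfc(math.sqrt(x / 2.0)), 1   # closed form for df = 1
    else:
        q, v = math.exp(-x / 2.0), 2              # closed form for df = 2
    while v < df:
        q += (x / 2.0) ** (v / 2.0) * math.exp(-x / 2.0) / math.gamma(v / 2.0 + 1.0)
        v += 2
    return q


def m_test_decision(m_n, lambda_hat, sigma2_hat, df, alpha=0.05):
    """Return (statistic, p-value, reject) for an upper-tail chi-squared test (assumed form)."""
    statistic = 2.0 * lambda_hat * m_n / sigma2_hat
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < alpha


# Hypothetical inputs, chosen only to show the call.
print(m_test_decision(m_n=4.0, lambda_hat=1.0, sigma2_hat=1.0, df=2))
```

Working with the p-value avoids inverting the chi-squared distribution function, so only the standard library is needed.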

Proof of theorems
It is convenient to consider the rescaled model Assume that $q < p$; then there exists a $p \times (p - q)$ matrix $K$ of rank $(p - q)$ such that $H^\top K = 0$ and $K^\top\omega_n = 0$, and, for some $\gamma \in \mathbb{R}^{p-q}$, the null hypothesis and the local alternatives can be written as Under the null hypothesis, model () can be rewritten as Set $\omega(n) = H_n^\top S_n^{1/2}\omega_n$, γ(n) = γ  (n) + $K_n^\top S_n^{1/2}\omega_n$; under the local alternatives (), Obviously, $\hat\beta(n)$ and $\hat\gamma(n)$ are the M-estimates of $\beta(n)$ and $\gamma(n)$, respectively. Thus Next, we state some lemmas that are needed to prove the main results of this paper.

If $f$ is a real function on $D$, then $f$ is also convex, and for every compact subset D
Moreover, if $f$ is a differentiable function on $D$, and $g(x)$ and $g_n(x)$ represent the gradient and sub-gradient of $f$, respectively, then
Denote by $F(x, y)$ the joint distribution function of $(X, Y)$, and by $F_X(x)$, $F_Y(y)$ the marginal distribution functions of $X$ and $Y$; one gets thus for any real numbers $r$, $u$ Next, we proceed by induction on $n$. Lemma  for n =  is trivial and for n =  is true by (). Assume that the result is true for all n ≤ M (n ≥ ). For n = M + , there exist Note that $X$, $Y$ are non-decreasing functions; we have by the induction hypothesis that
We define s  = , $s_{j+1} = \min\{s : s > s_j, s \in \Upsilon_j\}$, and put Now take an independent random sequence $\{Z_{nj}^*, j = 1, 2, \dots, n\}$ such that $Z_{nj}^*$ has the same marginal distribution as $Z_{nj}$ for each $j$. Let $F(Z_{n1}, Z_{n2}, \dots, Z_{nn})$ and $G(Z_{n1}^*, Z_{n2}^*, \dots, Z_{nn}^*)$ be the characteristic functions of $\sum_{j=1}^{n} Z_{nj}$ and $\sum_{j=1}^{n} Z_{nj}^*$, respectively. Choosing $r = \max\{r_l, r_j\}$, we have by Lemma  and () By Lévy's theorem, $Z_{nj}^*$ is asymptotically normal; applying Lemma , the identical distribution of $\{X_j\}$ then implies that
In view of the monotonicity of $\psi(e_t + u) - \psi(e_t)$, the summands of $f_n(\tau)$ are also monotone with respect to $e_t$ by property (b) in Lemma . We divide the summands of $f_n(\tau)$ into positive and negative parts; by property (c) in Lemma , they are still NSD. Therefore, applying Schwarz's inequality and (), we obtain Hence for sufficiently large $n$, we have Lemma  is proved by () and Lemma .
Lemma  Under the conditions of Lemma  and the local alternatives ()-(), for any positive constant $\delta$ and sufficiently large $n$, where ξ  = ηβ(n), ξ  = ζτ (n).
Proof We take the proofs of () and () as examples; the proofs of the remaining equations () and () are the same. Note that () can be written as On the other hand, $S_n^{1/2}\omega_n = O(1)$ and γ  ≤ δ; hence there exists a constant δ  such that Thus () and () follow naturally from () and Lemma .

Lemma  Under the conditions of Theorem , as $n \to \infty$, we have
Proof The estimate of () can be defined essentially as the solution of the following equation: By a routine argument, we shall prove that Let $U$ be a denumerable dense subset of the unit sphere of $\mathbb{R}^p$ such that Obviously, for a given $\tau$, $D(\cdot, L)$ is non-decreasing in $L$ since $\psi$ is non-decreasing. For any $\varepsilon > 0$, let Thus by (), there exists a number n  such that, for n ≥ n , Note that $D(\tau, \cdot)$ is a non-decreasing function on $\tau$ for given $L$; then Based on Lemma  and $\max_{1 \le t \le n} |x_{nt}^\top\tau| = O(n^{-1/2})$, one can see that On the other hand, by Schwarz's inequality, we have Combining () and (), there exists n  (n  ≤ n  ≤ n) such that Applying Chebyshev's inequality, the $C_r$ inequality and Lemma , we have Consequently, () is proved. As defined in (), $\tilde\beta(n)$ can similarly be written as $\tilde\beta(n) = K_n\hat\gamma(n) + H_n\omega(n)$; replacing $\beta(n)$, $\hat\beta(n)$ by $\gamma(n)$ and $\hat\gamma(n)$, respectively, () is proved by $K_n^\top K_n = I_{p-q}$.
Proof of Theorem  According to () and Lemma , one gets Similarly, In view of $H_n^\top H_n = I_q$ and Lemma , Thus Theorem  follows immediately from () and ().
Proof of Theorem  Consider the model (); without loss of generality, assume that the true parameter $\beta(n)$ is equal to 0. For any $\delta > 0$, write By the monotonicity of $\psi$, Schwarz's inequality and (A), we get, for sufficiently large $n$, By Lemma , Consequently, () is proved. As mentioned in Chen et al. [], in order to prove (), it suffices to prove that Actually, by the monotonicity of $\psi(e_t + h) - \psi(e_t - h)$ and the assumption (), applying Lemma  and Lemma , we get On the other hand, since $\lim_{n \to \infty} [G(h) - G(-h)]/(2h) = \lambda$, This completes the proof of Theorem .

Simulation
We evaluate the parameter estimates and the power of the M-test by Monte Carlo techniques. Under the null hypothesis, the estimators of the regression coefficients and the redundancy parameters are derived by several M-methods, namely the LS method, the LAD method and the Huber method. Under the local alternative hypothesis, the power of the M-test is obtained with the rejection region given by Theorem . In this section, the NSD sequence is constructed as follows: where $a_n$ and $b_n$ are positive sequences, and $Y_t$ and $Z_t$ are negatively dependent (corresponding to ρ  < 0) random variables with the distribution Now, we will prove that $(X_1, X_2, \dots, X_n)$ is an NSD sequence. Obviously, one may easily check that $\mathrm{cov}(X_t, X_j) < 0$, $1 \le t < j \le n$.
As stated in Hu [], the superadditivity of $\phi$ is equivalent to $\partial^2\phi/\partial x_t\partial x_j \ge 0$, $1 \le t \ne j \le n$, if the function $\phi$ has continuous second partial derivatives. Here, $\phi(x_1, \dots, x_n) = \exp(\sum_{t=1}^{n} x_t)$ can be chosen as a superadditive function. Note that $\{X_t^*, 1 \le t \le n\}$ have the same marginal distributions as $\{X_t, t = 1, \dots, n\}$ for each $t$; by Jensen's inequality, the sequence $(X_1, X_2, \dots, X_n)$ is proved to be NSD since
Throughout the simulations, the Huber function is taken to be $\rho(x) = (x^2 I(|x| \le k))/2 + (k|x| - k^2/2) I(|x| > k)$, k = .σ  . The explanatory variables are generated from two random models, and all simulations are run for , replicates; the derived estimates are averaged to reduce the impact of randomness. The linear model with NSD errors is given by y_t = β  + β  x_t + e_t, $e_t = Y_t + Z_t$, $t = 1, 2, \dots, n$, where the NSD errors $\{e_t, t = 1, 2, \dots, n\}$ are assumed to follow a multivariate mixture of normal distributions with joint distribution (Y , Z) ∼ N(μ  , μ  , σ   , σ   , ρ  ), ρ  < 0. The null hypothesis is H  : (β  , β  ) = (, ) . The sample size is taken to be n = , n = , n = ,. The joint distribution is taken to be (Y , Z) ∼ N(, , , , -.). The explanatory variables $x_t$ are generated by the following two random models: I. $x_t = u_t$, $1 \le t \le n$; II. x_t = sin(t) + .u_t, $1 \le t \le n$, where $u$ obeys the standard uniform distribution $U(0, 1)$.
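The two designs and the negatively correlated pair $(Y, Z)$ can be generated along the following lines. This is a simplified sketch under stated assumptions: the coefficient `c` in model II, the distribution parameters and the seed are placeholders because the paper's values are not recoverable here, and the pairs are drawn independently across $t$, which is not the Gibbs-sampling construction of the NSD sequence used in the paper.

```python
import math
import random


def design_model_one(n, rng):
    """Model I: x_t = u_t with u_t ~ U(0, 1)."""
    return [rng.random() for _ in range(n)]


def design_model_two(n, rng, c=0.5):
    """Model II: x_t = sin(t) + c * u_t; c is a placeholder coefficient (assumption)."""
    return [math.sin(t) + c * rng.random() for t in range(1, n + 1)]


def bivariate_normal_pair(rng, mu1, mu2, s1, s2, rho):
    """Draw (Y, Z) from a bivariate normal with correlation rho via a Cholesky factor."""
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    y = mu1 + s1 * g1
    z = mu2 + s2 * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
    return y, z


def sample_covariance(pairs):
    """Unbiased sample covariance of a list of (y, z) pairs."""
    n = len(pairs)
    my = sum(y for y, _ in pairs) / n
    mz = sum(z for _, z in pairs) / n
    return sum((y - my) * (z - mz) for y, z in pairs) / (n - 1)


rng = random.Random(0)                       # fixed seed, for reproducibility only
x_one = design_model_one(50, rng)
x_two = design_model_two(50, rng)
pairs = [bivariate_normal_pair(rng, 0.0, 0.0, 1.0, 1.0, -0.5) for _ in range(5000)]
errors = [y + z for y, z in pairs]           # e_t = Y_t + Z_t
print(sample_covariance(pairs))              # negative for a negative rho
```

With a negative correlation parameter the sample covariance of $(Y, Z)$ is negative, which mirrors the negative dependence required in the construction above.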
Firstly, we generate an NSD sequence by the Gibbs sampling technique. Figure  shows that the fitted distribution (full line) of the NSD sequence is close to the normal distribution, although the NSD distribution tends to exhibit the features of a truncated distribution.
Next, we evaluate the estimators of the regression coefficients and the redundancy parameters under the null hypothesis. Table  illustrates that the M-methods are valid (the corresponding M-estimates are close to the true parameters β  = , β  = ) and that the estimators of the redundancy parameters are effective (one may easily check that σ² =  and λ =  when the convex function is taken to be the LS function; for the other estimates, although the values differ between methods, the sign and significance remain the same, so the general conclusions are unchanged). Additionally, the estimates become more accurate as the sample size increases. In fact, the estimates behave well even when the sample size is not large (n = ). As expected, the fitted residual densities are close to those of the assumed NSD errors in Figure , and all of them still show the features of a truncated distribution. Figure  checks that the residuals are NSD by using the empirical distribution to approximate the distribution function, which supports the assumption of NSD errors.
Finally, we study the empirical significance levels and the power of the M-test. Under the local hypothesis, $2\hat\lambda_n\hat\sigma_n^{-2}M_n$ has an asymptotic central chi-squared distribution with two degrees of freedom by Theorem  and Theorem , so we may reject the null hypothesis if the simulated value of $2\hat\lambda_n\hat\sigma_n^{-2}M_n$ falls in the rejection region W in (). Table  presents the power at significance levels α = . and α = . for various choices of M-methods, explanatory variables and sample sizes n = , n = , n = ,. The results show that the empirical significance levels are close to the nominal levels; consequently, the M-test is valid.
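The computation of an empirical significance level can be sketched for the LS case, where $\lambda = 1$ and $\sigma^2$ can be estimated by the mean squared residual. The sketch below deliberately uses independent standard normal errors rather than the NSD errors of the paper, and the null values, sample size, number of replicates and seed are hypothetical choices made only so that the example runs.

```python
import math
import random


def ls_fit(xs, ys):
    """Closed-form least squares fit of y = b0 + b1 * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def half_rss(xs, ys, b0, b1):
    """Sum of rho(residual) with rho(u) = u^2 / 2."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys)) / 2.0


def empirical_size(n=100, reps=500, alpha=0.05, null_beta=(1.0, 2.0), seed=0):
    """Fraction of replicates in which the LS-based M-test rejects a true null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]            # design I: x_t ~ U(0, 1)
        ys = [null_beta[0] + null_beta[1] * x + rng.gauss(0.0, 1.0) for x in xs]
        b0_hat, b1_hat = ls_fit(xs, ys)
        unrestricted = half_rss(xs, ys, b0_hat, b1_hat)
        restricted = half_rss(xs, ys, null_beta[0], null_beta[1])
        m_n = restricted - unrestricted                  # M-criterion, always >= 0
        sigma2_hat = 2.0 * unrestricted / n              # mean squared residual
        statistic = 2.0 * 1.0 * m_n / sigma2_hat         # lambda = 1 for LS
        p_value = math.exp(-statistic / 2.0)             # chi-squared(2) upper tail
        rejections += p_value < alpha
    return rejections / reps


print(empirical_size())
```

Replacing the error generator by an NSD sequence and the LS loss by the LAD or Huber loss would require the corresponding estimators of $\lambda$ and $\sigma^2$, which are not reproduced in this sketch.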

Conclusions
The results presented here generalize the conclusions in [-]. The simulations show that the M-tests for the linear model with NSD errors are insensitive to the choice of M-method and explanatory variables. The M-test is therefore robust and effective.