Strong consistency rates for the estimators in a heteroscedastic EV model with missing responses

This article is concerned with the semi-parametric error-in-variables (EV) model with missing responses: $y_i = \xi_i\beta + g(t_i) + \epsilon_i$, $x_i = \xi_i + \mu_i$, where $\epsilon_i = \sigma_i e_i$ is heteroscedastic, $f(u_i) = \sigma_i^2$, the $y_i$ are response variables missing at random, the design points $(\xi_i, t_i, u_i)$ are known and non-random, $\beta$ is an unknown parameter, $g(\cdot)$ and $f(\cdot)$ are functions defined on the closed interval $[0, 1]$, the $\xi_i$ are the potential variables observed with measurement errors $\mu_i$, and the $e_i$ are random errors. Under appropriate conditions, we study the strong consistency rates for the estimators of $\beta$, $g(\cdot)$ and $f(\cdot)$. The finite-sample behavior of the estimators is investigated via simulations.


Introduction
Consider the following semi-parametric error-in-variables (EV) model:

$$y_i = \xi_i\beta + g(t_i) + \epsilon_i, \qquad x_i = \xi_i + \mu_i, \tag{1.1}$$

where $\epsilon_i = \sigma_i e_i$, $\sigma_i^2 = f(u_i)$, the $y_i$ are the response variables, $(\xi_i, t_i, u_i)$ are design points, the $\xi_i$ are the potential variables observed with measurement errors $\mu_i$, $E\mu_i = 0$, the $e_i$ are random errors with $Ee_i = 0$ and $Ee_i^2 = 1$, $\beta$ is an unknown parameter, and $g(\cdot)$ and $f(\cdot)$ are functions defined on the closed interval $[0, 1]$. In model (1.1), there exists a function $h(\cdot)$ defined on the closed interval $[0, 1]$ satisfying [...]. We study the strong consistency rates for the estimators of $\beta$ and $g(\cdot)$, according to whether $f(\cdot)$ is known or unknown.
The paper is organized as follows. In Sect. 2, we list some assumptions. The main results are given in Sect. 3. A simulation study is presented in Sect. 4. Some preliminary lemmas are stated in Sect. 5. Proofs of the main results are provided in Sect. 6.

Assumptions
In this section, we list some assumptions, which will be used in the theorems below.

Estimation without considering heteroscedasticity
For model (1.1) without heteroscedasticity, one first deletes all the missing data. Writing $\delta_i = 1$ if $y_i$ is observed and $\delta_i = 0$ otherwise, one gets the model $\delta_i y_i = \delta_i\xi_i\beta + \delta_i g(t_i) + \delta_i\epsilon_i$. If the $\xi_i$ can be observed, we can apply the least-squares estimation method to estimate the parameter $\beta$. If $\beta$ is known, then using the complete data $(\delta_i y_i, \delta_i x_i, \delta_i t_i)$, $1 \le i \le n$, the estimator of $g(\cdot)$, given $\beta$, is [...]. Under this condition of the semi-parametric EV model, Liang et al. [11] improved the least-squares estimator (LSE) on the basis of the usual partially linear model, and chose the estimator of the parameter $\beta$ to minimize the following formula: [...]. Therefore, one can get the LSE $\hat\beta_c$ of $\beta$: [...]. Using $\hat\beta_c$, we define the following estimator of $g(\cdot)$: [...].
Apparently, the estimators $\hat\beta_c$ and $\hat g_n^c(t)$ are formed without taking all the sample information into consideration. Hence, in order to make up for the missing data, we apply an imputation method from Wang and Sun [17], and let [...]. Therefore, using the complete data $(U_i^I, x_i, t_i)$, $1 \le i \le n$, similarly to (3.2) and (3.3), one can get other estimators for $\beta$ and $g(\cdot)$, that is, [...]. Using $\hat\beta_I$, we define the following estimator of $g(\cdot)$: [...].
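The deletion and imputation steps described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes Nadaraya-Watson weights with a Gaussian kernel, a known measurement-error variance (`sigma_mu2`), and an attenuation-corrected least-squares form of the kind commonly used for partially linear EV models. The function names `nw_weights`, `complete_case_fit` and `impute` are hypothetical.

```python
import math

def nw_weights(t0, t, delta, h):
    """Nadaraya-Watson weights at t0 built from complete cases only (delta_j = 1).
    Gaussian kernel and bandwidth h are assumptions of this sketch."""
    k = [d * math.exp(-0.5 * ((t0 - tj) / h) ** 2) for tj, d in zip(t, delta)]
    total = sum(k)
    return [kj / total for kj in k]

def complete_case_fit(x, y, t, delta, h, sigma_mu2):
    """Return (beta_c, g_c) computed from the complete cases.
    y[i] is ignored when delta[i] == 0; sigma_mu2 is the (assumed known)
    variance of the measurement errors mu_i."""
    y0 = [yi if d else 0.0 for yi, d in zip(y, delta)]
    W = [nw_weights(ti, t, delta, h) for ti in t]
    # Kernel-centred covariates and responses.
    x_til = [xi - sum(w * xj for w, xj in zip(Wi, x)) for xi, Wi in zip(x, W)]
    y_til = [yi - sum(w * yj for w, yj in zip(Wi, y0)) for yi, Wi in zip(y0, W)]
    num = sum(d * a * b for d, a, b in zip(delta, x_til, y_til))
    # Subtracting sigma_mu2 corrects the attenuation caused by measurement error.
    den = sum(d * (a * a - sigma_mu2) for d, a in zip(delta, x_til))
    beta_c = num / den

    def g_c(t0):
        w = nw_weights(t0, t, delta, h)
        return sum(wj * (yj - xj * beta_c) for wj, yj, xj in zip(w, y0, x))

    return beta_c, g_c

def impute(x, y, t, delta, beta_c, g_c):
    """Keep observed responses; replace missing ones by x_i * beta_c + g_c(t_i)."""
    return [yi if d else xi * beta_c + g_c(ti)
            for xi, yi, ti, d in zip(x, y, t, delta)]
```

Refitting with the imputed responses (all indicators set to 1) gives an imputation-based counterpart of the complete-case estimators, in the spirit of the second pair of estimators described above.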

Estimation when $\sigma_i^2$ are known
When the errors are heteroscedastic, we consider two different cases according to $f(\cdot)$. If the $\sigma_i^2 = f(u_i)$ are known, then the estimator of $\beta$ is modified to be the weighted least-squares estimator (WLSE) [...]. Using $\hat\beta_{W_1}$, we define the following estimator of $g(\cdot)$: [...]. Then, similarly to (3.4), one can make up for the missing data and let [...]. Therefore, using the complete data $(U_i^{I_1}, x_i, t_i)$, $1 \le i \le n$, similarly to (3.4)-(3.5), one can get other estimators for $\beta$ and $g(\cdot)$, that is, [...]. Using $\hat\beta_{I_1}$, we define the following estimator of $g(\cdot)$: [...]. Therefore, we have the following results.
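A minimal sketch of the weighting idea, assuming the kernel-centred quantities `x_til` and `y_til` have already been computed and that each complete case is weighted by $1/\sigma_i^2$. The exact form of the WLSE in the source is not reproduced here; the function name `wlse_beta`, the known measurement-error variance `sigma_mu2`, and the attenuation correction are assumptions of this illustration.

```python
def wlse_beta(x_til, y_til, delta, sigma2, sigma_mu2):
    """Weighted least-squares estimate of beta from complete cases.

    x_til, y_til : kernel-centred covariates and responses
    delta        : 1 if the response is observed, 0 otherwise
    sigma2       : known error variances sigma_i^2 = f(u_i), all positive
    sigma_mu2    : measurement-error variance (assumed known in this sketch)
    """
    num = sum(d * a * b / s2
              for d, a, b, s2 in zip(delta, x_til, y_til, sigma2))
    den = sum(d * (a * a - sigma_mu2) / s2
              for d, a, s2 in zip(delta, x_til, sigma2))
    return num / den
```

With equal variances the weights cancel and the function reduces to the unweighted, attenuation-corrected estimate.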

Estimation when $\sigma_i^2$ are unknown
We now address the case where the $\sigma_i^2 = f(u_i)$ are unknown and must be estimated. Note that, when [...]. For the sake of convenience, we assume that $\min_{1 \le i \le n} \hat f_n(u_i) > 0$. Then we can define a nonparametric estimator of $\sigma_i^2$, namely $\hat\sigma_{ni}^2 = \hat f_n(u_i)$. Consequently, the WLSE of $\beta$ is [...]. Using $\hat\beta_{W_2}$, we define the following estimator of $g(\cdot)$: [...] (3.14). Similarly, one can make up for the missing data and let [...]. Therefore, using the complete data $(U_i^{I_2}, x_i, t_i)$, $1 \le i \le n$, one can get other estimators for $\beta$ and $g(\cdot)$, that is, [...]. Using $\hat\beta_{I_2}$, we define the following estimator of $g(\cdot)$: [...]. Therefore, we have the following results.
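One plausible way to build a nonparametric variance estimate, sketched here under assumptions: squared residuals from a preliminary fit are smoothed over the $u_i$ with a Gaussian kernel, after subtracting a term `sigma_mu2 * beta_hat**2` that accounts for the measurement error in $x_i$. The function name `variance_estimate`, the correction term, and the bandwidth argument are hypothetical and are not taken from the source.

```python
import math

def variance_estimate(u0, u, resid, delta, beta_hat, sigma_mu2, bandwidth):
    """Kernel estimate of f(u0) from complete-case residuals.

    resid[j] should be y_til_j - x_til_j * beta_hat for observed cases and
    any number (e.g. 0.0) for missing ones, whose weight is zero anyway.
    """
    k = [d * math.exp(-0.5 * ((u0 - uj) / bandwidth) ** 2)
         for uj, d in zip(u, delta)]
    total = sum(k)
    # Each squared residual is corrected for the measurement-error contribution.
    return sum(kj * (r * r - sigma_mu2 * beta_hat ** 2)
               for kj, r in zip(k, resid)) / total
```

The result is not guaranteed to be positive in finite samples, which is why a positivity condition such as the one assumed in the text is needed before using it as a weight.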

Simulation study
In this section, we carry out a simulation to study the finite-sample performance of the proposed estimators. In particular: (1) we compare the performance of the estimators $\hat\beta_{W_1}$, $\hat\beta_{I_1}$, $\hat\beta_{W_2}$ and $\hat\beta_{I_2}$ by their mean squared errors (MSE), and we compare the performance of the estimators $\hat g_n^{W_1}(\cdot)$, $\hat g_n^{I_1}(\cdot)$, $\hat g_n^{W_2}(\cdot)$ and $\hat g_n^{I_2}(\cdot)$ by their global mean squared errors (GMSE); (2) we give boxplots for the estimators of $\beta$ and $g(t_{n/2})$; (3) we give fitting figures for the estimators of $g(\cdot)$. Observations are generated from [...], where $K(\cdot)$ is a Gaussian kernel function and $h_n$, $b_n$, $l_n$ are bandwidth sequences.
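For readers who want to experiment, the following sketch draws one sample from a model of the form (1.1) with missing responses. Every concrete choice in it is a hypothetical stand-in, not the design used in the source: the functions chosen for $g$, $f$ and the latent covariate, the normal errors, the default `sd_mu`, and the constant missingness probability `p_missing` are all assumptions of this illustration.

```python
import math
import random

def simulate(n, p_missing, beta=1.0, sd_mu=0.2, seed=0):
    """Draw one sample of size n; each response is missing with probability
    p_missing (a constant probability is a simplifying assumption)."""
    rng = random.Random(seed)
    rows = []
    for i in range(1, n + 1):
        t = u = (i - 0.5) / n                  # fixed design points in (0, 1)
        xi = t ** 2 + 0.5 * math.cos(i)        # hypothetical latent covariate
        sigma = 1.0 + 0.5 * u                  # hypothetical sigma_i, f(u) = sigma**2
        x = xi + rng.gauss(0.0, sd_mu)         # covariate observed with error
        y = xi * beta + math.sin(2 * math.pi * t) + sigma * rng.gauss(0.0, 1.0)
        delta = 0 if rng.random() < p_missing else 1
        rows.append({"x": x, "t": t, "u": u, "delta": delta,
                     "y": y if delta else None})
    return rows
```

Fixing `seed` makes a replication reproducible, and varying it across replications gives independent samples for a Monte Carlo study.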

Compare the estimators for β and g(·)
To keep the computation manageable, we use small sample sizes in the simulation. We generate the observed data with sample sizes $n = 50$, $100$ and $200$ from the model above. The MSE of the estimators of $\beta$, based on $M = 100$ replications, is defined as [...], where $\hat\beta(l)$ is the $l$th estimate of $\beta$. The GMSE of the estimators of $g(\cdot)$ is defined as [...]. An important issue is the selection of appropriate bandwidth sequences. Common methods are the grid point method and cross-validation; here we use the grid point method to select optimal bandwidths. The bandwidth sequences $h_n$, $b_n$, $l_n$ are each taken over 50 points, uniformly spaced with step length 0.02, on the closed interval $[0, 1]$. We then calculate the MSE of the estimators of $\beta$ and the GMSE of the estimators of $g(\cdot)$ for each $(h_n, b_n, l_n)$ and select the bandwidths that minimize them. The resulting MSE and GMSE values are reported in Tables 1-2. In addition, we give boxplots of the estimators of $\beta$ and $g(t_{n/2})$ for $n = 50, 100, 200$ and $p = 0.25$.
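The error measures and the grid search described above might be computed along the following lines. The formulas assume the usual Monte Carlo definitions (averaging squared errors over replications, and additionally over the design points for the GMSE), which may differ in detail from those in the source. The helper names `mse`, `gmse`, `bandwidth_grid` and `select_bandwidth` are hypothetical.

```python
def mse(estimates, beta):
    """Average squared error of the replicated estimates of beta."""
    return sum((b - beta) ** 2 for b in estimates) / len(estimates)

def gmse(curves, g_true):
    """Global MSE: curves[l][k] is the l-th replication's estimate of g at
    the k-th design point, g_true[k] the true value there."""
    m, n = len(curves), len(g_true)
    return sum((c[k] - g_true[k]) ** 2
               for c in curves for k in range(n)) / (m * n)

def bandwidth_grid(step=0.02, points=50):
    """Grid 0.02, 0.04, ..., 1.00 (50 points with step 0.02).
    Rounding to two decimals suits the default step only."""
    return [round(step * k, 2) for k in range(1, points + 1)]

def select_bandwidth(grid, criterion):
    """Return the grid value minimizing criterion (e.g. MSE or GMSE)."""
    return min(grid, key=criterion)
```

Here `criterion` would be a function that reruns the replications for a given bandwidth and returns the resulting MSE or GMSE. Searching three bandwidths jointly means evaluating it on the product of three such grids.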
From Tables 1-2 and Fig. 1, it can be seen that:
(i) For every fixed $n$ and $p$, the MSEs of $\hat\beta_{I_1}$ and $\hat\beta_{I_2}$ are smaller than those of $\hat\beta_{W_1}$ and $\hat\beta_{W_2}$, and the GMSEs of $\hat g_n^{I_1}(\cdot)$ and $\hat g_n^{I_2}(\cdot)$ are smaller than those of $\hat g_n^{W_1}(\cdot)$ and $\hat g_n^{W_2}(\cdot)$. This shows that the imputation method is more effective than the deletion method.
(ii) For every fixed $n$ and $p$, the MSEs of $\hat\beta_{W_2}$ and $\hat\beta_{I_2}$ are very close to those of $\hat\beta_{W_1}$ and $\hat\beta_{I_1}$, and the GMSEs of $\hat g_n^{W_2}(\cdot)$ and $\hat g_n^{I_2}(\cdot)$ are close to those of $\hat g_n^{W_1}(\cdot)$ and $\hat g_n^{I_1}(\cdot)$.
(iii) For every fixed $n$, the MSE of the estimators of $\beta$ and the GMSE of the estimators of $g(\cdot)$ increase as $p$ increases.
(iv) For every fixed $p$, the MSE of the estimators of $\beta$ and the GMSE of the estimators of $g(\cdot)$ all decrease as $n$ increases.
(v) Fig. 1 shows that the variances of the estimators decrease as the sample size $n$ increases.

The fitting figure for the estimators of g(·)
In this section, we give the fitting figures of $\hat g_n^{I_1}(\cdot)$ and $\hat g_n^{I_2}(\cdot)$ with $p = 0.25$. From Figs. 2-3, one can see that (i) for every fixed $n$, the graph of each estimator of $g(\cdot)$ is very close to that of $g(\cdot)$; (ii) for every fixed $n$, the graph of $\hat g_n^{I_1}(\cdot)$ is very close to that of $\hat g_n^{I_2}(\cdot)$; (iii) the fit improves as $n$ increases; (iv) when $n$ reaches 200, the fit is ideal; (v) the simulation results are consistent with the theoretical results.

Preliminary lemmas
In the sequel, let $C, C_1, C_2, \ldots$ be finite positive constants whose values are unimportant and may change from one occurrence to the next. We now introduce several lemmas, which will be used in the proofs of the main results. Following the line of proof of Lemma 4.7 in Zhang and Liang [21], one can verify the following two lemmas.

Proof of main results
We now introduce some notation, which will be used in the proofs below.