Jackknifing for partially linear varying-coefficient errors-in-variables model with missing response at random
Journal of Inequalities and Applications volume 2020, Article number: 223 (2020)
Abstract
In this paper, we focus on estimating the response mean of the partially linear varying-coefficient errors-in-variables model with missing response at random. A simulation study is conducted to compare the jackknife empirical likelihood method with the normal approximation method in terms of coverage probabilities and average interval lengths, and the proposed estimators are compared in terms of their biases and mean square errors.
1 Introduction
In statistical problems, various forms of statistical models have been studied by many researchers. As a natural compromise between parametric models and nonparametric models, semi-parametric regression models allow some predictors to be modeled parametrically and others non-parametrically, which motivates us to consider the following partially linear varying-coefficient (PLVC) model:
where Y is a response variable, \((X,W)\in \mathbb{R}^{p}\times \mathbb{R}^{q}\) are covariates, \(\beta =(\beta _{1},\ldots ,\beta _{p})^{\top }\) is a p-dimensional vector of unknown parameters, \(\alpha (\cdot )\) is an unknown q-dimensional vector of coefficient functions, and ε is a random error with \(E(\varepsilon |X,W,U)=0\). To avoid the curse of dimensionality, we assume that U is univariate. As a combination of the partially linear model and the varying-coefficient model, the PLVC model has drawn much attention. For example, Kai et al. [9] discussed variable selection by the composite quantile regression method. He et al. [6] developed an approximate estimator of the functional coefficients by B-spline functions and studied the asymptotic properties of the proposed estimators. Shen and Liang [17] considered weighted quantile regression and variable selection under right-censored data with missing censoring indicators. For more work, see Fan and Huang [4], You and Zhou [29], and Huang and Zhang [7], among others.
In this paper, we are interested in estimating the mean of the response Y, say θ, in model (1.1) when the covariate X is measured with error. Instead of X, we observe a surrogate ξ. Hence, we assume the following additive errors-in-variables (EV) model:
where e is the measurement error with zero mean and known covariance matrix \(\Sigma _{e}\). The combination of (1.1) with (1.2) is named the partially linear varying-coefficient errors-in-variables (PLVC EV) model, which has been studied by many authors. For example, Fan et al. [2] considered penalized empirical likelihood and variable selection for high-dimensional data. Liu and Liang [12] established the asymptotic normality of the jackknife estimator of the error variance and the standard chi-square limit of the jackknife empirical log-likelihood statistic. Fan et al. [3] studied penalized profile least squares estimation of the parametric and nonparametric components of the model. Xu et al. [24] proved the asymptotic properties of the proposed estimators of the parameter and the coefficient function, and studied the asymptotic distribution of the empirical log-likelihood ratio function for the parameter with missing covariates.
In many practical fields, however, not all response variables are available, for various reasons. For instance, in a public opinion poll, non-response is a typical source of missing values. Due to the presence of missing data, traditional and standard inference procedures cannot be applied directly. A common approach to dealing with missing data is the complete case (CC) analysis, which only uses completely observed data and incurs a loss of information when the missing mechanism is missing at random (MAR). To overcome this disadvantage, the imputation method fills in the missing responses; examples include linear regression imputation (Yates [27], Wang and Rao [21, 23]), kernel regression imputation (Cheng [1], Wang and Rao [22]), ratio imputation (Rao [16]), and so on. These methods are widely used by many statisticians. Wang et al. [20] proposed an imputation estimator and a number of propensity score weighting estimators, which are consistent and asymptotically normal. Liang [10] extended the idea of Wang et al. [20] to the partially linear regression model with error-prone covariates. Xue [25] used weighted linear regression imputation to construct a weight-corrected empirical likelihood ratio of the response mean so that the ratio has an asymptotic chi-squared distribution. Tang and Zhao [18] proposed an imputed empirical likelihood-based estimator of the response mean in nonlinear regression models.
Throughout this paper, we are interested in inference on the mean of the response Y under missing response at random in the PLVC EV model. Hence, we have the incomplete observations \(\{Y,W,U,\delta ,\xi \}\), where ξ, W and U are observed, Y may be missing, and \(\delta =0\) if Y is missing and \(\delta =1\) otherwise. We assume that Y is MAR, which implies that δ and Y are conditionally independent given X, W and U, that is, \(P(\delta =1|Y,X,W,U)=P(\delta =1|X,W,U):=P(Z)\) with \(Z=(X,W,U)\), where the probability function \(P(\cdot )\) represents the heterogeneity in the missingness mechanism. The MAR assumption is common in statistical analysis with missing data and is reasonable in many practical situations; see Little and Rubin [11].
As is well known, the empirical likelihood, introduced by Owen [13, 14], has many advantages over the normal approximation and the bootstrap for constructing confidence intervals. For example, the empirical likelihood method avoids variance estimation, which can be complicated. Meanwhile, the shape and orientation of the confidence regions based on the empirical likelihood method are determined entirely by the data. However, when nonlinear statistics are involved, estimation based on the empirical likelihood method can be computationally difficult and the Wilks theorem does not hold in general. To handle such situations, Jing et al. [8] proposed a new approach called jackknife empirical likelihood. Thanks to its advantages, the jackknife empirical likelihood approach has been applied by many researchers; see Gong et al. [5], Peng et al. [15], Yang and Zhao [26], Liu and Liang [12], Yu and Zhao [30] and so on. However, there is little literature considering the jackknife method for the response mean with missing response at random.
In this paper, we are interested in the statistical inference of the mean of the response Y in the PLVC EV model with missing response at random, especially the confidence regions of the response mean. In order to avoid computational difficulty and to ensure that the Wilks phenomenon holds, we consider the jackknife empirical likelihood method instead of the empirical likelihood method. In the spirit of Wang et al. [20], we propose the marginal average estimator, the regression imputation estimator and the augmented inverse probability estimator of the response mean by imputing every missing response variable. At the same time, the corresponding jackknife estimators of the response mean are defined. The estimators are shown to be consistent and asymptotically normal under some assumptions. We also establish the asymptotic distribution of the jackknife empirical log-likelihood ratio function and construct the confidence regions. A simulation study is conducted to evaluate the performance of the proposed methods.
The rest of this paper is organized as follows. In Sect. 2, we give the methodologies and build the estimators. The main results are listed in Sect. 3. A simulation study is conducted in Sect. 4. Our conclusion is drawn in Sect. 5. The proofs of the main results and some lemmas are provided in the Appendix.
2 Methodology
2.1 Estimation
For convenience, we first assume that \(X_{i}\) is directly observable. The estimators of the parameter β and the coefficient function \(\alpha (\cdot )\) can be obtained by the profile least squares method as follows. Multiplying model (1.1) by the observation indicators, we have
For given β, we apply the locally weighted least squares method to estimate the coefficient functions \(\{\alpha _{j}(\cdot ),j=1,\ldots ,q\}\). For u in a small neighborhood of \(u_{0}\), the Taylor expansion of \(\alpha _{j}(u)\) can be written as
We minimize the following objective function to get \(\{(a_{j},b_{j}),j=1,\ldots ,q\}\):
where \(K_{h_{n}}(\cdot )=\frac{1}{h_{n}}K(\cdot /h_{n})\), \(K(\cdot )\) is a kernel function, and \(0< h_{n}\rightarrow 0\) is a bandwidth sequence.
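For concreteness, the rescaled kernel \(K_{h_{n}}(\cdot )=K(\cdot /h_{n})/h_{n}\) can be sketched in Python with the Epanechnikov kernel that is also used in Sect. 4; the function names are ours and numpy is assumed.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = (3/4)(1 - u^2) on |u| <= 1, else 0."""
    u = np.asarray(u, dtype=float)
    return 0.75 * (1.0 - u ** 2) * (np.abs(u) <= 1)

def scaled_kernel(u, h):
    """Rescaled kernel K_h(u) = K(u / h) / h used in the local weights."""
    return epanechnikov(u / h) / h
```

Any second-order kernel with bounded support satisfying (A6) could be substituted; the rescaling keeps the kernel integrating to one for every bandwidth h.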
Let \(\mathbf{X}=(X_{1},X_{2},\ldots ,X_{n})^{\top }\), \(\mathbf{Y}=(Y_{1},Y_{2},\ldots ,Y_{n})^{\top }\), \(\mathbf{W}=(W_{1},W_{2},\ldots ,W_{n})^{\top }\), \(\omega _{u}^{\delta }= \operatorname{diag}(\delta _{1}K_{h_{n}}(U_{1}-u),\ldots , \delta _{n}K_{h_{n}}(U_{n}-u))\) and
Therefore, when β is known, we can obtain the estimator of \(\alpha (\cdot )\) by
Substituting (2.2) into (2.1) and eliminating the bias produced by the measurement errors, we obtain the modified profile least squares estimator of β as follows:
where \(S_{i}=(W_{i}^{\top }\enskip0)(D_{u}^{\top }\omega _{u}^{\delta } D_{u})^{-1}D_{u}^{\top }\omega _{u}^{\delta }\), \(\widetilde{Y}_{i}=Y_{i}-S_{i}\mathbf{Y}\), \(\widetilde{\xi }_{i}=\xi _{i}^{\top }-S_{i}\xi \) with \(\xi =(\xi _{1},\ldots ,\xi _{n})^{\top }\). Hence, one can get the following local linear regression estimator of \(\alpha (\cdot )\):
Following Wang et al. [20], we estimate the response mean θ by the following general class of estimators:
where \(P_{n}^{*}(z)\) is some sequence of quantities with probability limit \(P(z)\). When \(P_{n}^{*}(z)=\infty \), \(\widehat{\theta }_{n}\) reduces to the following marginal average estimator:
When \(P_{n}^{*}(z)=1\), \(\widehat{\theta }_{n}\) reduces to the following regression imputation estimator:
When \(P_{n}^{*}(z)=\widehat{P}_{n}(z)\), \(\widehat{\theta }_{n}\) reduces to the following augmented inverse probability estimator:
where \(\widehat{P}_{n}(z)= \frac{\sum_{i=1}^{n}\delta _{i}\Omega _{b_{n}}(z-Z_{i})}{\sum_{i=1}^{n}\Omega _{b_{n}}(z-Z_{i})}\) with kernel function \(\Omega _{b_{n}}(\cdot )=\frac{1}{b_{n}}\Omega (\cdot /b_{n})\) and bandwidth sequence \(0< b_{n}\rightarrow 0\).
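The three special cases above can be sketched as plug-in rules. In the sketch below, `mhat` stands for the fitted values \(S_{i}Y_{i}+\widetilde{\xi }_{i}^{\top }\widehat{\beta }_{n}\) and `phat` for \(\widehat{P}_{n}(Z_{i})\), both assumed to be computed already; all names are illustrative.

```python
import numpy as np

def response_mean_estimators(delta, y, mhat, phat):
    """Three plug-in estimators of theta = E(Y) from the general class.

    delta : 0/1 missingness indicators (1 means Y observed)
    y     : responses, arbitrary where delta == 0
    mhat  : fitted values for every subject (already computed elsewhere)
    phat  : estimated propensities P_hat(Z_i)
    """
    delta, y, mhat, phat = map(np.asarray, (delta, y, mhat, phat))
    # P_n^*(z) = infinity: marginal average (average the fitted values)
    theta1 = np.mean(mhat)
    # P_n^*(z) = 1: regression imputation (keep observed Y, impute the rest)
    theta2 = np.mean(delta * y + (1 - delta) * mhat)
    # P_n^*(z) = P_hat(z): augmented inverse probability weighting
    theta3 = np.mean(delta / phat * y + (1 - delta / phat) * mhat)
    return theta1, theta2, theta3
```

When no response is missing and the propensities equal one, the last two estimators collapse to the sample mean of Y, which is a convenient sanity check.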
2.2 Jackknife empirical likelihood
To avoid estimating the covariance matrix, this subsection adopts the jackknife empirical likelihood method to construct confidence regions for θ. Let \(\widehat{\beta }_{n,-i}\) be the estimator of β when the ith observation is deleted, that is,
Recalling the definitions of \(\widehat{\theta }_{n}^{(k)}\) (\(k=1,2,3\)) in (2.4)–(2.6), one can rewrite \(\widehat{\theta }_{n}^{(1)}=\frac{1}{n} \sum_{i=1}^{n} \{S_{i}Y_{i}+ \widetilde{\xi }_{i}^{\top }\widehat{\beta }_{n} \}\), \(\widehat{\theta }_{n}^{(2)}= \frac{1}{n}\sum_{i=1}^{n} \{\delta _{i}Y_{i}+(1-\delta _{i}) (S_{i}Y_{i}+ \widetilde{\xi }_{i}^{\top }\widehat{\beta }_{n} ) \}\) and \(\widehat{\theta }_{n}^{(3)}=\frac{1}{n}\sum_{i=1}^{n} \{ \frac{\delta _{i}}{\widehat{P}_{n}(Z_{i})}Y_{i}+ (1- \frac{\delta _{i}}{\widehat{P}_{n}(Z_{i})} ) (S_{i}Y_{i}+ \widetilde{\xi }_{i}^{\top }\widehat{\beta }_{n} ) \} \). Let \(\widehat{\theta }_{n,-i}^{(k)}\) be the estimator of θ when the ith observation is deleted for \(k=1,2,3\), which are defined by \(\widehat{\theta }_{n,-i}^{(1)}=\frac{1}{n-1}\sum_{j \neq i}^{n} \{S_{j}Y_{j}+\widetilde{\xi }_{j}^{\top }\widehat{\beta }_{n,-i} \}\), \(\widehat{\theta }_{n,-i}^{(2)}=\frac{1}{n-1}\sum_{j \neq i}^{n} \{ \delta _{j}Y_{j}+(1-\delta _{j}) (S_{j}Y_{j}+\widetilde{\xi }_{j}^{\top } \widehat{\beta }_{n,-i} ) \}\) and \(\widehat{\theta }_{n,-i}^{(3)}=\frac{1}{n-1}\sum_{j \neq i}^{n} \{ \frac{\delta _{j}}{\widehat{P}_{n,-i}(Z_{j})}Y_{j}+ (1- \frac{\delta _{j}}{\widehat{P}_{n,-i}(Z_{j})} ) (S_{j}Y_{j}+ \widetilde{\xi }_{j}^{\top }\widehat{\beta }_{n,-i} ) \} \), where \(\widehat{P}_{n,-i}(\cdot )\) is the estimator of \(P(\cdot )\) when the ith observation is deleted, that is, \(\widehat{P}_{n,-i}(z)=\sum_{j\neq i}^{n}\delta _{j}\Omega ( \frac{z-Z_{j}}{b_{n}})/\sum_{j\neq i}^{n}\Omega ( \frac{z-Z_{j}}{b_{n}})\). Then the ith jackknife pseudo-samples are \(\widehat{\theta }_{J_{i}}^{(k)}=n\widehat{\theta }_{n}^{(k)}-(n-1) \widehat{\theta }_{n,-i}^{(k)}\), and the jackknife estimators of θ are defined as follows:
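The pseudo-sample construction \(\widehat{\theta }_{J_{i}}^{(k)}=n\widehat{\theta }_{n}^{(k)}-(n-1)\widehat{\theta }_{n,-i}^{(k)}\) is generic; a minimal sketch with illustrative names, numpy assumed:

```python
import numpy as np

def jackknife_pseudo(theta_full, theta_loo):
    """Pseudo-values theta_Ji = n*theta_hat - (n-1)*theta_hat_{-i};
    their average is the jackknife estimator theta_J."""
    theta_loo = np.asarray(theta_loo, dtype=float)
    n = theta_loo.size
    pseudo = n * theta_full - (n - 1) * theta_loo
    return pseudo, pseudo.mean()
```

For the sample mean, the pseudo-values reduce to the observations themselves, which is a quick way to check an implementation.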
Hence, the following jackknife empirical likelihoods of θ are constructed based on the jackknife pseudo-samples:
Using the Lagrange multipliers, we get the jackknife empirical log-likelihood ratio functions
where λ is the solution to the equation
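A numerical sketch of this Lagrange step: given the pseudo-samples, solve \(\frac{1}{n}\sum_{i}(\widehat{\theta }_{J_{i}}-\theta )/\{1+\lambda (\widehat{\theta }_{J_{i}}-\theta )\}=0\) for λ (the left side is monotone in λ, so bisection suffices) and return \(l(\theta )=2\sum_{i}\log \{1+\lambda (\widehat{\theta }_{J_{i}}-\theta )\}\). The solver below is an illustrative sketch, not the authors' code.

```python
import numpy as np

def jel_logratio(pseudo, theta, tol=1e-10):
    """Jackknife empirical log-likelihood ratio l(theta) via bisection on
    the Lagrange multiplier; assumes theta is inside the pseudo-value range."""
    z = np.asarray(pseudo, dtype=float) - theta
    if z.min() >= 0 or z.max() <= 0:
        return np.inf                    # theta outside the convex hull
    lo = -1.0 / z.max() + 1e-12          # keep 1 + lambda*z_i > 0 for all i
    hi = -1.0 / z.min() - 1e-12
    g = lambda lam: np.mean(z / (1.0 + lam * z))   # decreasing in lambda
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * z))
```

At \(\theta =\) the mean of the pseudo-samples, λ = 0 solves the equation and the ratio vanishes; the ratio grows as θ moves away and is infinite outside the convex hull of the pseudo-values.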
3 Main results
Throughout this paper, let C denote finite positive constants, whose values may change in different scenarios.
(1) Let \(\mu _{k}=\int u^{k}K(u)\,du\) and \(\gamma _{n}=\{\log n/(nh_{n})\}^{1/2}+h_{n}^{2}\).
(2) Let \(A=E[\delta _{1}(X_{1}-W_{1}^{\top }\Gamma ^{-1}(U_{1})\Phi (U_{1}))]\), \(\Sigma _{1}=E[\xi _{1}-W_{1}^{\top }\Gamma ^{-1}(U_{1})\Phi (U_{1})]\) and \(\Sigma _{2}=E[(1-\delta _{1})(\xi _{1}-W_{1}^{\top }\Gamma ^{-1}(U_{1}) \Phi (U_{1}))]\).
(3) Let \(\Gamma (U_{1})=E[\delta _{1} W_{1}W_{1}^{\top }|U_{1}]\), \(\Phi (U_{1})=E[\delta _{1} X_{1}W_{1}|U_{1}]\),
$$\begin{aligned}& \begin{aligned} \Lambda _{1}={}&E \bigl\{ X_{1}^{\top }\beta +W_{1}^{\top }\alpha (U_{1})- \theta \bigr\} ^{2} \\ &{} +E \bigl\{ e_{1}^{\top }\beta +A^{-1}\Sigma _{1}\delta _{1}\bigl(\xi _{1}^{\top }-W_{1}^{\top } \Gamma ^{-1}(U_{1})\Phi (U_{1})\bigr) \bigl( \varepsilon _{1}-e_{1}^{\top }\bigr)+\Sigma _{e}\beta \bigr\} ^{2}, \end{aligned} \\& \begin{aligned} \Lambda _{2}={}&E \bigl\{ X_{1}^{\top }\beta +W_{1}^{\top }\alpha (U_{1})- \theta \bigr\} ^{2} \\ &{} +E \bigl\{ (1-\delta _{1})e_{1}^{\top }\beta +\delta _{1}\varepsilon _{1}+A^{-1} \Sigma _{2}\delta _{1}\bigl(\xi _{1}^{\top }-W_{1}^{\top } \Gamma ^{-1}(U_{1}) \Phi (U_{1})\bigr) \bigl( \varepsilon _{1}-e_{1}^{\top }\bigr) \\ &{}+\Sigma _{e}\beta \bigr\} ^{2}, \end{aligned} \\& \Lambda _{3}=E \bigl\{ X_{1}^{\top }\beta +W_{1}^{\top }\alpha (U_{1})- \theta \bigr\} ^{2}+E \biggl\{ \frac{\delta _{1}}{P(Z_{1})}\varepsilon _{1}+e_{1}^{\top } \beta \biggr\} ^{2}. \end{aligned}$$
In order to formulate the main results, we need to impose the following assumptions.
(A1) The random variable U has bounded support \(\mathcal{U}\) and its density function \(g(\cdot )\) is Lipschitz continuous and bounded away from zero. The density function of Z, \(f(z)\), is bounded away from zero and has bounded continuous second derivatives.
(A2) The matrix \(\Gamma (U_{1})\) is nonsingular for each \(U_{1}\in \mathcal{U}\). \(\Gamma (U_{1})\), \(E(\delta _{1} X_{1}X_{1}^{\top }|U_{1})\) and \(\Phi (U_{1})\) are Lipschitz continuous.
(A3) There is some \(s>2\) such that \(E(\|X_{1}\|^{2s}|U_{1})<\infty \) a.s., \(E(\|W_{1}\|^{2s}|U_{1})<\infty \) a.s., \(E(\|\xi _{1}\|^{2s})<\infty \) and \(E(\|\varepsilon _{1}\|^{2s}|X_{1},W_{1})<\infty \) a.s.
(A4) The coefficient functions \(\{\alpha _{j}(\cdot ),j=1,2,\ldots ,q\}\) have continuous second derivatives.
(A5) \(P(z)\) has bounded partial derivatives up to order 2 almost surely and \(\inf_{z} P(z)>0\).
(A6) \(K(t)\) is a bounded kernel function of order 2 with bounded support, and has bounded derivatives up to order 2 almost surely.
(A7) \(\Omega (\cdot )\) is a bounded kernel function of order \(r\ (>2)\) with bounded support and has bounded partial derivatives.
(A8) The bandwidths \(h_{n}\) and \(b_{n}\) satisfy \(nh_{n}^{8}\to 0\), \(nh_{n}^{2}/(\log n)^{2}\rightarrow \infty \), \(nb_{n}^{2(p+q+1)}/\log n\rightarrow \infty \) and \(nb_{n}^{2r}\to 0\).
Remark 3.1
Assumptions (A1)–(A4) are standard conditions, which are commonly used in the literature; see Fan and Huang [4], Liu and Liang [12]. Assumption (A5) is commonly imposed in missing-data analysis; see Wang et al. [20]. Assumptions (A6) and (A7) are used in the study of nonparametric kernel estimators. Assumption (A8) specifies the relationship between the sample size and the bandwidths.
We establish the asymptotic normality of the profile least squares estimators and of the jackknife estimators of the response mean in Theorems 3.1 and 3.2. We also give the asymptotic distributions of \(l^{(k)}(\theta )\) for \(k=1,2,3\) in Theorem 3.3 and construct the confidence regions of θ.
Theorem 3.1
Suppose that Assumptions (A1)–(A8) hold. Then, for \(k=1,2,3\), we have
Theorem 3.2
Suppose that the assumptions of Theorem 3.1 hold. Then, for \(k=1,2,3\), we have \(\sqrt{n}(\widehat{\theta }_{J}^{(k)}-\theta )=\sqrt{n}( \widehat{\theta }_{n}^{(k)}-\theta )+o_{p}(1)\). Further, we have \(\sqrt{n}(\widehat{\theta }_{J}^{(k)}-\theta )\stackrel{\mathcal{D}}{\to } N(0, \Lambda _{k})\).
Theorem 3.3
Suppose that the assumptions of Theorem 3.1 hold. If θ is the true value, then for \(k=1,2,3\) we have
where \(\chi _{1}^{2}\) is a standard chi-square random variable with 1 degree of freedom.
Remark 3.2
From Theorem 3.3, it follows immediately that an approximate \(1-\tau \) confidence region for θ is given by \(I_{\tau }=\{\theta : l^{(k)}(\theta )\leq \chi ^{2}_{1}(\tau )\}\), where \(\chi ^{2}_{1}(\tau )\) is the upper τ-quantile of the \(\chi ^{2}_{1}\) distribution. In view of Theorem 3.2, one can also construct confidence regions for θ by estimating the variance \(\Lambda _{k}\). The jackknife empirical likelihood method does not require estimation of the asymptotic variance, which makes it more convenient than the normal approximation method. This is also exhibited in the simulation study.
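As an illustration of Remark 3.2, the sketch below inverts the jackknife empirical log-likelihood ratio over a grid to obtain \(I_{\tau }\), and places the normal-approximation interval next to it. The compact λ-solver and all names are illustrative; numpy and scipy are assumed.

```python
import numpy as np
from scipy.stats import chi2, norm
from scipy.optimize import brentq

def jel_stat(pseudo, theta):
    """l(theta) for the jackknife EL (compact re-derivation of Sect. 2.2)."""
    z = np.asarray(pseudo, dtype=float) - theta
    if z.min() >= 0 or z.max() <= 0:
        return np.inf                       # theta outside the convex hull
    lam = brentq(lambda t: np.mean(z / (1.0 + t * z)),
                 -1.0 / z.max() + 1e-10, -1.0 / z.min() - 1e-10)
    return 2.0 * np.sum(np.log1p(lam * z))

def confidence_intervals(pseudo, tau=0.05):
    """JEL region {theta : l(theta) <= chi2_1(tau)} found by grid scan,
    next to the normal-approximation interval theta_J +/- z * sd/sqrt(n)."""
    pseudo = np.asarray(pseudo, dtype=float)
    n = pseudo.size
    cutoff = chi2.ppf(1 - tau, df=1)
    grid = np.linspace(pseudo.min(), pseudo.max(), 2000)
    inside = [t for t in grid if jel_stat(pseudo, t) <= cutoff]
    jel_ci = (min(inside), max(inside))
    half = norm.ppf(1 - tau / 2) * pseudo.std(ddof=1) / np.sqrt(n)
    na_ci = (pseudo.mean() - half, pseudo.mean() + half)
    return jel_ci, na_ci
```

On real data the scan need only cover the convex hull of the pseudo-values, since the ratio is infinite outside it.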
4 Simulation study
In this section, we carry out some simulations to demonstrate the finite sample performance of the profile least squares estimators and jackknife estimators by comparing their biases and mean square errors (MSE). Besides, we compare the jackknife empirical likelihood (JEL) method with the normal approximation (NA) method in terms of coverage probability (CP) and average interval length (AL).
The data are generated from the following PLVC EV model:
where \(X_{i1}\sim N(0,1)\), \(X_{i2}\sim N(0,1)\), \(W_{i}\sim N(0,1)\), \(U_{i}\sim U(0,1)\), \(\beta _{1}=1\), \(\beta _{2}=2\), \(\alpha (u)=2\sin (6\pi u)\) and \(\varepsilon _{i}\sim N(0,1)\). The measurement errors \(e_{i}\sim N(0,\Sigma _{e})\). To represent different levels of measurement error, we take \(\Sigma _{e}\) to be \(\Sigma _{e1}=\operatorname{diag}(0.25,0.25)\) and \(\Sigma _{e2}=\operatorname{diag}(0.5,0.5)\) in the simulations, respectively. The kernel functions are \(K(u)=\frac{3}{4}(1-u^{2})I(|u|\leq 1)\) and \(\Omega (x,w,u)=\Omega _{1}(x)\Omega _{2}(w)\Omega _{3}(u)\), where \(\Omega _{1}(t)=\Omega _{2}(t)=\Omega _{3}(t)=\frac{15}{16}(1-t^{2})^{2}I(|t| \leq 1)\). The bandwidths are \(h_{n}=n^{-1/5}\) and \(b_{n}=n^{-1/3}\).
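One draw from this simulation design can be sketched as follows; the function name is ours and numpy is assumed.

```python
import numpy as np

def generate_data(n, sigma_e=0.25, seed=0):
    """Draw one sample from the simulation design of Sect. 4:
    Y_i = X_i1*beta1 + X_i2*beta2 + alpha(U_i)*W_i + eps_i, xi_i = X_i + e_i,
    with beta = (1, 2), alpha(u) = 2*sin(6*pi*u), Sigma_e = sigma_e * I_2."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 2))          # (X_i1, X_i2) ~ N(0, 1)
    w = rng.standard_normal(n)               # W_i ~ N(0, 1)
    u = rng.uniform(0.0, 1.0, n)             # U_i ~ U(0, 1)
    eps = rng.standard_normal(n)             # eps_i ~ N(0, 1)
    e = rng.standard_normal((n, 2)) * np.sqrt(sigma_e)   # measurement error
    beta = np.array([1.0, 2.0])
    alpha = 2.0 * np.sin(6.0 * np.pi * u)    # varying coefficient
    y = x @ beta + alpha * w + eps
    xi = x + e                               # observed surrogate for X
    return y, xi, w, u, x
```

The true X is returned only so that a simulation can compare estimators against the error-free fit; an analysis of real data would see only \((Y,\xi ,W,U)\).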
We repeatedly generate 500 Monte Carlo random samples of sizes 50, 100, 150 and 60, 90, 120 under each of the following six cases.
- Case 1: \(P(x,w,u)=1/[1+\exp (-x_{1}-x_{2}-w-u-5.5)]\);
- Case 2: \(P(x,w,u)=1/[1+\exp (-x_{1}-0.5x_{2}-w-u-2)]\);
- Case 3: \(P(x,w,u)=1/[1+\exp (-0.75x_{1}-x_{2}-w-u-1)]\);
- Case 4: \(P(x,w,u)=1/[1+\exp (-0.5x_{1}-0.5x_{2}-5w-5.5u-1)]\);
- Case 5: \(P(x,w,u)=1-1/[1+\exp (-0.5x_{1}-0.5x_{2}-5w-5.5u-1)]\);
- Case 6: \(P(x,w,u)=1-1/[1+\exp (-x_{1}-x_{2}-w-2.4u-0.5)]\).
The average missing rates (MR) corresponding to the above six cases are 10%, 20%, 30%, 45%, 55% and 65%, respectively.
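The missingness indicators can be generated by drawing \(\delta _{i}\sim \operatorname{Bernoulli}(P(X_{i1},X_{i2},W_{i},U_{i}))\). The sketch below does this for the Case 1 propensity and reports the empirical missing rate of the draw; the realized rate depends on the covariate design, and all names are illustrative.

```python
import numpy as np

def propensity_case1(x1, x2, w, u):
    """Selection probability of Case 1:
    P(x, w, u) = 1 / (1 + exp(-x1 - x2 - w - u - 5.5))."""
    return 1.0 / (1.0 + np.exp(-x1 - x2 - w - u - 5.5))

# draw delta ~ Bernoulli(P) and compute the empirical missing rate
rng = np.random.default_rng(1)
n = 200000
x1, x2, w = rng.standard_normal((3, n))
u = rng.uniform(0.0, 1.0, n)
p = propensity_case1(x1, x2, w, u)
delta = rng.uniform(0.0, 1.0, n) < p        # delta = 1 means Y observed
missing_rate = 1.0 - delta.mean()
```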
In Tables 1–2, we report the biases and MSEs of \(\widehat{\theta }_{n}^{(k)}\) and \(\widehat{\theta }_{J}^{(k)}\) for \(k=1,2,3\) to evaluate their finite sample performance. The simulation results lead to the following conclusions. Larger MRs and/or measurement errors produce larger biases and MSEs. The biases and MSEs decrease as the sample size increases. Both the biases and the MSEs of \(\widehat{\theta }_{J}^{(k)}\) are smaller than those of \(\widehat{\theta }_{n}^{(k)}\) under the same settings; in other words, the jackknife estimators \(\widehat{\theta }_{J}^{(k)}\) perform better than \(\widehat{\theta }_{n}^{(k)}\). Besides, the augmented inverse probability estimator \(\widehat{\theta }_{n}^{(3)}\) performs best and \(\widehat{\theta }_{n}^{(1)}\) performs worst; the corresponding jackknife estimators behave similarly.
In Tables 3–4, we report the CPs and ALs of the JEL and NA methods for the response mean θ. The CPs of both methods decrease as the MRs and measurement errors increase, and increase as the sample size increases. Besides, the JEL method outperforms the NA method in terms of coverage probability: the CPs of the JEL method are larger than those of the NA method under the same settings. For both methods, the ALs become longer as the MRs, measurement errors and confidence levels increase, and become shorter as the sample size increases. The ALs of the JEL method are larger than those of the NA method under the same settings.
5 Conclusion
In this paper, we focus on the response mean of the PLVC EV model with missing response at random. Inspired by Wang et al. [20], we propose the marginal average estimator, the regression imputation estimator and the augmented inverse probability estimator of the response mean to deal with the missing response variable. In order to construct confidence regions for the response mean, we define the corresponding jackknife estimators and establish the jackknife empirical log-likelihood ratio functions of the response mean. Meanwhile, the consistency and asymptotic normality of the estimators are proved under some assumptions. We also establish the asymptotic chi-square distribution of the jackknife empirical log-likelihood ratio functions and construct confidence regions for the response mean. Finally, a simulation study is conducted to compare the jackknife empirical likelihood method with the normal approximation method in terms of coverage probabilities and average interval lengths, and the proposed estimators are compared in terms of their biases and mean square errors.
References
Cheng, P.E.: Nonparametric estimation of mean functionals with data missing at random. J. Am. Stat. Assoc. 89, 81–87 (1994)
Fan, G.L., Liang, H.Y., Shen, Y.: Penalized empirical likelihood for high-dimensional partially linear varying coefficient model with measurement errors. J. Multivar. Anal. 147, 183–201 (2016)
Fan, G.L., Liang, H.Y., Zhu, L.X.: Penalized profile least squares-based statistical inference for varying coefficient partially linear errors-in-variables models. Sci. China Math. 61, 1677–1694 (2018)
Fan, J., Huang, T.: Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11(6), 1031–1057 (2005)
Gong, Y., Peng, L., Qi, Y.: Smoothed jackknife empirical likelihood method for ROC curve. J. Multivar. Anal. 101, 1520–1531 (2010)
He, X.X., Feng, X.T., Zhao, X.: Semiparametric partially linear varying coefficient models with panel count data. Lifetime Data Anal. 23(3), 439–466 (2017)
Huang, Z., Zhang, R.: Empirical likelihood for nonparametric parts in semiparametric varying-coefficient partially linear models. Stat. Probab. Lett. 79, 1798–1808 (2009)
Jing, B.Y., Yuan, J., Zhou, W.: Jackknife empirical likelihood. J. Am. Stat. Assoc. 104, 1224–1232 (2009)
Kai, B., Li, R., Zou, H.: New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann. Stat. 39, 305–332 (2011)
Liang, H.: Partially linear models with missing response variables and error-prone covariates. Biometrika 94(1), 185–198 (2007)
Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (1987)
Liu, A.A., Liang, H.Y.: Jackknife empirical likelihood of error variance in partially linear varying-coefficient errors-in-variables models. Stat. Pap. 58(1), 1–28 (2017)
Owen, A.B.: Empirical likelihood ratio confidence intervals for a single function. Biometrika 75, 237–249 (1988)
Owen, A.B.: Empirical likelihood ratio confidence regions. Ann. Stat. 18(1), 90–120 (1990)
Peng, L., Qi, Y., Van Keilegom, I.: Jackknife empirical likelihood method for copulas. Test 21, 74–92 (2012)
Rao, J.N.K.: On variance estimation with imputed survey data (with discussion). J. Am. Stat. Assoc. 91, 499–520 (1996)
Shen, Y., Liang, H.Y.: Quantile regression for partially linear varying-coefficient model with censoring indicators missing at random. Comput. Stat. Data Anal. 117, 1–18 (2017)
Tang, N.S., Zhao, P.Y.: Empirical likelihood-based inference in nonlinear regression models with missing responses at random. Statistics 47(4–6), 1141–1159 (2013)
Wang, Q.: Statistical estimation in partial linear models with covariate data missing at random. Ann. Inst. Stat. Math. 61(1), 47–84 (2009)
Wang, Q., Linton, O., Härdle, W.: Semiparametric regression analysis with missing response at random. J. Am. Stat. Assoc. 99(466), 334–345 (2004)
Wang, Q.H., Rao, J.N.K.: Empirical likelihood for linear regression models under imputation for missing responses. Can. J. Stat. 29, 597–608 (2001)
Wang, Q.H., Rao, J.N.K.: Empirical likelihood-based inference under imputation with missing response. Ann. Stat. 30, 896–924 (2002)
Wang, Q.H., Rao, J.N.K.: Empirical likelihood-based inference in linear models with missing data. Scand. J. Stat. 29, 563–576 (2002)
Xu, H.X., Fan, G.L., Wu, C.X., Chen, Z.L.: Statistical inference for varying-coefficient partially linear errors-in-variables models with missing data. Commun. Stat., Theory Methods 48(22), 5621–5636 (2019)
Xue, L.: Empirical likelihood for linear models with missing responses. J. Multivar. Anal. 100(7), 1353–1366 (2009)
Yang, H., Zhao, Y.: Smoothed jackknife empirical likelihood inference for ROC curves with missing data. J. Multivar. Anal. 140, 123–138 (2015)
Yates, F.: The analysis of replicated experiments when the field results are incomplete. Emp. J. Exp. Agric. 1, 129–142 (1933)
You, J.H., Chen, G.: Estimation of a semiparametric varying-coefficient partially linear errors-in-variables model. J. Multivar. Anal. 97(2), 324–341 (2006)
You, J.H., Zhou, Y.: Empirical likelihood for semiparametric varying-coefficient partially linear regression models. Stat. Probab. Lett. 76(4), 412–422 (2006)
Yu, X., Zhao, Y.: Jackknife empirical likelihood inference for the accelerated failure time model. Test 28, 269–288 (2019)
Acknowledgements
The authors thank the editor, the associate editor and two anonymous referees for their constructive comments and suggestions.
Availability of data and materials
This paper did not use real data. Simulation data were generated using Matlab.
Funding
This work was supported by the China Postdoctoral Science Foundation (2019M651422), the National Natural Science Foundation of China (71701127) and the Key Project of National Social and Scientific Fund Program (18ZDA052).
Author information
Authors and Affiliations
Contributions
YZ gave the framework of the article and completed the theoretical proof of the article. CW completed the simulation analysis. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Appendix
Appendix
To prove Theorems 3.1–3.3, we need the following lemmas.
Lemma A.1
Suppose that Assumptions (A1)–(A8) hold, then, as \(n\rightarrow \infty \), we have
Proof
The proof of Lemma A.1 is similar to that of Lemma A.2 in You and Chen [28]. □
Lemma A.2
Suppose that Assumptions (A1)–(A8) hold, then we have
Proof
The proof of Lemma A.2 is similar to Eq. (31) in Wang [19]. □
Lemma A.3
Suppose that Assumptions (A1)–(A8) hold, then we have
Proof
Let \(A_{n}=\frac{1}{n}\sum_{i=1}^{n}\delta _{i}(\widetilde{\xi }_{i} \widetilde{\xi }_{i}^{\top }-\Sigma _{e})\) and \(e=(e_{1},\ldots ,e_{n})^{\top }\). Then we have
From (A.1), (A.3) and Assumption (A5), a simple computation yields \(S_{i}\varepsilon =W_{i}^{\top } \mathbf{1}_{q}O_{p}(\sqrt{ \frac{\log n}{nh_{n}}})\), where \(\mathbf{1}_{q}=(1,\ldots ,1)^{\top }\). Then one can get
Note that \(M_{i}-S_{i}M=W_{i}^{\top }\alpha (U_{i})O_{p}(\gamma _{n})\); then \(T_{2}=o_{p}(n^{-1/2})\). Based on (A.1), a simple calculation yields
Since e is independent of \((Y,X,W,U)\) with zero mean, it can be checked that
Hence, simple arguments suggest that
Therefore, collecting the results above, Lemma A.3 is proved. □
Lemma A.4
Suppose that Assumptions (A1)–(A8) hold, then we have
Proof
From the definition of \(\widehat{\alpha }_{n}(\cdot )\), one can get
Since \(S_{i}e=0\), we have \(D_{2}=o_{p}(1)\). Due to \(S_{i}\varepsilon =W_{i}\mathbf{1}_{q}O_{p}(\sqrt{ \frac{\log n}{nh_{n}}})\), we have \(D_{3}=o_{p}(1)\). Since \(S_{i}M-M_{i}=W_{i}^{\top }\alpha (U_{i})O_{p}(\gamma _{n})\), it follows that \(D_{4}=o_{p}(1)\). Hence, we have completed the proof of Lemma A.4. □
Lemma A.5
Suppose that Assumptions (A1)–(A8) hold, then we have
Proof
(a) We prove Lemma A.5 for \(\widehat{\theta }_{n}^{(1)}\). Note that
From the proof of (a) in Theorem 3.1, it is easy to prove
Following the fact \(\max_{1\leq i\leq n}\|\widehat{\beta }_{n}-\widehat{\beta }_{n,-i}\|=O_{p}(n^{-1})\) and \(\max_{1\leq i\leq n}|\widetilde{\xi }_{i}|=o(n^{1/(2s)})\), from Assumption (A8), we have \(\max_{1\leq i\leq n}|b_{ki}|=o_{p}(n^{-1/2})\) for \(k=2,3\). Since Theorem 3.1 implies that \(\widehat{\theta }_{n}^{(1)}-\theta =O_{p}(n^{-1/2})\), it can be checked that \(\max_{1\leq i\leq n}|b_{4i}|=o_{p}(n^{-1/2})\). A simple computation yields \(S_{i}M-M_{i}=W_{i}^{\top }\alpha (U_{i})O_{p}(\gamma _{n})\), so we have \(\max_{1\leq i\leq n}|b_{5i}|=o_{p}(n^{-1/2})\). Noting that \(S_{i}\varepsilon =W_{i}^{\top }\mathbf{1}_{q}O_{p}(\sqrt{ \frac{\log n}{nh_{n}}})\) and \(S_{i}e=0\), one can get \(\max_{1\leq i\leq n}|b_{6i}|=o_{p}(n^{-1/2})\). Hence, we have
(b) Following the definitions of \(\widehat{\theta }_{n}^{(2)}\) and \(\widehat{\theta }_{n,-i}^{(2)}\), simple computations yield
According to the proof of (b) in Theorem 3.1, one can get
By similar arguments to that of (a), we have \(\max_{1\leq i\leq n}|l_{ki}|=o_{p}(n^{-1/2})\) for \(k=2,\ldots ,7\). Hence, we have
(c) Note that
Standard calculations yield
From Lemma A.2 and the fact \(\widehat{P}_{n,-i}(z)-\widehat{P}_{n}(z)=O_{p}(n^{-1})\), by similar arguments to that of (a), we have \(\max_{1\leq i\leq n}|m_{ki}|=o_{p}(n^{-1/2})\) for \(k=2,\ldots ,8\). Hence, we have
□
Proof of Theorem 3.1
(a) We first prove Theorem 3.1 for \(\widehat{\theta }_{n}^{(1)}\). Recalling the definition of \(\widehat{\theta }_{n}^{(1)}\) in (2.4), one can write
Following Lemma A.4 in the Appendix, it can be checked that
Combining \(A_{2}\) with \(A_{3}\), and from Lemma A.3, one can get
By the central limit theorem, the proof of Theorem 3.1 for \(\widehat{\theta }_{n}^{(1)}\) is finished.
(b) We prove Theorem 3.1 for \(\widehat{\theta }_{n}^{(2)}\). In view of the definition of \(\widehat{\theta }_{n}^{(2)}\) in (2.5), by arguments similar to those for \(\widehat{\theta }_{n}^{(1)}\) in (a), we have
For \(B_{5}\), applying the same proof as of Lemma A.4, it is easy to prove
which, together with \(B_{4}\) and the verification of Lemma A.3, gives
Based on (A.7) and (A.8), it follows that
By the central limit theorem, the proof of Theorem 3.1 for \(\widehat{\theta }_{n}^{(2)}\) is completed.
(c) We prove Theorem 3.1 for \(\widehat{\theta }_{n}^{(3)}\). According to the definition of \(\widehat{\theta }_{n}^{(3)}\), from Lemma A.2, we have
For \(D_{1}\), we replace \(\widehat{P}_{n}(Z_{i})\) with its true value \(P(Z_{i})\), then
Recalling the definition of \(\widehat{P}_{n}(Z_{i})\) in Sect. 2 and Assumption (A8), we have
Under Assumptions (A1), (A5) and (A8), standard computation yields
From Assumption (A8), we have \(D_{121}=o_{p}(1)\). Similarly, \(D_{122}=o_{p}(1)\). Lemma A.2 implies \(D_{13}=o_{p}(1)\). Hence
Analogous to the arguments for \(D_{1}\), it is easy to prove
From Lemmas A.3 and A.4 and the missing mechanism, a simple computation yields \(D_{4}=o_{p}(1)\) and \(D_{5}=o_{p}(1)\). Hence, collecting the results above together with (A.9)–(A.11), one can get
By the central limit theorem, the proof of Theorem 3.1 for \(\widehat{\theta }_{n}^{(3)}\) is finished. □
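Before turning to the jackknife, the mechanism behind the inverse-probability weight \(\delta _{i}/\widehat{P}_{n}(Z_{i})\) in \(\widehat{\theta }_{n}^{(3)}\) can be illustrated with a toy simulation. Everything below (the linear model for Y, the propensity \(P(z)=0.5+0.4z\)) is an illustrative assumption, not the model of this paper; the point is only the missing-at-random identity \(E[\delta Y/P(Z)]=E[Y]\).

```python
import numpy as np

# Under MAR, E[delta * Y / P(Z)] = E[Y]: weighting each observed response
# by 1 / P(Z_i) undoes the selection bias of keeping only delta_i = 1.
# Here the true propensity P(z) is known; theta_hat_n^(3) replaces it by
# the kernel estimate P_hat_n.
rng = np.random.default_rng(1)
n = 200_000
Z = rng.uniform(0, 1, n)
Y = 2.0 + Z + rng.normal(0, 1, n)            # E[Y] = 2.5
P = 0.5 + 0.4 * Z                            # true selection probability
delta = (rng.uniform(0, 1, n) < P).astype(float)

naive = Y[delta == 1].mean()                 # biased: observed responses only
ipw = np.mean(delta * Y / P)                 # inverse-probability corrected
```

Because larger Z makes both Y and the chance of being observed larger, the complete-case mean overshoots \(E[Y]\), while the weighted mean does not.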
Proof of Theorem 3.2
(a) We first prove Theorem 3.2 for \(\widehat{\theta }_{J}^{(1)}\). In order to verify \(\sqrt{n}(\widehat{\theta }_{J}^{(1)}-\theta )=\sqrt{n}(\widehat{\theta }_{n}^{(1)}- \theta )+o_{p}(1)\), it suffices to prove that \(\widehat{\theta }_{J}^{(1)}=\widehat{\theta }_{n}^{(1)}+o_{p}(n^{-1/2})\). Recalling the definition of \(\widehat{\theta }_{J}^{(1)}\) given in Sect. 2, one can rewrite
Therefore, to obtain the desired results, we just need to prove
Following the definitions of \(\widehat{\theta }_{n}^{(1)}\) and \(\widehat{\theta }_{n,-i}^{(1)}\), we have
From the proof of (6.16) in Liu and Liang [12], it can be checked that \(\sqrt{n}\sum_{i=1}^{n}(\widehat{\beta }_{n}-\widehat{\beta }_{n,-i})=o_{p}(1)\). According to Lemma A.5, one can write
By Lemma 6.11 in Liu and Liang [12], \(\|\widehat{\beta }_{n}-\widehat{\beta }_{n,-i}\|=O_{p}(n^{-1})\). Combining this with the fact that \(\max_{1\leq i\leq n}\|\widetilde{\xi }_{i}\|=o(n^{1/(2s)})\) for \(s>2\), under Assumption (A8) we have
Hence, from (A.13)–(A.15), it is easy to prove \(\sqrt{n}\sum_{i=1}^{n}(\widehat{\theta }_{n}^{(1)}-\widehat{\theta }_{n,-i}^{(1)})=o_{p}(1)\).
(b) The estimator \(\widehat{\theta }_{n}^{(2)}\) can be rewritten as \(\widehat{\theta }_{n}^{(2)}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n} \delta _{i}(\widetilde{Y}_{i}- \widetilde{\xi }_{i}^{\top }\widehat{\beta }_{n})+\widehat{\theta }_{n}^{(1)} \). Hence, it is easy to prove
By arguments similar to those in (a), \(J_{1}=o_{p}(1)\) and \(J_{2}=o_{p}(1)\) can be proved easily. From the result (A.12), we have \(J_{3}=o_{p}(1)\). Hence, \(\sqrt{n}\sum_{i=1}^{n}(\widehat{\theta }_{n}^{(2)}-\widehat{\theta }_{n,-i}^{(2)})=o_{p}(1)\).
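The jackknife construction used throughout, \(\widehat{\theta }_{J_{i}}^{(k)}=n\widehat{\theta }_{n}^{(k)}-(n-1)\widehat{\theta }_{n,-i}^{(k)}\) with \(\widehat{\theta }_{J}^{(k)}\) the average of the pseudo-values, can be sketched for a generic statistic. The biased plug-in variance below is a stand-in example chosen because the jackknife removes its bias exactly; it is not one of the estimators of this paper.

```python
import numpy as np

def jackknife(stat, x):
    """Jackknife pseudo-values theta_{J_i} = n*theta_n - (n-1)*theta_{n,-i}
    and the jackknife estimator theta_J (their average) for a generic
    statistic `stat`; the paper applies the same construction to
    theta_hat_n^(k), k = 1, 2, 3."""
    n = len(x)
    theta_n = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])  # leave-one-out
    pseudo = n * theta_n - (n - 1) * loo
    return pseudo.mean(), pseudo

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 60)
biased_var = lambda s: np.mean((s - s.mean()) ** 2)  # biased by (n-1)/n
theta_J, pseudo = jackknife(biased_var, x)
# for this statistic, theta_J equals the unbiased sample variance exactly
```

The pseudo-values also supply the building blocks of the jackknife empirical likelihood ratio in Theorem 3.3.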
(c) Following the definition of \(\widehat{\theta }_{n}^{(3)}\), we find \(\widehat{\theta }_{n}^{(3)}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n} \frac{\delta _{i}}{\widehat{P}_{n}(Z_{i})}(\widetilde{Y}_{i}- \widetilde{\xi }_{i}^{\top }\widehat{\beta }_{n})+\widehat{\theta }_{n}^{(1)} \). Hence, simple calculation yields
Note that \(\widetilde{Y}_{j}-\widetilde{\xi }_{j}^{\top }\widehat{\beta }_{n}= \varepsilon _{j}-e_{j}^{\top }\beta +\widetilde{\xi }_{j}^{\top }(\beta - \widehat{\beta }_{n})+W_{j}^{\top }(\alpha (U_{j})-\widehat{\alpha }_{n}(U_{j}))\), then one can get
Let \(a_{ni}(z)= \frac{\Omega (\frac{z-Z_{i}}{b_{n}})}{\sum_{j=1}^{n}\Omega (\frac{z-Z_{j}}{b_{n}})}\); then, from Assumption (A7), it is easy to prove that \(a_{ni}(z)=O_{p}(n^{-1})\). Simple computation yields
Hence, applying the equation above, it follows that
which indicates \(\widehat{P}_{n,-i}(z)=\widehat{P}_{n}(z)+O_{p}(n^{-1})\). Together with Lemma A.2, one can compute
Under Assumptions (A1), (A5) and (A7), together with \(E(e_{j})=0\) and \(E(\varepsilon _{j}|Z_{j})=0\), it is easy to verify
For any random variable X, we have \(X=EX+O_{p}(\sqrt{\operatorname{Var}(X)})\). Then, from Assumption (A8), we have \(L_{111}=o_{p}(1)\). Similarly, \(L_{112}=o_{p}(1)\) and \(L_{113}=o_{p}(1)\). Analogous to the proof of \(L_{11}\), and from Lemmas A.3 and A.4, it is easy to prove \(L_{12}=o_{p}(1)\) and \(L_{13}=o_{p}(1)\). Hence, we have \(L_{1}=o_{p}(1)\). Similarly, \(L_{4}=o_{p}(1)\) and \(L_{5}=o_{p}(1)\). Note that \(\|\widehat{\beta }_{n}-\widehat{\beta }_{n,-i}\|=O_{p}(n^{-1})\) and \(\max_{1\leq i\leq n}\|\widetilde{\xi }_{i}\|=o(n^{1/(2s)})\), which together imply \(L_{i}=o_{p}(1)\) for \(i=2,3,6\). From (a), we have \(L_{7}=o_{p}(1)\). Therefore, collecting the results above, one can get \(\sqrt{n}\sum_{i=1}^{n}(\widehat{\theta }_{n}^{(3)}-\widehat{\theta }_{n,-i}^{(3)})=o_{p}(1)\). □
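The order bound \(X=EX+O_{p}(\sqrt{\operatorname{Var}(X)})\) invoked above is simply Chebyshev's inequality restated:

```latex
\[
  \mathbb{P}\Bigl(\bigl|X-EX\bigr|\ge M\sqrt{\operatorname{Var}(X)}\Bigr)
  \;\le\;\frac{\operatorname{Var}(X)}{M^{2}\operatorname{Var}(X)}
  \;=\;\frac{1}{M^{2}},
\]
```

so for any \(\epsilon >0\) the choice \(M>\epsilon ^{-1/2}\) makes the left-hand side smaller than ε, which is precisely the definition of \(X-EX=O_{p}(\sqrt{\operatorname{Var}(X)})\).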
Proof of Theorem 3.3
Let \(\eta ^{(k)}(\lambda )=\frac{1}{n}\sum_{i=1}^{n} \frac{\widehat{\theta }_{J_{i}}^{(k)}-\theta }{1+\lambda (\widehat{\theta }_{J_{i}}^{(k)}-\theta )}\) for \(k=1,2,3\). It is easy to prove
where \(S_{J}^{(k)}=\frac{1}{n}\sum_{i=1}^{n}(\widehat{\theta }_{J_{i}}^{(k)}- \theta )^{2}\) and \(R_{n}^{(k)}=\max_{1\leq i\leq n}|\widehat{\theta }_{J_{i}}^{(k)}- \theta |\). Next, we just need to verify
Theorem 3.2 implies that \(S_{J}^{(k)}=\frac{1}{n}\sum_{i=1}^{n}(\widehat{\theta }_{J_{i}}^{(k)})^{2}- \theta ^{2}+o_{p}(1)\). Since \(\sqrt{n}\sum_{i=1}^{n}(\widehat{\theta }_{n}^{(k)}-\widehat{\theta }_{n,-i}^{(k)})=o_{p}(1)\) and \(\widehat{\theta }_{J_{i}}^{(k)}=n\widehat{\theta }_{n}^{(k)}-(n-1) \widehat{\theta }_{n,-i}^{(k)}\), we have
Lemma A.5 implies that \(S_{J}^{(k)}\stackrel{\mathcal{P}}{\to }\Lambda _{k}\). Similarly to Owen [13], we derive that \(\|\lambda \|=O_{p}(n^{-1/2})\). For convenience, let \(\zeta _{i}^{(k)}=\lambda (\widehat{\theta }_{J_{i}}^{(k)}-\theta )\); then
Note that
Applying (A.17) and (A.18), it is easy to derive that \(\frac{1}{n}\sum_{i=1}^{n}(\widehat{\theta }_{J_{i}}^{(k)}-\theta ) \cdot \frac{(\zeta _{i}^{(k)})^{2}}{1+\zeta _{i}^{(k)}}=o_{p}(n^{-1/2})\). Thus we have
Let \(\rho _{i}^{(k)}=\sum_{l=3}^{\infty }\frac{(-1)^{l-1}}{l}(\zeta _{i}^{(k)})^{l}=O(( \zeta _{i}^{(k)})^{3})\), then from (A.19), one can get \(|\sum_{i=1}^{n}\rho _{i}^{(k)}|\leq C\sum_{i=1}^{n}(\zeta _{i}^{(k)})^{3}=o_{p}(n^{-1/2})\). By Taylor expansion, we have
Finally, combining (A.17) with Theorem 3.2, the proof of Theorem 3.3 is finished. □
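For clarity, the Taylor-expansion step can be displayed in one line. With \(\zeta _{i}^{(k)}=\lambda (\widehat{\theta }_{J_{i}}^{(k)}-\theta )\) and \(\rho _{i}^{(k)}\) as above,

```latex
\[
  2\sum_{i=1}^{n}\log \bigl(1+\zeta _{i}^{(k)}\bigr)
  =2\sum_{i=1}^{n}\zeta _{i}^{(k)}
  -\sum_{i=1}^{n}\bigl(\zeta _{i}^{(k)}\bigr)^{2}
  +2\sum_{i=1}^{n}\rho _{i}^{(k)},
  \qquad
  \Bigl\vert \sum_{i=1}^{n}\rho _{i}^{(k)}\Bigr\vert
  \le C\sum_{i=1}^{n}\bigl\vert \zeta _{i}^{(k)}\bigr\vert ^{3}
  =o_{p}\bigl(n^{-1/2}\bigr),
\]
```

which paraphrases the computations around (A.17)–(A.19): substituting the expansion of λ into the two leading sums and invoking Theorem 3.2 and Lemma A.5 yields the standard Wilks-type limit.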
Zou, Y., Wu, C. Jackknifing for partially linear varying-coefficient errors-in-variables model with missing response at random. J Inequal Appl 2020, 223 (2020). https://doi.org/10.1186/s13660-020-02489-4