
# Empirical likelihood inference of parameters in nonlinear EV models with censored data

## Abstract

In this paper, the authors consider nonlinear errors-in-variables regression models with right-censored data. Based on validation data, the authors use a Kaplan-Meier estimate and a kernel estimate to repair the censored response variable Y and the explanatory variable X, which is measured with error. Based on the repaired data, the authors introduce a suitable auxiliary vector to define an estimated empirical log-likelihood function of the unknown parameter, whose asymptotic distribution is a weighted sum of independent $\chi_{1}^{2}$ variables. Using this result, the authors can construct asymptotic confidence regions for β, but estimating the weights reduces the precision of these regions. The authors therefore adjust the preceding log-likelihood, and it is shown that the adjusted empirical log-likelihood has an asymptotic standard $\chi^{2}$ distribution. This result can be used to construct confidence regions for β.

## Introduction

Consider the following nonlinear model:

$$Y=g(X,\beta)+\varepsilon,\qquad \widetilde{X}=\phi(X,e),$$
(1.1)

where X is a p-variate explanatory variable, Y is a scalar response variable, $\beta=(\beta_{1},\beta_{2},\ldots,\beta_{p})^{\tau }$ is a $p\times1$ column vector of unknown regression parameters, $g(\cdot)$ is a known measurable function, and ε is a random statistical error. In this model, X is an explanatory variable which cannot be observed directly, and $\widetilde{X}$ is the observable surrogate of X, where e is a measurement error and $\phi (\cdot)$ is an unknown function. Usually, there is a complicated relationship between $\widetilde{X}$ and X, and this situation presents serious difficulties for a correct statistical analysis. One solution is to use validation data, an approach that has gained much attention from many scholars in recent years. For example, Sepanski and Lee  considered error-in-covariable nonlinear models with the help of validation data. Wang  considered empirical likelihood inference for error-in-covariable linear and partially linear models based on validation data. Xue  used validation data to explore empirical likelihood inference for nonlinear semiparametric error-in-variable models. Fang and Hu  considered a nonlinear model with error in the response and addressed the dimension reduction estimation of β with validation data.

In practice, the responses $Y_{i}$ may be censored by a censoring variable C. So we cannot observe $(X_{i},Y_{i})$; instead we observe

$$(\widetilde{X}_{i},Z_{i},\delta_{i}),\quad i=1, \ldots,n,$$

with $Z_{i}=\min(Y_{i},C_{i})$ and $\delta_{i}=I(Y_{i}\leq C_{i})$, where $I(\cdot)$ is the indicator function, so that $\delta_{i}=0$ if $Y_{i}$ is censored and $\delta_{i}=1$ otherwise. $(\widetilde{X}_{i},Z_{i},\delta_{i})$, $i=1,\ldots ,n$, are i.i.d. random samples from $(\widetilde{X},Z,\delta)$. Suppose that the censoring variable C is independent of Y. In fact, regression models with censored data have been studied extensively in the literature. For example, Buckley and James  made an unbiased transformation of the data and established a new regression model. Based on , Li and Wang  supposed the censoring variable to be independent of the covariable and applied a new data transformation to carry out empirical likelihood inference in linear models. Cheng et al.  developed empirical likelihood inference in nonlinear models with a right-censored response based on validation data.

In this paper, we consider model (1.1), where the response Y is randomly censored and the explanatory variable X is measured with error. We aim to construct a confidence region for the parameter β, using the empirical likelihood method. A typical nonparametric approach to this problem generally includes the following steps: (1) repair the incomplete data with the help of validation data and derive an estimator β̂ of β; (2) define the empirical likelihood function; (3) invert the confidence region from the limiting $\chi^{2}$ distribution. To complete these steps, we proceed as follows. First, based on the validation data, we use the Kaplan-Meier estimate and the kernel estimate to repair the censored response variable Y and the erroneously measured explanatory variable X. Second, based on the repaired data, we introduce a suitable auxiliary vector to define an estimated empirical log-likelihood function of the unknown parameter, whose asymptotic distribution is a weighted sum of $\chi_{1}^{2}$ variables. Because the weights are unknown, we give corresponding estimators of the weights. Finally, using this result we can construct asymptotic confidence regions for β. However, estimating the weights reduces the precision of the confidence regions; we therefore further adjust the preceding log-likelihood and show that the adjusted empirical log-likelihood has an asymptotic standard $\chi^{2}$ distribution.

## Results and discussion

### Definition of estimated empirical function

Suppose the primary data set $\{(\widetilde{X}_{i},Z_{i},\delta _{i})\}_{i=1}^{n}$ is independent of the validation data set $\{(\widetilde{X}_{j},X_{j})\}_{j=n+1}^{n+m}$, $E[\varepsilon|X]=0$, and $E[\varepsilon|\widetilde{X}]=0$. Further, suppose $F(\cdot)$ and $G(\cdot)$ are the distribution functions of Y and C, respectively. Because $\{Y_{i}\}$ is censored and the completely observed variable $Z_{i}$ has a different expectation from $Y_{i}$, we cannot directly use the general method to estimate β. When G is known, we define

$$Y_{iG}=\frac{\delta_{i}Z_{i}}{1-G(Z_{i}-)}.$$
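This transformation (of Koul-Susarla-Van Ryzin type) is easy to compute when G is known. The following sketch is purely illustrative; `synthetic_response` and the exponential censoring distribution are our own choices, not part of the paper.

```python
import numpy as np

def synthetic_response(z, delta, G_minus):
    """Synthetic response Y_iG = delta_i * Z_i / (1 - G(Z_i-)).

    z       : observed Z_i = min(Y_i, C_i)
    delta   : censoring indicators I(Y_i <= C_i)
    G_minus : vectorized left limit G(t-) of the (here known) censoring d.f.
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=float)
    return delta * z / (1.0 - G_minus(z))

# toy check with exponential censoring G(t) = 1 - exp(-t); G is continuous, so G(t-) = G(t)
z = np.array([0.5, 1.2, 0.3])
delta = np.array([1, 0, 1])
y_g = synthetic_response(z, delta, lambda t: 1.0 - np.exp(-t))
# censored points (delta = 0) are mapped to 0; uncensored ones are inflated by 1/(1 - G)
```

Censoring is compensated on average: the inflation factor makes $E[Y_{iG}|\widetilde{X}_{i}]=E[Y_{i}|\widetilde{X}_{i}]$, as noted below.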

Denote $m(\widetilde{X},\beta)=E[g(X,\beta)|\widetilde{X}]$. It is well known that $E[Y_{iG}|\widetilde{X}_{i}]=E[Y_{i}|\widetilde {X}_{i}]=m(\widetilde{X}_{i},\beta)$. Then, based on the completely observed data, model (1.1) can be converted to the following model:

$$Y_{iG}=m(\widetilde{X}_{i},\beta)+\eta_{i},$$
(2.1)

where $\eta_{i}=Y_{iG}-m(\widetilde{X}_{i},\beta)$. Write

\begin{aligned}& g^{(1)}(X,\beta)=\biggl(\frac{\partial}{\partial\beta}g(X,\beta)\biggr)^{\tau }= \biggl(\biggl(\frac{\partial}{\partial\beta_{1}},\ldots,\frac{\partial}{\partial \beta_{p}}\biggr)g(X,\beta) \biggr)^{\tau}, \\& m^{(1)}(\widetilde{X}_{i},\beta)=\frac{\partial}{\partial\beta }m( \widetilde{X}_{i},\beta)=E\bigl[g^{(1)}(X,\beta)| \widetilde{X}_{i}\bigr]. \end{aligned}

Supposing G is known, we introduce an auxiliary random vector

$$W_{i}(\beta)=m^{(1)}(\widetilde{X}_{i},\beta) \bigl[Y_{iG}-m(\widetilde {X}_{i},\beta)\bigr].$$
(2.2)

It is easily shown that $E[W_{i}(\beta)]=E\{m^{(1)}(\widetilde {X}_{i},\beta)E[\eta_{i}|\widetilde{X}_{i}]\}=0$, if β is the true value of the parameter.

In practice, $m(\widetilde{X},\beta)$, $m^{(1)}(\widetilde{X},\beta )$ and G are usually unknown. For establishing the empirical likelihood, their estimators need to be given first. To do this, for G, we employ the Kaplan-Meier estimator

$$\widehat{G}(t)=1-\prod^{n}_{i=1} \biggl[ \frac{n-i}{n-i+1} \biggr]^{I[Z_{(i)}\leq t,\delta_{(i)}=0]},$$

where $Z_{(1)}\leq Z_{(2)}\leq\cdots\leq Z_{(n)}$ are the order statistics of $Z_{1},Z_{2},\ldots, Z_{n}$, and $\delta_{(i)}$ are the indicators associated with $Z_{(i)}$, $i=1,\ldots,n$. Define

\begin{aligned}& \widehat{R}_{m}(\tilde{x},\beta)=\frac{1}{mh^{p}}\sum ^{n+m}_{j=n+1}g(X_{j},\beta)K \biggl( \frac{\widetilde{X}_{j}-\tilde {x}}{h} \biggr), \\& \hat{f}_{m}(\tilde{x})=\frac{1}{mh^{p}}\sum ^{n+m}_{j=n+1}K \biggl(\frac{\widetilde{X}_{j}-\tilde{x}}{h} \biggr),\qquad \widehat {R}_{m}^{(1)}(\tilde{x},\beta)=\frac{\partial}{\partial\beta } \widehat{R}_{m}(\tilde{x},\beta). \end{aligned}

Here $K(\cdot)$ is a kernel function and $h=h_{m}$ is a bandwidth tending to 0. Denote $\hat{f}_{b_{m}}(\tilde{x})=\max\{\hat {f}_{m}(\tilde{x}),b_{m}\}$, where $\{b_{m}\}$ is a sequence of positive numbers tending to zero. Using the validation data, we define the blocked estimators of $m(\widetilde{X},\beta)$ and $m^{(1)}(\widetilde{X},\beta )$ as follows:

$$\hat{m}(\tilde{x},\beta)=\frac{\widehat{R}_{m}(\tilde {x},\beta)}{\hat{f}_{b_{m}}(\tilde{x})} ,\qquad \hat{m}^{(1)}( \tilde{x},\beta)=\frac{\widehat {R}_{m}^{(1)}(\tilde{x},\beta)}{\hat{f}_{b_{m}}(\tilde{x})}.$$
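For p = 1 the blocked estimator above can be sketched numerically as follows. A Gaussian kernel is used purely for illustration (condition C4 below actually requires a bounded-support kernel of order k > p), and all names are ours.

```python
import numpy as np

def blocked_estimate(x_tilde, g_vals, x_eval, h, b_m):
    """Blocked estimator m_hat(x) = R_hat_m(x) / f_hat_{b_m}(x) from validation data.

    x_tilde : surrogate observations from the validation sample (1-d, p = 1)
    g_vals  : g(X_j, beta) evaluated on the validation sample
    x_eval  : points at which to evaluate m_hat
    h, b_m  : bandwidth and truncation bound
    """
    x_tilde = np.asarray(x_tilde, dtype=float)
    g_vals = np.asarray(g_vals, dtype=float)
    m = len(x_tilde)
    out = np.empty(len(x_eval))
    for k, x0 in enumerate(np.asarray(x_eval, dtype=float)):
        w = np.exp(-0.5 * ((x_tilde - x0) / h) ** 2) / np.sqrt(2.0 * np.pi)
        R_hat = np.sum(g_vals * w) / (m * h)   # numerator R_hat_m(x0)
        f_hat = np.sum(w) / (m * h)            # density estimate f_hat_m(x0)
        out[k] = R_hat / max(f_hat, b_m)       # truncated denominator f_hat_{b_m}
    return out

# sanity check: if g is identically 1, m_hat is 1 wherever f_hat_m exceeds b_m
x_t = np.linspace(0.0, 1.0, 50)
m_vals = blocked_estimate(x_t, np.ones(50), np.array([0.5]), h=0.2, b_m=1e-6)
```

The truncation by $b_{m}$ is what keeps the ratio stable where the estimated density of $\widetilde{X}$ is close to zero.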

Use their estimators $\hat{m}(\widetilde{X},\beta)$, $\hat{m}^{(1)}(\widetilde{X},\beta)$, and $\widehat{G}(\cdot)$ to replace the unknown functions $m(\widetilde{X},\beta)$, $m^{(1)}(\widetilde {X},\beta)$, and G in (2.2), and write

$$\widehat{W}_{i}(\beta)=\hat{m}^{(1)}(\widetilde{X}_{i}, \beta )\bigl[Y_{i\widehat{G}}-\hat{m}(\widetilde{X}_{i},\beta) \bigr].$$
(2.3)
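Computing $Y_{i\widehat{G}}$ in (2.3) requires the Kaplan-Meier estimator Ĝ defined above. A minimal numerical sketch (ties and left limits are ignored for simplicity; the function name is ours):

```python
import numpy as np

def km_censoring(z, delta):
    """Kaplan-Meier estimate G_hat of the censoring d.f. G, evaluated at each Z_j.

    Implements G_hat(t) = 1 - prod_i [(n-i)/(n-i+1)]^{I[Z_(i) <= t, delta_(i) = 0]};
    censored observations (delta = 0) play the role of 'events' for G.
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = len(z)
    order = np.argsort(z, kind="stable")
    zs, ds = z[order], delta[order]
    i = np.arange(1, n + 1)
    # multiplicative factor contributed by the i-th order statistic
    factors = np.where(ds == 0, (n - i) / (n - i + 1.0), 1.0)

    def G_hat(t):
        return 1.0 - np.prod(np.where(zs <= t, factors, 1.0))

    return np.array([G_hat(t) for t in z])

# toy sample: only the middle observation is censored
G_at_z = km_censoring([1.0, 2.0, 3.0], [1, 0, 1])
```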

It is easily proved that $E[\widehat{W}_{i}(\beta)]=o(1)$ when β is the true value. Using this, an estimated empirical log-likelihood ratio function is defined as

$$\hat{l}(\beta)=-2\max \Biggl\{ \sum^{n}_{i=1} \log(np_{i}) \bigg|p_{i}\geq0,\sum^{n}_{i=1}p_{i} \widehat{W}_{i}(\beta)=0,\sum^{n}_{i=1}p_{i}=1 \Biggr\} .$$

By introducing Lagrange multipliers, the optimal value of $p_{i}$ is $p_{i}=n^{-1}(1+\lambda^{\tau}\widehat{W}_{i}(\beta))^{-1}$, where λ is determined by

$$\frac{1}{n}\sum^{n}_{i=1} \frac{\widehat{W}_{i}(\beta)}{1+\lambda^{\tau }\widehat{W}_{i}(\beta)}=0.$$
(2.4)

So $\hat{l}(\beta)$ can be represented as

$$\hat{l}(\beta)=2\sum^{n}_{i=1}\log\bigl(1+ \lambda^{\tau}\widehat {W}_{i}(\beta)\bigr).$$
(2.5)
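Equations (2.4)-(2.5) can be solved numerically. Below is a minimal sketch for the scalar case (p = 1), using bisection on the monotone score in (2.4); the function name and toy data are ours, not the paper's.

```python
import numpy as np

def el_logratio(w, tol=1e-12):
    """Empirical log-likelihood ratio l_hat = 2 * sum_i log(1 + lam * w_i)
    for scalar auxiliary values w_i (p = 1), with lam solving equation (2.4):
    (1/n) * sum_i w_i / (1 + lam * w_i) = 0.
    Requires min(w) < 0 < max(w), i.e. 0 lies inside the convex hull of the w_i.
    """
    w = np.asarray(w, dtype=float)
    if not (w.min() < 0.0 < w.max()):
        raise ValueError("0 must lie strictly inside the convex hull of w")
    # feasible lam keeps every 1 + lam * w_i strictly positive
    lo = -1.0 / w.max() + 1e-10
    hi = -1.0 / w.min() - 1e-10

    def score(lam):           # left side of (2.4); strictly decreasing in lam
        return np.sum(w / (1.0 + lam * w))

    while hi - lo > tol:      # bisection for the root of the score
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * w))
```

When the $\widehat{W}_{i}(\beta)$ average to zero the ratio vanishes, and it grows as their mean moves away from zero, which is what makes (2.5) usable as a test statistic for β.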

Subsequently, we define the estimator of β as

$$\hat{\beta}=\arg\min_{\beta}\frac{1}{2}\sum ^{n}_{i=1}\bigl[Y_{i\widehat {G}}-\hat{m}( \widetilde{X}_{i},\beta)\bigr]^{2}.$$
(2.6)
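A hedged sketch of solving (2.6) numerically; `scipy.optimize.minimize`, the toy linear m̂, and the data are our illustrative choices, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

def beta_hat(y_g, x_tilde, m_hat, p=1):
    """Least-squares estimator (2.6): minimize (1/2) sum_i [Y_iG_hat - m_hat(X_i, beta)]^2.

    m_hat is passed in as a callable standing for the blocked regression estimate;
    a derivative-free optimizer is used since g (hence m_hat) may be nonlinear in beta.
    """
    obj = lambda b: 0.5 * np.sum((np.asarray(y_g) - m_hat(np.asarray(x_tilde), b)) ** 2)
    return minimize(obj, x0=np.zeros(p), method="Nelder-Mead").x

# toy run with a (hypothetical) linear m_hat(x, b) = b * x and noiseless data, true b = 2
x = np.array([1.0, 2.0, 3.0])
b = beta_hat(2.0 * x, x, lambda x, beta: beta[0] * x)
```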

### Construction of confidence region

Throughout this section, we use $c>0$ to represent any constant which does not depend on n and m and may take different values at each appearance. Let $M^{k}$ be the class of all continuous functions on $R^{p}$ ($k>p$), or on subdomains of $R^{p}$, whose partial derivatives $\frac{\partial^{i_{1}}}{\partial x^{i_{1}}_{1}}\frac{\partial^{i_{2}}}{\partial x^{i_{2}}_{2}}\cdots\frac{\partial ^{i_{p}}}{\partial x^{i_{p}}_{p}}\varphi(x_{1},\ldots,x_{p})$ are uniformly bounded for $0< i_{1}+\cdots+i_{p}\leq k$.

To obtain our result, we need to list the following conditions.

### Condition C

C1.:

$E\|X\|^{2}<\infty$, $EY^{2}<\infty$.

C2.:

$g(x,\beta)$ has bounded continuous partial derivatives up to order two on Γ, where Γ is the bounded support of X.

C3.:

For some k (>p), there is $m(\tilde{x},\beta)\in M^{k}$, $m^{(1)}_{s}(\tilde{x},\beta)\in M^{k}$, $s=1,2,\ldots,p$.

C4.:

$K(u)$ is a bounded nonnegative kernel function of order k ($k>p$) with bounded support.

C5.:

The density of $\widetilde{X}$, say $f(\tilde{x})$, satisfies

1. (i)

$f(\tilde{x})\in M^{k}$,

2. (ii)

there exists a positive constant sequence $b_{m}$, such that $\sqrt{m}P\{f(\tilde{x})< b_{m}\}\rightarrow0$ as $m\rightarrow\infty$.

C6.:

$mh^{2p}b_{m}^{4}\rightarrow\infty$, $mh^{2k}b_{m}^{-2}\rightarrow 0$, $h^{k-\frac{1}{2}p}b_{m}\rightarrow0$.

C7.:

$\sup_{(\tilde{x},x)}E[e^{2}|\widetilde{X}=\tilde{x},X=x]<\infty$.

C8.:

$\frac{n}{m}\rightarrow\gamma$, where $\gamma>0$ is a constant.

C9.:

For any $0\leq s<\infty$, there exists $\Gamma _{1}(s)=E[m^{(1)}(\widetilde{X},\beta)I[s< Y]]$.

C10.:
1. (i)

For any $s\leq\tau_{Q}=\inf\{t:Q(t)=1\}$, $G(s)$ and $F(s)$ have no common jumps, where $Q(t)=P(Z\leq t)$,

2. (ii)

$E{\frac{\|g^{(1)}(X,\beta)\|Y}{[(1-G(Y))(1-F(Y))]^{\frac {1}{2}}}}<\infty$,

3. (iii)

$\int^{\tau_{Q}}_{0}\|H(s)\|\frac {[1-F(s)]}{[1-F(s-)][1-G(s)]}\, dG(s)<\infty$, where $H(s)=\frac {E[m^{(1)}(\widetilde{X},\beta)Y_{G}I[s< Z]]}{[1-F(s-)][1-G(s)]}$.

C11.:

$\Sigma_{0}(\beta)=E\{[Y_{G}-m(\widetilde{X},\beta )]^{2}m^{(1)}(\widetilde{X},\beta)(m^{(1)}(\widetilde{X},\beta))^{\tau }\}=E[W_{1}(\beta)(W_{1}(\beta))^{\tau}]$, $\Xi=E[m^{(1)}(\widetilde{X}, \beta)(m^{(1)}(\widetilde{X},\beta))^{\tau }]$, $\Sigma_{1}(\beta)=\int^{+\infty}_{0}H(s)H^{\tau}(s)\bar {F}(s-)(1-\Lambda^{G}(s))\, dG(s)$, where $\Lambda^{G}(s)= \int^{s}_{-\infty }\bar{G}^{-1}(t-)\, dG(t)$, and $\Sigma_{0}(\beta)$, $\Sigma_{1}(\beta)$, and Ξ are all positive definite matrices.

### Remark 2.1

These conditions are usual assumptions for studying semiparametric models and can be satisfied. Here we only comment on condition C6: it holds if h is taken to be $h=c_{2}m^{-\frac{1}{p+k}}$ and $b_{m}=c_{1}m^{\frac{p-k}{4(p+k)}}\log m$, where $c_{1}$ and $c_{2}$ are positive constants.

### Theorem 2.1

Under Condition C, if β is the true value of the parameter, we have

$$\sqrt{n}(\hat{\beta}-\beta)\stackrel{L}{\longrightarrow}N\bigl(0,\Xi ^{-1}\Sigma\Xi^{-1}\bigr),$$

where Ξ is defined in condition C11, $\Sigma=\Sigma(\beta )=\Sigma_{0}(\beta)-\Sigma_{1}(\beta)+\Sigma_{2}(\beta)$, with $\Sigma _{0}(\beta)$, $\Sigma_{1}(\beta)$ being defined in condition C11, and $\Sigma_{2}(\beta)=\gamma E\{m^{(1)}(\widetilde{X},\beta)(m^{(1)}(\widetilde {X},\beta))^{\tau}[m(\widetilde{X},\beta)-g(X,\beta)]^{2}\}$ where γ is defined in condition C8.

### Theorem 2.2

Under Condition C, if β is the true value of the parameter, we have

$$\hat{l}(\beta)\stackrel{L}{\longrightarrow}\omega_{1}\chi ^{2}_{1,1}+\omega_{2}\chi^{2}_{1,2}+ \cdots+\omega_{p}\chi ^{2}_{1,p},$$
(2.7)

where $\chi^{2}_{1,i}$ ($1\leq i\leq p$) are independent standard $\chi ^{2}$ random variables with 1 degree of freedom, and $\omega_{i}$ ($1\leq i\leq p$) are the eigenvalues of $D(\beta)=\Sigma^{-1}_{0}(\beta)\Sigma (\beta)$.

In order to use the result of Theorem 2.2 to construct confidence regions for β, the corresponding estimators of the unknown weights $\omega_{i}$ ($1\leq i\leq p$) must be given. Denote by $\widehat{F}$ the Kaplan-Meier estimator of F, and let $Q_{n}(s)=\frac{1}{n}\sum^{n}_{i=1}I(Z_{i}\leq s)$. Using the plug-in method, we adopt the following notation:

• $\widehat{H}_{n}(s)=\frac{\frac{1}{n}\sum^{n}_{i=1}\hat{m}^{(1)}(\widetilde{X}_{i},\hat{\beta})Y_{i\widehat {G}}I(s< Z_{i})}{(1-\widehat{G}(s))(1-\widehat{F}(s-))}$,

• $\Lambda^{\widehat{G}}_{n}(s)=\int^{s}_{-\infty}\frac{1}{1-\widehat {G}(t-)}\, d\widehat{G}(t)=\frac{1}{n}\sum^{n}_{i=1}\frac{(1-\delta _{i})I(Z_{i}\leq s)}{(1-\widehat{G}(Z_{i}-))(1-Q_{n}(Z_{i}-))}$,

• $\widehat{\Sigma}_{0}(\hat{\beta})=\frac{1}{n}\sum^{n}_{i=1}[Y_{i\widehat{G}}-\hat{m}(\widetilde{X}_{i},\hat{\beta })]^{2}\hat{m}^{(1)}(\widetilde{X}_{i},\hat{\beta})(\hat {m}^{(1)}(\widetilde{X}_{i},\hat{\beta}))^{\tau}=\frac{1}{n}\sum^{n}_{i=1}\widehat{W}_{i}(\hat{\beta})\widehat{W}_{i}^{\tau}(\hat{\beta})$,

• $\widehat{\Sigma}_{1}(\hat{\beta})=\frac{1}{n}\sum^{n}_{i=1}(1-\delta _{i})\widehat{H}_{n}(Z_{i})\widehat{H}_{n}^{\tau}(Z_{i})(1-\Lambda ^{\widehat{G}}_{n}(Z_{i}))$,

• $\widehat{\Sigma}_{2}(\hat{\beta})=\frac{\gamma}{m}\sum^{n+m}_{j=n+1}\hat{m}^{(1)}(\widetilde{X}_{j},\hat{\beta})(\hat {m}^{(1)}(\widetilde{X}_{j},\hat{\beta}))^{\tau}[\hat{m}(\widetilde {X}_{j},\hat{\beta})-g(X_{j},\hat{\beta})]^{2}$,

• $\widehat{\Sigma}(\hat{\beta})=\widehat{\Sigma}_{0}(\hat{\beta })-\widehat{\Sigma}_{1}(\hat{\beta})+\widehat{\Sigma}_{2}(\hat{\beta })$.

From this, we can infer that the eigenvalues $\hat{\omega}_{i}$ ($i=1,2,\ldots,p$) of $\widehat{D}(\hat{\beta})=\widehat{\Sigma}^{-1}_{0}(\hat{\beta })\widehat{\Sigma}(\hat{\beta})$ are the corresponding estimators of $\omega_{i}$ ($i=1,2,\ldots,p$). Denote $\hat{s}=\hat{\omega}_{1}\chi^{2}_{1,1}+\hat{\omega}_{2}\chi ^{2}_{1,2}+\cdots+\hat{\omega}_{p}\chi^{2}_{1,p}$, and let $B_{f}(\cdot)$ denote the conditional distribution of ŝ given $\{(\widetilde {X}_{i},Z_{i},\delta_{i})\}^{n}_{i=1}$ and $\{(\widetilde {X}_{j},X_{j})\}^{n+m}_{j=n+1}$. Let $\hat{c}_{\alpha}$ be the $(1-\alpha)$-quantile of $B_{f}(\cdot)$; then the $1-\alpha$ confidence region for β is

$$\hat{I}_{\alpha}(\tilde{\beta})=\bigl\{ \tilde{\beta}:\hat{l}(\tilde{\beta })\leq\hat{c}_{\alpha}\bigr\} .$$

Actually, it is convenient to obtain the conditional distribution $B_{f}(\cdot)$: we can generate independent samples $\chi^{2}_{1,1},\ldots ,\chi^{2}_{1,p}$ from the $\chi^{2}_{1}$ distribution and then approximate it by the Monte Carlo simulation method.
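This Monte Carlo step might be sketched as follows (the function name and defaults are ours):

```python
import numpy as np

def mc_critical_value(omega_hat, alpha=0.05, n_rep=100_000, seed=0):
    """Monte Carlo (1 - alpha)-quantile of s_hat = sum_i omega_hat_i * chi2_{1,i},
    the weighted chi-square limit of Theorem 2.2; omega_hat are the eigenvalue
    estimates of D_hat(beta_hat)."""
    rng = np.random.default_rng(seed)
    omega_hat = np.asarray(omega_hat, dtype=float)
    draws = rng.chisquare(df=1, size=(n_rep, len(omega_hat)))  # chi2_{1,i} samples
    return np.quantile(draws @ omega_hat, 1.0 - alpha)

# with a single unit weight this recovers the chi2_1 quantile (about 3.84 at alpha = 0.05)
c_hat = mc_critical_value([1.0])
```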

### Definition of adjusted empirical function

Applying the result of Theorem 2.2 to construct the confidence region of β, we need to estimate the weights $\omega_{i}$, which reduces the accuracy of the confidence region. We therefore adjust the empirical likelihood function. Let $r(\beta)=\frac{p}{\operatorname{tr}\{D(\beta)\}}$; according to the result given by Rao and Scott , $r(\beta)\sum^{p}_{i=1}\omega_{i}\chi^{2}_{1,i}$ has an asymptotic standard $\chi ^{2}$-distribution with p degrees of freedom. From Theorem 2.2 and the consistency of $\widehat{\Sigma}(\hat{\beta})$ and $\widehat{\Sigma}_{0}(\hat {\beta})$, we find that $\hat{r}(\hat{\beta})\hat{l}(\beta)$ also has an asymptotic $\chi^{2}$-distribution with p degrees of freedom, where $\hat{r}(\hat{\beta})=\frac{p}{\operatorname{tr}\{\widehat{D}(\hat{\beta})\} }=\frac{p}{\operatorname{tr}\{\widehat{\Sigma}_{0}^{-1}(\hat{\beta})\widehat{\Sigma}(\hat{\beta })\}}$. To improve the approximation accuracy, we replace β̂ with β in $\hat{r}(\hat{\beta})$; the accuracy then depends on the values of the $\omega_{i}$. We can refine the result of Rao and Scott  and give an adjusted empirical log-likelihood ratio. Denote $\widehat{A}(\beta)=\{\sum^{n}_{i=1}\widehat{W}_{i}(\beta)\}\{\sum^{n}_{i=1}\widehat{W}_{i}(\beta)\}^{\tau}$. If we replace $\widehat{\Sigma}(\beta)$ with $\widehat{A}(\beta)$ in $\hat {r}(\beta)$, we get a new adjustment factor $\hat{\rho}(\beta )=\frac{\operatorname{tr}\{\widehat{\Sigma}^{-1}(\beta)\widehat{A}(\beta)\}}{\operatorname{tr}\{\widehat{\Sigma} _{0}^{-1}(\beta)\widehat{A}(\beta)\}}$, and the adjusted empirical log-likelihood ratio function can be defined as

$$\hat{l}_{\mathrm{ad}}(\beta)=\hat{\rho}(\beta)\hat{l}(\beta).$$
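Given the matrix estimates above, the adjustment factor ρ̂ and the resulting region membership check can be sketched as follows (`scipy` is assumed for the $\chi^{2}_{p}$ quantile; the function names are ours):

```python
import numpy as np
from scipy.stats import chi2

def adjustment_factor(Sigma_hat, Sigma0_hat, A_hat):
    """rho_hat(beta) = tr{Sigma_hat^{-1} A_hat} / tr{Sigma0_hat^{-1} A_hat}."""
    num = np.trace(np.linalg.solve(Sigma_hat, A_hat))
    den = np.trace(np.linalg.solve(Sigma0_hat, A_hat))
    return num / den

def in_adjusted_region(l_hat, rho_hat, p, alpha=0.05):
    """beta is kept in the region iff l_ad = rho_hat * l_hat <= c_alpha,
    where P(chi2_p <= c_alpha) = 1 - alpha."""
    return rho_hat * l_hat <= chi2.ppf(1.0 - alpha, df=p)
```

The payoff of the adjustment is visible here: the critical value is a fixed $\chi^{2}_{p}$ quantile, so no Monte Carlo step over estimated weights is needed.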

### Theorem 2.3

Under Condition C, if β is the true value of the parameter, we have

$$\hat{l}_{\mathrm{ad}}(\beta)\stackrel{L}{\longrightarrow}\chi^{2}_{p}.$$

Based on Theorem 2.3, $\hat{l}_{\mathrm{ad}}(\beta)$ can be used to construct a confidence region for β,

$$\hat{I}_{\mathrm{ad}}(\tilde{\beta})=\bigl\{ \tilde{\beta}:\hat{l}_{\mathrm{ad}}( \tilde{\beta})\leq c_{\alpha}\bigr\} ,$$

where $P(\chi^{2}_{p}\leq c_{\alpha})=1-\alpha$, and $P\{\beta\in\hat {I}_{\mathrm{ad}}(\tilde{\beta})\}=1-\alpha+o(1)$.

## Proofs of theorems

Before the proofs of the theorems, we introduce some preliminary results.

### Lemma 3.1

Under Condition C, for any $1\leq i\leq n$, we have

1. (i)

$E[(\hat{m}(\widetilde{X}_{i},\beta)-m(\widetilde{X}_{i},\beta ))^{2}|\widetilde{X}_{i}]\leq c(mh^{p}b^{2}_{m})^{-1}+ch^{2k}b^{-2}_{m}+cI[f(\widetilde {X}_{i})<2b_{m}]$;

2. (ii)

$E[(\hat{m}^{(1)}_{s}(\widetilde{X}_{i},\beta )-m^{(1)}_{s}(\widetilde{X}_{i},\beta))^{2}|\widetilde{X}_{i}]\leq c(mh^{p}b^{2}_{m})^{-1}+ch^{2k}b^{-2}_{m}+cI[f(\widetilde {X}_{i})<2b_{m}]$.

The proof is similar to that of Lemma 1 of Xue ; we omit the details.

### Lemma 3.2

Under Condition C, if β is the true value of the parameter, we have

$$\frac{1}{\sqrt{n}}\sum^{n}_{i=1} \widehat{W}_{i}(\beta)\stackrel {L}{\longrightarrow}N\bigl(0,\Sigma( \beta)\bigr).$$

### Proof

\begin{aligned} \frac{1}{\sqrt{n}}\sum^{n}_{i=1} \widehat{W}_{i}(\beta) =&\frac{1}{\sqrt {n}}\sum ^{n}_{i=1}\hat{m}^{(1)}(\widetilde{X}_{i}, \beta) \bigl(Y_{i\widehat {G}}-\hat{m}(\widetilde{X}_{i},\beta)\bigr) \\ =&\frac{1}{\sqrt{n}}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )\bigl[Y_{iG}-m(\widetilde{X}_{i}, \beta)\bigr] \\ &{}+\frac{1}{\sqrt{n}}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )[Y_{i\widehat{G}}-Y_{iG}] \\ &{}+\frac{1}{\sqrt{n}}\sum^{n}_{i=1} \hat{m}^{(1)}(\widetilde{X}_{i},\beta )\bigl[m( \widetilde{X}_{i},\beta)-\hat{m}(\widetilde{X}_{i},\beta) \bigr] \\ &{}+\frac{1}{\sqrt{n}}\sum^{n}_{i=1} \bigl[Y_{i\widehat{G}}-m(\widetilde {X}_{i},\beta)\bigr] \bigl[ \hat{m}^{(1)}(\widetilde{X}_{i},\beta )-m^{(1)}( \widetilde{X}_{i},\beta)\bigr] \\ = :& M_{1}+M_{2}+M_{3}+M_{4}. \end{aligned}
(3.1)

Employing C9-C11, similar to the proof of Lemma 1 in Lai et al. , we have

\begin{aligned} M_{2} =&\frac{1}{\sqrt{n}}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta ) (Y_{i\widehat{G}}-Y_{iG}) \\ =&\frac{1}{\sqrt{n}}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )\frac{\delta_{i}Z_{i}}{1-\widehat {G}(Z_{i}-)} \int_{t< Z_{i}}\frac{1-\widehat{G}(t-)}{1-G(t)}\frac{\sum^{n}_{j=1}dM_{j}(t)}{Y_{n}(t)} \\ =&\frac{1}{\sqrt{n}}\sum^{n}_{j=1} \int_{-\infty}^{\tau_{F}}\Biggl\{ \sum ^{n}_{i=1}\frac{m^{(1)}(\widetilde{X}_{i},\beta)\delta _{i}Z_{i}}{1-\widehat{G}(Z_{i}-)}I(Z_{i}>t) \Biggr\} \frac{1-\widehat {G}(t-)}{1-G(t)}\frac{dM_{j}(t)}{Y_{n}(t)}+o_{p}(1) \\ =&-\frac{1}{\sqrt{n}}\sum^{n}_{j=1} \int_{-\infty}^{\tau_{F}}\biggl\{ \int _{s>t}s\, d\Gamma_{1}(s)\biggr\} \frac{dM_{j}(t)}{(1-G(t))(1-F(t))}+o_{p}(1) \\ =&\frac{1}{\sqrt{n}}\sum^{n}_{j=1} \int_{-\infty}^{\tau _{F}}E\bigl[m^{(1)}( \widetilde{X},\beta)Y_{G}I[t< Z]\bigr] \frac {dM_{j}(t)}{(1-G(t))(1-F(t))}+o_{p}(1) \\ = :& M_{5}+o_{p}(1), \end{aligned}

where $M_{i}(t)=(1-\delta_{i})I(Z_{i}< t)-\int^{t}_{-\infty}I(C_{i}\geq s,Y_{i}>s)\, d\Lambda(s)$ and $Y_{n}(t)=\sum^{n}_{i=1}I(Z_{i}\geq t)$. The term $M_{5}$ is a martingale sequence which has a limiting normal distribution with mean 0 and covariance $\Sigma_{1}$ by the Rebolledo central limit theorem for martingales. We have

\begin{aligned} M_{3} =&\frac{1}{\sqrt{n}}\sum^{n}_{i=1} \hat{m}^{(1)}(\widetilde {X}_{i},\beta)\bigl[m( \widetilde{X}_{i},\beta)-\hat{m}(\widetilde {X}_{i},\beta) \bigr] \\ =&\frac{1}{\sqrt{n}}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )\bigl[m(\widetilde{X}_{i},\beta)- \hat{m}(\widetilde{X}_{i},\beta)\bigr] \\ &{}+\frac{1}{\sqrt{n}}\sum^{n}_{i=1}\bigl[ \hat{m}^{(1)}(\widetilde {X}_{i},\beta)-m^{(1)}( \widetilde{X}_{i},\beta)\bigr] \bigl[m(\widetilde {X}_{i}, \beta)-\hat{m}(\widetilde{X}_{i},\beta)\bigr] \\ = :& M_{31}+M_{32}. \end{aligned}

Write $m_{b}(\tilde{x},\beta)= m(\tilde{x},\beta)f(\tilde {x})f^{-1}_{b}(\tilde{x})$, $f_{b}(\tilde{x})=\max\{f(\tilde {x}),b_{m}\}$, then

\begin{aligned} M_{31} =&\frac{1}{\sqrt{n}}\sum^{n}_{i=1}m^{(1)}( \widetilde {X}_{i},\beta)\bigl[m(\widetilde{X}_{i},\beta)- \hat{m}(\widetilde {X}_{i},\beta)\bigr] \\ =&\frac{1}{\sqrt{n}}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )\bigl[m_{b}(\widetilde{X}_{i}, \beta)-\hat{m}(\widetilde{X}_{i},\beta)\bigr] \\ &{}+\frac{1}{\sqrt{n}}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )\bigl[m(\widetilde{X}_{i}, \beta)-m_{b}(\widetilde{X}_{i},\beta)\bigr] \\ = :& M_{311}+M_{312}. \end{aligned}

For any $\varepsilon>0$, we have

\begin{aligned} P\bigl(\vert M_{312}\vert >\varepsilon\bigr) \leq& P \Biggl\{ \frac{1}{\sqrt{n}}\sum^{n}_{i=1}\bigl\vert m^{(1)}(\widetilde{X}_{i},\beta)\bigr\vert \bigl\vert m( \widetilde{X}_{i},\beta )-m_{b}(\widetilde{X}_{i}, \beta)\bigr\vert >\varepsilon \Biggr\} \\ =& P \Biggl\{ \frac{1}{\sqrt{n}}\sum^{n}_{i=1} \bigl\vert m^{(1)}(\widetilde {X}_{i},\beta)\bigr\vert \bigl\vert m(\widetilde{X}_{i},\beta)\bigr\vert \biggl\vert \frac{f_{b}(\widetilde {X}_{i})-f(\widetilde{X}_{i})}{f_{b}(\widetilde{X}_{i})}\biggr\vert >\varepsilon \Biggr\} \\ \leq& P \Biggl\{ \frac{1}{\sqrt{n}}\sum^{n}_{i=1} \bigl\vert m^{(1)}(\widetilde {X}_{i},\beta)\bigr\vert \bigl\vert m(\widetilde{X}_{i},\beta)\bigr\vert I\bigl[f(\widetilde {X}_{i})< b_{m}\bigr]>\varepsilon \Biggr\} \\ \leq& \frac{1}{\varepsilon}\sqrt{n}P\bigl(f(\widetilde {X}_{i})< b_{m} \bigr)\rightarrow0. \end{aligned}

Hence, $M_{312}=o_{p}(1)$. Write

\begin{aligned}& \zeta_{m}(x)=\frac{1}{mh^{p}}\sum^{n+m}_{j=n+1} \bigl[m(\widetilde {X}_{j},\beta)-g(X_{j},\beta) \bigr]K \biggl(\frac{\widetilde {X}_{j}-x}{h} \biggr) , \\& \xi_{m}(x)=\frac{1}{mh^{p}}\sum^{n+m}_{j=n+1} \bigl[m(x,\beta)-m(\widetilde {X}_{j},\beta)\bigr]K \biggl( \frac{\widetilde{X}_{j}-x}{h} \biggr) , \\& \phi_{m}(x)=\bigl[f(x)\hat{f}_{b_{m}}(x)-f_{b}(x) \hat {f}_{m}(x)\bigr]f^{-2}_{b}(x) , \\& \Delta_{m}(x)=\hat{f}_{b_{m}}(x)-f_{b}(x) . \end{aligned}

A series of simple calculations yields

\begin{aligned}& M_{311} = \frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )\zeta_{m}(\widetilde{X}_{i})f^{-1}_{b}( \widetilde{X}_{i}) \\& \hphantom{M_{311} ={}}{} +\frac {1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta)\xi_{m}(\widetilde {X}_{i})f^{-1}_{b}(\widetilde{X}_{i}) \\& \hphantom{M_{311} ={}}{} +\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta)m(\widetilde{X}_{i},\beta)\phi _{m}(\widetilde{X}_{i}) \\& \hphantom{M_{311} ={}}{} +\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta)\bigl[\hat {m}(\widetilde{X}_{i}, \beta)\hat{f}_{b_{m}}(\widetilde {X}_{i})-m( \widetilde{X}_{i},\beta)f(\widetilde{X}_{i})\bigr]\Delta _{m}(\widetilde{X}_{i})f^{-2}_{b}( \widetilde{X}_{i}) \\& \hphantom{M_{311} ={}}{} +\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta)\hat {m}(\widetilde{X}_{i},\beta) \Delta^{2}_{m}(\widetilde {X}_{i})f^{-2}_{b}( \widetilde{X}_{i}) \\& \hphantom{M_{311}} = : \sqrt{n}\sum^{5}_{i=1}J_{mi}, \\& J_{m1} = \frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta)\zeta _{m}(\widetilde{X}_{i})f^{-1}_{b}( \widetilde{X}_{i}) \\& \hphantom{J_{m1}} = \frac {1}{mh^{p}}\sum^{n+m}_{j=n+1} \bigl[m(\widetilde{X}_{j},\beta)-g(X_{j},\beta )\bigr] \int m^{(1)}(x,\beta)K \biggl(\frac{\widetilde{X}_{j}-x}{h} \biggr)f^{-1}_{b}(x)f(x)\,dx \\ & \hphantom{J_{m1} ={}}{} +\frac{1}{mh^{p}}\sum^{n+m}_{j=n+1} \bigl[m(\widetilde{X}_{j},\beta )-g(X_{j},\beta)\bigr] \\ & \hphantom{J_{m1} ={}}{}\times\Biggl[\frac{1}{n}\sum^{n}_{i=1} \frac{m^{(1)}(\widetilde {X}_{i},\beta)K (\frac{\widetilde{X}_{j} -\widetilde{X}_{i}}{h} )}{f_{b}(\widetilde{X}_{i})}- \int\frac {m^{(1)}(x,\beta)K (\frac{\widetilde{X}_{j}-x}{h} )}{f_{b}(x)}f(x)\,dx \Biggr] \\ & \hphantom{J_{m1}}{} = : J_{m11}+J_{m12}, \\ & J_{m11} = \frac{1}{mh^{p}}\sum^{n+m}_{j=n+1} \bigl[m(\widetilde{X}_{j},\beta )-g(X_{j},\beta)\bigr] \int m^{(1)}(x,\beta)K \biggl(\frac{\widetilde {X}_{j}-x}{h} \biggr)\,dx \\ & \hphantom{J_{m1} ={}}{}+\frac{1}{mh^{p}}\sum^{n+m}_{j=n+1} 
\bigl[m(\widetilde{X}_{j},\beta )-g(X_{j},\beta)\bigr] \int m^{(1)}(x,\beta)K \biggl(\frac{\widetilde {X}_{j}-x}{h} \biggr) \bigl[f(x)-f_{b}(x)\bigr]f^{-1}_{b}(x)\,dx \\ & \hphantom{J_{m1}}{}= : K_{m1}+K_{m2}. \end{aligned}

By conditions C4, C5, and $\sqrt{m}h^{2k}\rightarrow0$, applying a Taylor expansion, we can prove

\begin{aligned} K_{m1} =&\frac{1}{mh^{p}}\sum^{n+m}_{j=n+1} \bigl[m(\widetilde{X}_{j},\beta )-g(X_{j},\beta)\bigr] \int m^{(1)}(\widetilde{X}_{j}+\mu h,\beta)K(\mu )h^{p}\, d\mu \\ =&\frac{1}{m}\sum^{n+m}_{j=n+1}\bigl[m( \widetilde{X}_{j},\beta )-g(X_{j},\beta)\bigr] m^{(1)}(\widetilde{X}_{j},\beta) \int K( \mu)\, d\mu+o_{p}\bigl(m^{-\frac{1}{2}}\bigr) \\ =&\frac{1}{m}\sum^{n+m}_{j=n+1}\bigl[m( \widetilde{X}_{j},\beta )-g(X_{j},\beta) \bigr]m^{(1)}(\widetilde{X}_{j},\beta)+o_{p} \bigl(m^{-\frac{1}{2}}\bigr). \end{aligned}

Notice that $f(\tilde{x})-f_{b}(\tilde{x})=0$ when $f(\tilde {x})\geq b_{m}$. By conditions C2-C4, it is easy to see that $E(\sqrt {m}K_{m2})\rightarrow0$. Hence, we have $K_{m2}=o_{p}(m^{-\frac{1}{2}})$.

Using the standard kernel estimation method, we can prove that $J_{m12}=o_{p}(m^{-\frac{1}{2}})$.

Now, let us prove $J_{mi}=o_{p}(m^{-\frac{1}{2}})$, $i=2,\ldots,5$. By conditions C4 and C5, applying a Taylor expansion, we have

\begin{aligned} E\bigl[\xi^{2}_{m}(\widetilde{X}_{i})| \widetilde{X}_{i}\bigr] =&\operatorname{Var}\bigl[\xi _{m}( \widetilde{X}_{i})|\widetilde{X}_{i}\bigr]+\bigl[E\bigl[ \xi_{m}(\widetilde {X}_{i})|\widetilde{X}_{i} \bigr]\bigr]^{2} \\ \leq&\frac{1}{mh^{p}} \int \bigl[m(\widetilde{X}_{i},\beta)-m(\widetilde{X}_{i}+h \mu,\beta )\bigr]^{2}K^{2}(\mu)\, d\mu \\ &{}+\biggl[ \int\bigl[m(\widetilde{X}_{i},\beta)-m(\widetilde{X}_{i}+h \mu,\beta )\bigr]K(\mu)\, d\mu\biggr]^{2} \\ \leq& c\bigl(mh^{p}\bigr)^{-1}+ch^{2k}. \end{aligned}
(3.2)

By conditional independence, together with (3.2), it is easily shown that $E(\sqrt{m}J_{m2})^{2}\rightarrow0$; hence $J_{m2}=o_{p}(m^{-\frac{1}{2}})$. Now

\begin{aligned}& J_{m3} = \frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )m(\widetilde{X}_{i},\beta) \phi_{m}(\widetilde{X}_{i}) \\ & \hphantom{J_{m3}} = \frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )m(\widetilde{X}_{i},\beta) \phi_{m}(\widetilde{X}_{i})I\bigl[f(\widetilde {X}_{i})< b_{m},\hat{f}_{m}(\widetilde{X}_{i})< b_{m} \bigr] \\& \hphantom{J_{m3} ={}}{}+\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )m(\widetilde{X}_{i},\beta) \phi_{m}(\widetilde{X}_{i})I\bigl[f(\widetilde {X}_{i})\geq b_{m},\hat{f}_{m}( \widetilde{X}_{i})< b_{m}\bigr] \\& \hphantom{J_{m3} ={}}{}+\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )m(\widetilde{X}_{i},\beta) \phi_{m}(\widetilde{X}_{i})I\bigl[f(\widetilde {X}_{i})< b_{m},\hat{f}_{m}(\widetilde{X}_{i}) \geq b_{m}\bigr] \\& \hphantom{J_{m3}}{}= : J_{m31}+J_{m32}+J_{m33}, \\& J_{m31} = \frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )m(\widetilde{X}_{i}, \beta)b^{-1}_{m}\bigl[f(\widetilde{X}_{i})-\hat {f}_{m}(\widetilde{X}_{i})\bigr]I\bigl[f( \widetilde{X}_{i})< b_{m},-b_{m}< \hat {f}_{m}(\widetilde{X}_{i})< b_{m}\bigr] \\& \hphantom{J_{m31} ={}}{}+\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )m(\widetilde{X}_{i}, \beta)b^{-1}_{m}\bigl[f(\widetilde{X}_{i})-\hat {f}_{m}(\widetilde{X}_{i})\bigr]I\bigl[f( \widetilde{X}_{i})< b_{m},\hat {f}_{m}( \widetilde{X}_{i})< -b_{m}\bigr] \\& \hphantom{J_{m31}}{}= : L_{m1}+L_{m2}. \end{aligned}

By condition C4, we easily get

$$\sup_{\tilde{x}}\bigl|\hat{f}_{m}(\tilde{x})-f(\tilde {x})\bigr|=O_{p}\bigl[\bigl(mh^{p}\bigr)^{-\frac{1}{2}} \bigr]+O_{p}\bigl(h^{k}\bigr).$$
(3.3)

Together with condition C5, (3.3), and the Markov inequality, it is easily proved that $L_{m1}=o_{p}(m^{-\frac{1}{2}})$. In addition, for any $\varepsilon>0$, we have

$$P\bigl(\sqrt{m}\vert L_{m2}\vert >\varepsilon\bigr)\leq P\Bigl( \sup_{\tilde{x}}\bigl\vert \hat {f}_{m}(\tilde{x})-f( \tilde{x})\bigr\vert >b_{m}\Bigr)\rightarrow0.$$

Hence, we have $J_{m31}=o_{p}(m^{-\frac{1}{2}})$. By an argument similar to $J_{m31}$, we obtain

$$\sqrt{m}|J_{m32}|\rightarrow0,\qquad \sqrt{m}|J_{m33}| \rightarrow0.$$

Hence, we have $J_{m3}=o_{p}(m^{-\frac{1}{2}})$. Now

\begin{aligned} J_{m4} =&\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta)\bigl[\hat {m}(\widetilde{X}_{i}, \beta)-m(\widetilde{X}_{i},\beta)\bigr]\Delta ^{2}_{m}( \widetilde{X}_{i})f^{-2}_{b}(\widetilde{X}_{i}) \\ &{}+\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta)\bigl[\hat {m}(\widetilde{X}_{i}, \beta)-m(\widetilde{X}_{i},\beta)\bigr]\Delta _{m}( \widetilde{X}_{i})f^{-1}_{b}(\widetilde{X}_{i}) \\ &{}+\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta )m(\widetilde{X}_{i},\beta) \Delta^{2}_{m}(\widetilde {X}_{i})f^{-2}_{b}( \widetilde{X}_{i}) \\ &{}+\frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta) m(\widetilde{X}_{i},\beta) \Delta_{m}(\widetilde{X}_{i}) \bigl[b_{m}-f( \widetilde{X}_{i})\bigr]b^{-2}_{m}I\bigl[f( \widetilde{X}_{i})< b_{m}\bigr]. \end{aligned}

Notice that

$$\sup_{\tilde{x}}\bigl\vert \Delta_{m}(\tilde{x})\bigr\vert \leq\sup_{\tilde {x}}\bigl\vert \hat{f}_{m}(\tilde{x})-f( \tilde{x})\bigr\vert =O_{p}\bigl[\bigl(mh^{p} \bigr)^{-\frac {1}{2}}\bigr]+O_{p}\bigl(h^{k} \bigr).$$
(3.4)

Together with $\sqrt{m}P(f(\widetilde{X}_{i})< b_{m})\rightarrow0$, applying conditional independence, we can obtain $J_{m4}=o_{p}(m^{-\frac {1}{2}})$.

Similarly, by conditional independence and (3.4), we can prove that $J_{m5}=o_{p}(m^{-\frac{1}{2}})$. Hence, we have

$$M_{31}=\frac{\sqrt{n}}{m}\sum^{n+m}_{j=n+1} \bigl[m(\widetilde{X}_{j},\beta )-g(X_{j},\beta) \bigr]m^{(1)}(\widetilde{X}_{j},\beta)+o_{p}(1).$$

Considering the sth ($s=1,\ldots,p$) component of $M_{32}$, from the Cauchy-Schwarz inequality and Lemma 3.1 we have

\begin{aligned} E|M_{32s}| =&\frac{1}{\sqrt{n}}E\Biggl\vert \sum ^{n}_{i=1}\bigl[\hat{m}_{s}^{(1)}( \widetilde{X}_{i},\beta)-m_{s}^{(1)}(\widetilde {X}_{i},\beta)\bigr] \bigl[m(\widetilde{X}_{i},\beta)- \hat{m}(\widetilde {X}_{i},\beta)\bigr]\Biggr\vert \\ \leq&\frac{1}{\sqrt{n}}\sum^{n}_{i=1}E \bigl\vert \bigl[\hat{m}_{s}^{(1)}(\widetilde{X}_{i}, \beta)-m_{s}^{(1)}(\widetilde {X}_{i},\beta)\bigr] \bigl[m(\widetilde{X}_{i},\beta)-\hat{m}(\widetilde {X}_{i}, \beta)\bigr]\bigr\vert \\ \leq&\frac{1}{\sqrt{n}}\sum^{n}_{i=1} \bigl\{ E\bigl[\hat{m}_{s}^{(1)}(\widetilde{X}_{i}, \beta)-m_{s}^{(1)}(\widetilde {X}_{i},\beta) \bigr]^{2}E\bigl[m(\widetilde{X}_{i},\beta)-\hat{m}( \widetilde {X}_{i},\beta)\bigr]^{2} \bigr\} ^{\frac{1}{2}} \\ =&o(1). \end{aligned}

Hence, we have $M_{32}=o_{p}(1)$ and

\begin{aligned}& M_{3}=\frac{\sqrt{n}}{m}\sum^{n+m}_{j=n+1} \bigl[m(\widetilde{X}_{j},\beta )-g(X_{j},\beta) \bigr]m^{(1)}(\widetilde{X}_{j},\beta)+o_{p} \bigl(m^{-\frac {1}{2}}\bigr)= :M_{6}+o_{p}(1), \\& M_{4} = \frac{1}{\sqrt{n}}\sum^{n}_{i=1} \bigl[Y_{i\widehat{G}}-m(\widetilde {X}_{i},\beta)\bigr] \bigl[ \hat{m}^{(1)}(\widetilde{X}_{i},\beta )-m^{(1)}( \widetilde{X}_{i},\beta)\bigr] \\& \hphantom{M_{4}} = \frac{1}{\sqrt{n}}\sum^{n}_{i=1} \bigl[Y_{iG}-m(\widetilde{X}_{i},\beta )\bigr] \bigl[ \hat{m}^{(1)}(\widetilde{X}_{i},\beta)-m^{(1)}( \widetilde {X}_{i},\beta)\bigr] \\& \hphantom{M_{4} ={}}{}+\frac{1}{\sqrt{n}}\sum^{n}_{i=1}[Y_{i\widehat{G}}-Y_{iG}] \bigl[\hat {m}^{(1)}(\widetilde{X}_{i},\beta)-m^{(1)}( \widetilde{X}_{i},\beta)\bigr] \\& \hphantom{M_{4}}= : M_{41}+M_{42}. \end{aligned}

From Wang [3, 4], we can obtain $M_{42}=o_{p}(1)$.

Write $U=\{X_{j},\widetilde{X}_{j}\}^{n+m}_{j=n+1}$ and consider the sth ($s=1,\ldots,p$) component of $M_{41}$. From conditional independence and Lemma 3.1, we have

\begin{aligned} E(M_{41s})^{2} =&\frac{1}{n}E \Biggl\{ \sum ^{n}_{i=1}\bigl[Y_{iG}-m(\widetilde {X}_{i},\beta)\bigr] \bigl[\hat{m}_{s}^{(1)}( \widetilde{X}_{i},\beta )-m_{s}^{(1)}( \widetilde{X}_{i},\beta)\bigr] \Biggr\} ^{2} \\ =&\frac{1}{n}E \Biggl\{ E \Biggl\{ \sum^{n}_{i=1} \eta_{i}^{2}\bigl[\hat {m}_{s}^{(1)}( \widetilde{X}_{i},\beta)-m_{s}^{(1)}(\widetilde {X}_{i},\beta)\bigr]^{2}\Big|U \Biggr\} \Biggr\} \\ =&\frac{1}{n}\sum^{n}_{i=1}E\bigl\{ \eta_{i}^{2}\bigl[\hat{m}_{s}^{(1)}(\widetilde {X}_{i},\beta)-m_{s}^{(1)}(\widetilde{X}_{i}, \beta)\bigr]^{2}\bigr\} \\ =&o(1). \end{aligned}

So $M_{41}=o_{p}(1)$ and hence $M_{4}=o_{p}(1)$. Therefore, we have

$$\frac{1}{\sqrt{n}}\sum^{n}_{i=1} \widehat{W}_{i}(\beta )=M_{1}+M_{5}+M_{6}+o_{p}(1).$$
(3.5)

From the central limit theorem and $\frac{n}{m}\rightarrow\gamma$, we have

\begin{aligned}& M_{1}\stackrel{L}{\longrightarrow}N\bigl(0,\Sigma_{0}( \beta)\bigr),\qquad M_{5}\stackrel {L}{\longrightarrow}N\bigl(0, \Sigma_{1}(\beta)\bigr), \end{aligned}
(3.6)
\begin{aligned}& M_{6}\stackrel{L}{\longrightarrow}N\bigl(0,\Sigma_{2}(\beta) \bigr). \end{aligned}
(3.7)

In addition, $M_{1}$ and $M_{6}$ are independent of each other, as are $M_{5}$ and $M_{6}$. Then a simple calculation yields

$$EM_{1}M_{6}=0,\qquad EM_{5}M_{6}=0, \qquad EM_{1}M_{5}\stackrel{P}{\longrightarrow}- \Sigma_{1}(\beta).$$
(3.8)

From the central limit theorem and (3.6)-(3.8), Lemma 3.2 is proved. □

### Lemma 3.3

Under Condition C, if β is the true value of the parameter, we have

$$\frac{1}{n}\sum^{n}_{i=1} \widehat{W}_{i}(\beta)\widehat{W}_{i}^{\tau }(\beta) \stackrel{P}{\longrightarrow}\Sigma_{0}(\beta).$$

### Proof

After a complex calculation, we have

$$\frac{1}{n}\sum^{n}_{i=1} \widehat{W}_{i}(\beta)\widehat{W}_{i}^{\tau }(\beta)= \frac{1}{n}\sum^{n}_{i=1}m^{(1)}( \widetilde{X}_{i},\beta) \bigl(m^{(1)}(\widetilde{X}_{i}, \beta)\bigr)^{\tau}\eta_{i}^{2}+o_{p}(1).$$

By the law of large numbers, we obtain Lemma 3.3. □

### Lemma 3.4

Let $Z\stackrel{L}{\longrightarrow}N(0,I_{p})$, where $I_{p}$ is the $p\times p$ identity matrix. Let Q be a $p\times p$ nonnegative definite matrix with eigenvalues $\omega _{1},\ldots,\omega_{p}$. Then it follows that

$$Z^{\tau}QZ\stackrel{L}{\longrightarrow}\omega_{1} \chi_{1,1}^{2}+\omega _{2}\chi_{2,1}^{2}+ \cdots+\omega_{p}\chi_{p,1}^{2}.$$
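Lemma 3.4 is the classical weighted chi-square representation of a Gaussian quadratic form. As an illustrative aside, it can be checked numerically; the matrix Q below is an arbitrary hypothetical example, not one arising from the model:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3

# An arbitrary nonnegative definite p x p matrix Q (hypothetical example).
A = rng.standard_normal((p, p))
Q = A @ A.T
omega = np.linalg.eigvalsh(Q)          # eigenvalues omega_1, ..., omega_p

# Draw Z ~ N(0, I_p) and form the quadratic form Z' Q Z for each draw.
n = 200_000
Z = rng.standard_normal((n, p))
quad = np.einsum("ij,jk,ik->i", Z, Q, Z)

# The limit in Lemma 3.4: a weighted sum of independent chi^2_1 variables.
weighted = (rng.standard_normal((n, p)) ** 2) @ omega

# Both samples share mean trace(Q) = sum(omega) and variance 2*sum(omega^2).
print(quad.mean(), weighted.mean(), omega.sum())
```

Both empirical distributions agree in moments, consistent with the lemma.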

### Proof of Theorem 2.1

We have

$$\hat{\beta}=\arg\min_{\beta}\frac{1}{2}\sum ^{n}_{i=1}\bigl[Y_{i\widehat {G}}-\hat{m}( \widetilde{X}_{i},\beta)\bigr]^{2}.$$

Let

$$L(\beta)=\frac{1}{2}\sum^{n}_{i=1} \bigl[Y_{i\widehat{G}}-\hat{m}(\widetilde {X}_{i},\beta) \bigr]^{2}.$$

Differentiating $L(\beta)$ with respect to β and setting the derivative equal to zero, we have

$$\frac{\partial L(\beta)}{\partial\beta}=\biggl(\frac{\partial L(\beta )}{\partial\beta_{1}},\ldots,\frac{\partial L(\beta)}{\partial\beta_{p}}\biggr)=0,$$

that is,

$$\sum^{n}_{i=1}\hat{m}^{(1)}( \widetilde{X}_{i},\beta)\bigl[Y_{i\widehat {G}}-\hat{m}( \widetilde{X}_{i},\beta)\bigr]=0.$$

Then we have

$$\frac{1}{n}\sum^{n}_{i=1} \hat{m}^{(1)}(\widetilde{X}_{i},\hat{\beta })\bigl[Y_{i\widehat{G}}-\hat{m}( \widetilde{X}_{i},\hat{\beta}) \bigr]=0.$$
(3.9)

Applying the Taylor expansion to $\hat{m}(\widetilde{X}_{i},\hat{\beta})$ and $\hat{m}^{(1)}(\widetilde{X}_{i},\hat{\beta})$ in (3.9), we can obtain

\begin{aligned}& \hat{m}^{(1)}(\widetilde{X}_{i},\hat{\beta})= \hat{m}^{(1)}(\widetilde {X}_{i},\beta)+o_{p}(1), \\& \hat{m}(\widetilde{X}_{i},\hat{\beta})=\hat{m}(\widetilde{X}_{i}, \beta )+\bigl[\hat{m}^{(1)}\bigl(\widetilde{X}_{i},\beta+\theta(\hat{ \beta}-\beta)\bigr)\bigr]^{\tau} (\hat {\beta}-\beta)+o_{p}(1), \\& 0=\frac{1}{n}\sum^{n}_{i=1}\bigl[ \hat{m}^{(1)}(\widetilde{X}_{i},\beta )+o_{p}(1)\bigr] \\& \hphantom{0={}}{}\times\bigl[Y_{i\widehat{G}}-\hat{m}(\widetilde {X}_{i},\beta)-\bigl[\hat{m}^{(1)}\bigl(\widetilde{X}_{i}, \beta+\theta(\hat{\beta }-\beta)\bigr)\bigr]^{\tau} (\hat{\beta}-\beta)+o_{p} \bigl(n^{-\frac{1}{2}}\bigr)\bigr] \\& \hphantom{0}=\frac{1}{n}\sum^{n}_{i=1} \hat{m}^{(1)}(\widetilde{X}_{i},\beta )\bigl[Y_{i\widehat{G}}- \hat{m}(\widetilde{X}_{i},\beta)\bigr] \\& \hphantom{0={}}{}-\frac{1}{n}\sum ^{n}_{i=1}\hat{m}^{(1)}(\widetilde{X}_{i}, \beta)\bigl[\hat{m}^{(1)}\bigl(\widetilde {X}_{i},\beta+\theta(\hat{ \beta}-\beta)\bigr)\bigr]^{\tau} (\hat{\beta}-\beta )+o_{p}\bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}

So

$$\hat{\beta}-\beta=\widehat{\Xi}^{-1}(\beta) \Biggl\{ \frac{1}{n} \sum^{n}_{i=1}\hat{m}^{(1)}( \widetilde{X}_{i},\beta)\bigl[Y_{i\widehat{G}}-\hat {m}( \widetilde{X}_{i},\beta)\bigr] \Biggr\} +o_{p} \bigl(n^{-\frac{1}{2}}\bigr),$$

where $\widehat{\Xi}(\beta)=\frac{1}{n}\sum^{n}_{i=1}\hat {m}^{(1)}(\widetilde{X}_{i},\beta)[\hat{m}^{(1)}(\widetilde{X}_{i},\beta +\theta(\hat{\beta}-\beta))]^{\tau}$ and $\theta\in(0,1)$ is a constant. This, together with Lemma 3.2, easily proves that $\widehat{\Xi}(\beta )\stackrel{P}{\longrightarrow}\Xi(\beta)$. That is,

$$\hat{\beta}-\beta=\frac{1}{n}\widehat{\Xi}^{-1}(\beta)\sum ^{n}_{i=1}\widehat {W}_{i}( \beta)+o_{p}\bigl(n^{-\frac{1}{2}}\bigr).$$

From Lemmas 3.2, 3.3, and 3.4, we can obtain

$$\sqrt{n}(\hat{\beta}-\beta)\stackrel{L}{\longrightarrow}N\bigl(0,\Xi ^{-1}\Sigma(\beta)\Xi^{-1}\bigr).$$

Theorem 2.1 is proved. □
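As a computational aside, the least-squares estimator β̂ at the start of this proof solves the estimating equation (3.9). A minimal numerical sketch, assuming a hypothetical one-parameter regression function $m(x,\beta)=e^{\beta x}$ and synthetic data standing in for the repaired responses and kernel estimates, is:

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta_true = 300, 0.7

# Synthetic stand-ins: x for X-tilde, y for the repaired response.
x = rng.uniform(0.0, 2.0, n)
y = np.exp(beta_true * x) + 0.05 * rng.standard_normal(n)

def m(x, b):            # hypothetical regression function m(x, beta)
    return np.exp(b * x)

def m1(x, b):           # its derivative in beta, playing the role of m^{(1)}
    return x * np.exp(b * x)

# Gauss-Newton iteration for L(b) = (1/2) sum (y - m(x, b))^2; the fixed
# point solves sum m^{(1)}(x_i, b)[y_i - m(x_i, b)] = 0, cf. equation (3.9).
b = 0.0
for _ in range(50):
    r = y - m(x, b)         # current residuals
    g = m1(x, b)            # gradient components
    b += (g @ r) / (g @ g)  # Gauss-Newton step
print(b)  # close to beta_true = 0.7
```

In the actual procedure $y$ and $m$ would be replaced by $Y_{i\widehat{G}}$ and the kernel estimate $\hat{m}$; the iteration itself is unchanged.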

### Proof of Theorem 2.2

Applying a Taylor expansion to equation (2.3), we have

\begin{aligned} \hat{l}(\beta) =&2\sum^{n}_{i=1} \log\bigl(1+\lambda^{\tau}\widehat {W}_{i}(\beta)\bigr) \\ =&2\sum^{n}_{i=1}\biggl[\log1+(\log x)^{(1)}\big|_{x=1}\bigl(\lambda^{\tau}\widehat {W}_{i}\bigr)+\frac{(\log x)^{(2)}|_{x=1}}{2!}\bigl(\lambda^{\tau}\widehat {W}_{i}\bigr)^{2}+\frac{(\log x)^{(3)}|_{x=\xi_{i}}}{3!}\bigl( \lambda^{\tau}\widehat {W}_{i}\bigr)^{3}\biggr] \\ &\bigl(\xi_{i}\mbox{ between } 1 \mbox{ and } 1+\lambda^{\tau}\widehat{W}_{i}\bigr) \\ =&2\sum^{n}_{i=1}\biggl[ \lambda^{\tau}\widehat{W}_{i}-\frac{1}{2}\bigl(\lambda ^{\tau}\widehat{W}_{i}\bigr)^{2}\biggr]+o_{p}(1). \end{aligned}
(3.10)

By Lemmas 3.2 and 3.3, using the result of Wang , we can prove

$$\left \{ \textstyle\begin{array}{l} \max_{1\leq i\leq n}\bigl\Vert \widehat{W}_{i}(\beta )\bigr\Vert =o_{p}(n^{\frac{1}{2}}), \\ \lambda=O_{p}(n^{-\frac{1}{2}}), \\ \frac{1}{n}\sum^{n}_{i=1}\widehat{W}_{i}(\beta)\widehat{W}^{\tau }_{i}(\beta)=O_{p}(1). \end{array}\displaystyle \right .$$
(3.11)

Again,

\begin{aligned} 0 =&\frac{1}{n}\sum^{n}_{i=1} \frac{\widehat{W}_{i}(\beta)}{1+\lambda ^{\tau}\widehat{W}_{i}(\beta)} \\ =&\frac{1}{n}\sum^{n}_{i=1}\widehat {W}_{i}(\beta)-\frac{1}{n}\sum^{n}_{i=1} \widehat{W}_{i}(\beta)\widehat {W}^{\tau}_{i}(\beta)\lambda+ \frac{1}{n}\sum^{n}_{i=1} \frac{\widehat{W}_{i}(\beta )(\lambda^{\tau}\widehat{W}_{i}(\beta))^{2}}{1+\lambda^{\tau }\widehat{W}_{i}(\beta)}. \end{aligned}
(3.12)

Multiplying (3.12) by $\lambda^{\tau}$ and using (3.11), we can obtain

$$\frac{1}{n}\sum^{n}_{i=1} \lambda^{\tau}\widehat{W}_{i}(\beta)=\frac {1}{n}\sum ^{n}_{i=1}\bigl( \lambda^{\tau}\widehat{W}_{i}(\beta )\bigr)^{2}+o_{p}\bigl(n^{-1}\bigr).$$
(3.13)

By (3.11) and (3.12), we can obtain

$$\lambda= \Biggl(\sum^{n}_{i=1} \widehat{W}_{i}(\beta)\widehat{W}^{\tau }_{i}(\beta) \Biggr)^{-1} \Biggl(\sum^{n}_{i=1} \widehat{W}_{i}(\beta) \Biggr)+o_{p}\bigl(n^{-\frac{1}{2}} \bigr).$$
(3.14)

By (3.10), (3.13), (3.14), we can obtain

\begin{aligned} \hat{l}(\beta) =&\lambda^{\tau}\Biggl(\sum ^{n}_{i=1}\widehat{W}_{i}(\beta ) \Biggr)+o_{p}(1) \\ =& \Biggl(\frac{1}{\sqrt{n}}\sum^{n}_{i=1} \widehat{W}_{i}(\beta) \Biggr)^{\tau}\widehat{ \Sigma}^{-1}_{0}(\beta) \Biggl(\frac{1}{\sqrt{n}}\sum ^{n}_{i=1}\widehat{W}_{i}(\beta) \Biggr)+o_{p}(1). \end{aligned}
(3.15)

By (3.15) and Lemmas 3.2, 3.3, and 3.4, we can obtain

$$\hat{l}(\beta)= \Biggl\{ \frac{1}{\sqrt{n}}\Sigma^{-\frac{1}{2}}(\beta )\sum ^{n}_{i=1}\widehat{W}_{i}(\beta) \Biggr\} ^{\tau}D(\beta) \Biggl\{ \frac {1}{\sqrt{n}}\Sigma^{-\frac{1}{2}}( \beta)\sum^{n}_{i=1}\widehat {W}_{i}(\beta) \Biggr\} +o_{p}(1),$$

where $D(\beta)=\Sigma^{\frac{1}{2}}(\beta)\Sigma_{0}^{-1}(\beta)\Sigma ^{\frac{1}{2}}(\beta)$.

Write $\widetilde{D}=\operatorname{diag}(\omega_{1},\ldots,\omega_{p})$, where $\omega _{i}$ ($i=1,\ldots,p$) are the eigenvalues of $D(\beta)$. Then there exists an orthogonal matrix Q such that $Q^{\tau}\widetilde{D}Q=D(\beta)$. Therefore, we have

$$\hat{l}(\beta)= \Biggl\{ \frac{1}{\sqrt{n}}Q\Sigma^{-\frac{1}{2}}(\beta )\sum ^{n}_{i=1}\widehat{W}_{i}(\beta) \Biggr\} ^{\tau}\widetilde{D}(\beta ) \Biggl\{ \frac{1}{\sqrt{n}}Q \Sigma^{-\frac{1}{2}}(\beta)\sum^{n}_{i=1} \widehat{W}_{i}(\beta) \Biggr\} +o_{p}(1).$$
(3.16)

From Lemma 3.2, we have

$$\frac{1}{\sqrt{n}}Q\Sigma^{-\frac{1}{2}}(\beta)\sum ^{n}_{i=1}\widehat {W}_{i}(\beta) \stackrel{L}{\longrightarrow}N(0,I_{p}).$$
(3.17)

By (3.16) and (3.17), we finish the proof of Theorem 2.2. □
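In practice the multiplier λ in (3.10)-(3.14) must be computed numerically. A minimal sketch, with synthetic auxiliary vectors standing in for the $\widehat{W}_{i}(\beta)$ (which in the real procedure come from the repaired data), starts from the explicit approximation (3.14) and refines it by Newton's method on the estimating equation for λ:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 2
# Synthetic stand-ins for the auxiliary vectors W-hat_i(beta).
W = rng.standard_normal((n, p))

def score(lam):
    """(1/n) sum_i W_i / (1 + lam' W_i); zero at the EL multiplier."""
    d = 1.0 + W @ lam
    return (W / d[:, None]).mean(axis=0)

# Starting value from the expansion (3.14): (sum W W')^{-1} (sum W).
lam = np.linalg.solve(W.T @ W, W.sum(axis=0))

# Newton refinement of score(lam) = 0.
for _ in range(25):
    d = 1.0 + W @ lam
    J = -(W[:, :, None] * W[:, None, :] / (d ** 2)[:, None, None]).mean(axis=0)
    lam -= np.linalg.solve(J, score(lam))

print(np.abs(score(lam)).max())  # essentially zero at convergence
```

The one-step value from (3.14) is already $o_{p}(n^{-\frac{1}{2}})$-close to the root, so a few Newton steps suffice.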

### Proof of Theorem 2.3

Recalling the definition of $\hat {l}_{\mathrm{ad}}(\beta)$ and combining it with (3.16), we can infer

$$\hat{l}_{\mathrm{ad}}(\beta)= \Biggl\{ \frac{1}{\sqrt{n}}\sum ^{n}_{i=1}\widehat {W}_{i}(\beta) \Biggr\} ^{\tau}\widehat{\Sigma}^{-1}(\beta) \Biggl\{ \frac {1}{\sqrt{n}} \sum^{n}_{i=1}\widehat{W}_{i}( \beta) \Biggr\} +o_{p}(1).$$

Combining this with the above argument, and arguing as in the proof of Lemma 3.2, it is easily shown that $\widehat{\Sigma}(\beta)\stackrel {P}{\longrightarrow}\Sigma(\beta)$.

By Lemmas 3.2, 3.4, and (3.17), we can prove

$$\hat{l}_{\mathrm{ad}}(\beta)\stackrel{L}{\longrightarrow}\chi^{2}_{p}.$$

□

## Conclusions

In this paper, with the help of validation data, the empirical likelihood method is extended to nonlinear error-in-variables regression models with randomly censored responses. We construct an estimated empirical log-likelihood function $\hat{l}(\beta)$ of the unknown parameter of interest β and derive the asymptotic distribution of $\hat{l}(\beta)$. By estimating the weights $\omega_{i}$ and using the Monte Carlo simulation method, we construct a confidence region for the parameter β. To avoid estimating $\omega_{i}$, the adjusted empirical log-likelihood $\hat {l}_{\mathrm{ad}}(\beta)$ has been defined and its asymptotic distribution is obtained. Using this result, we can construct a more accurate confidence region for the parameter β. The theoretical analysis shows that the empirical likelihood method can be applied to nonlinear models with complex incomplete data and yields better results.
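The weight-estimation step mentioned above can be sketched as follows: given estimated weights (the values below are hypothetical; in practice they are the eigenvalues of the estimated matrix $D(\beta)$), the critical value of the weighted chi-square limit is approximated by Monte Carlo, and $\hat{l}(\beta)\leq c_{0.95}$ defines the confidence region.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical estimated weights omega-hat_i.
omega_hat = np.array([1.4, 0.8, 0.3])

# Monte Carlo approximation of the 95% quantile of sum_i omega_i chi^2_{1,i}.
draws = (rng.standard_normal((100_000, omega_hat.size)) ** 2) @ omega_hat
c95 = np.quantile(draws, 0.95)
print(c95)
```

The adjusted statistic $\hat{l}_{\mathrm{ad}}(\beta)$ avoids this step entirely, since its limit is the standard $\chi^{2}_{p}$ with known quantiles.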

## References

1. Sepanski, JH, Lee, LF: Semiparametric estimation of nonlinear error-in-variables models with validation study. J. Nonparametr. Stat. 4, 365-394 (1995)
2. Wang, QH: Estimation of linear error-in-covariables model with validation data under random censorship. J. Multivar. Anal. 74, 245-266 (1999)
3. Wang, QH, Rao, JNK: Empirical likelihood-based inference in linear error-in-covariables models with validation data. Biometrika 89(2), 345-358 (2002)
4. Wang, QH: Survival Data Analysis. Science Press, Beijing (2006)
5. Xue, L: Empirical likelihood inference in nonlinear semiparametric EV models with validation data. Acta Math. Sin. 49(1), 145-154 (2006)
6. Fang, L, Hu, F: Empirical likelihood dimension reduction inference in nonlinear EV models with validation data. J. Math. 32(1), 113-120 (2012)
7. Buckley, J, James, I: Linear regression with censored data. Biometrika 66, 429-436 (1979)
8. Li, G, Wang, QH: Empirical likelihood regression analysis for right censored data. Stat. Sin. 13(1), 51-68 (2003)
9. Cheng, F, Li, G, Feng, S, Xue, L: Empirical likelihood inference for nonlinear regression model with right censored data. Acta Math. Appl. Sin. 33(1), 130-140 (2010)
10. Rao, JNK, Scott, AJ: The analysis of categorical data from complex sample surveys: chi-squared tests for goodness of fit and independence in two-way tables. J. Am. Stat. Assoc. 76, 221-230 (1981)
11. Lai, TL, Ying, ZL, Zheng, ZK: Asymptotic normality of a class of adaptive statistics with applications to synthetic data methods for censored regression. J. Multivar. Anal. 52, 259-279 (1995)

## Acknowledgements

The authors are grateful to the referee for carefully reading the manuscript and for offering comments which enabled them to improve the paper. The research of Y Wu was supported by the Humanities and Social Sciences Foundation for the Youth Scholars of the Ministry of Education of China (12YJCZH217) and the Key NSF of the Anhui Educational Committee (KJ2014A255). The research of L Fang was supported by the Anhui provincial natural science research project of university foundation (KJ2013B304) and the natural science research project of the Tongling University foundation (2012tlxy13).

## Author information

Correspondence to Yongfeng Wu.

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
