• Research
• Open Access

# Empirical likelihood-based inference in Poisson autoregressive model with conditional moment restrictions

Journal of Inequalities and Applications 2015, 2015:218

https://doi.org/10.1186/s13660-015-0725-1

• Received: 8 November 2014
• Accepted: 2 June 2015
• Published:

## Abstract

In this paper, we employ the empirical likelihood method to estimate the unknown parameters in the Poisson autoregressive model in the presence of auxiliary information. It is shown that the proposed approach yields more efficient estimators than the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator. Simulation studies are also conducted to investigate the finite sample performance of the proposed method.

## Keywords

• conditional moment restriction
• empirical likelihood
• least squares estimator
• asymptotic distribution
• Poisson autoregressive model

• 62M10
• 62M05

## 1 Introduction

Integer-valued time series occur in many different situations. They arise, for example, as the number of births at a hospital in successive months, the number of bases in DNA sequences, the number of road accidents, or the number of cases of a disease in a certain area in successive months. Therefore, in recent years, there has been growing interest in studying integer-valued time series. In order to model the number of cases of campylobacterosis infections from January 1990 to the end of October 2000 in the north of the Province of Québec, Ferland et al. proposed the following Poisson autoregressive model:
$$\left \{ \textstyle\begin{array}{@{}l} X_{t}|\mathscr{F}_{t-1}: \mathcal{P}(\lambda_{t}); \quad \forall t\in\mathbb{Z},\\ \lambda_{t}=\alpha_{0}+\sum_{i=1}^{p}\alpha_{i}X_{t-i}, \end{array}\displaystyle \right .$$
(1.1)
where $$\mathscr{F}_{t-1}$$ is the σ field generated by $$(X_{t-1},X_{t-2},\ldots)$$, $$\alpha_{0}>0$$, $$\alpha_{i}\geq0$$ ($$i=1, 2, \ldots , p$$), and $$\alpha=(\alpha_{0}, \alpha_{1}, \ldots, \alpha_{p})$$ is an unknown parameter vector.
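For concreteness, model (1.1) can be simulated directly from its recursive definition. The sketch below is only an illustration (the function name `simulate_poisson_ar`, the burn-in length, and the parameter values are ours, not part of the paper): each $$X_{t}$$ is drawn from a Poisson distribution whose intensity $$\lambda_{t}$$ is built from the previous p observations.

```python
import numpy as np

def simulate_poisson_ar(alpha, n, burn_in=200, seed=0):
    """Simulate model (1.1): X_t | F_{t-1} ~ Poisson(lambda_t) with
    lambda_t = alpha[0] + alpha[1] X_{t-1} + ... + alpha[p] X_{t-p}."""
    rng = np.random.default_rng(seed)
    p = len(alpha) - 1
    x = np.zeros(n + burn_in, dtype=np.int64)
    for t in range(p, n + burn_in):
        lam = alpha[0] + sum(alpha[i] * x[t - i] for i in range(1, p + 1))
        x[t] = rng.poisson(lam)
    return x[burn_in:]  # discard burn-in so the series is close to stationarity

# Example with p = 1 and alpha = (1, 0.4)
x = simulate_poisson_ar([1.0, 0.4], n=500)
```

Under (A1) the chain is stationary and ergodic, so after the burn-in the sample mean should be close to $$\alpha_{0}/(1-\sum_{i=1}^{p}\alpha_{i})$$.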
For model (1.1), Zhu and Wang gave conditions for ergodicity and a necessary and sufficient condition for the existence of moments. They also established the asymptotics for the maximum likelihood estimator and the least squares estimator. The problem of interest here is to estimate the unknown parameter in model (1.1) by the empirical likelihood method when auxiliary information is available. In practice, some auxiliary information can often be obtained, such as that the unknown distribution is symmetric or that the population variance is a function of the mean. By making full use of the auxiliary information, we can increase the precision of statistical inference. Here, we assume that the auxiliary information can be represented as the following conditional moment restrictions:
$$E \bigl(g(X_{t},\ldots,X_{t-p}; \theta_{0})|X(t-1) \bigr)=0$$
(1.2)
for each $$t=0, 1, 2,\ldots$$ , where the unknown parameter vector $$\theta _{0}\in R^{d}$$, $$X(t-1)=(X_{t-1}, \ldots, X_{t-p})$$ and $$g(x;\theta)\in R^{r}$$ is some function with $$r\geq d$$. In order to simplify the notation, we further denote $$g(X_{t},\ldots ,X_{t-p};\theta)$$ by $$g_{t}(\theta)$$. We note here that θ can be different from α, and the notation $$g_{t}(\theta)$$ covers a broad class of information that can be formulated from knowledge of the probability distribution of $$X_{t}$$, e.g., moments and their generalizations. By using the conditional moment restrictions, as expected, we can increase the efficiency of the resulting estimator.

The empirical likelihood (EL) method was introduced by Owen [11, 12] as an alternative to the bootstrap for constructing confidence regions. The method defines an EL ratio function to construct confidence regions. An important feature of the empirical likelihood method is that the shape and orientation of the confidence region are determined automatically by the data. These attractive properties have motivated various authors to extend the empirical likelihood methodology to other situations. To use auxiliary information, some statisticians have also developed statistical inference methods within the framework of the empirical likelihood method. In this paper, we further extend these methods to the statistical inference of time series models. Specifically, based on the empirical likelihood method, we consider the parameter estimation problem for the Poisson autoregressive model with conditional moment restrictions. Our approach yields more efficient estimates than the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator, which do not utilize the conditional moment restrictions. A comparison based on mean square errors is also made by simulation. Our simulations indicate that the use of auxiliary information provides improved inferences.

The rest of this paper is organized as follows. In Section 2, we introduce the methodology and the main results. Simulation results are reported in Section 3. Section 4 provides the proofs of the main results.

The symbols ‘$$\stackrel{d}{\longrightarrow}$$’ and ‘$$\stackrel{p}{\longrightarrow}$$’ denote convergence in distribution and convergence in probability, respectively. Convergence ‘almost surely’ is written as ‘a.s.’. Furthermore, ‘$$M^{\tau}_{k\times p}$$’ denotes the transpose of the $$k\times p$$ matrix $$M_{k\times p}$$, $$A\otimes B$$ denotes the Kronecker product of matrices A and B, and $$\|\cdot\|$$ denotes the Euclidean norm of a matrix or vector.

## 2 Methodology and main results

In this section, we will first discuss how to apply the empirical likelihood method [16, 17] to estimate the unknown parameter α when auxiliary information is available.

Before we state our main results, the following assumptions will be made:
(A1):

The parametric space ϒ is compact with $$\Upsilon=\{\alpha: \delta\leq\alpha_{0}\leq M, 0\leq \alpha_{1}+\cdots +\alpha_{p}\leq M^{\ast}<1, \alpha_{i}\geq0, i=1, 2, \ldots, p \}$$, where δ and M are finite positive constants, and the true parameter value $$\alpha^{0}$$ is an interior point in ϒ.

### Remark 1

It is shown by Corollary 1 in Ferland et al. and Theorems 1 and 2 in Zhu and Wang that, under (A1), $$\{X_{t}, t\geq1\}$$ is stationary and ergodic, and $$E(X_{t}^{m})<\infty$$ for any fixed positive integer m.

(A2):

There exists $$\theta_{0}$$ such that $$E(g_{t}(\theta_{0}))=0$$, the matrix $$\Sigma(\theta)=E(g_{t}(\theta) g_{t}^{\tau}(\theta))$$ is positive definite at $$\theta_{0}$$, $$\partial g(x;\theta) /\partial\theta$$ is continuous in a neighborhood of the true value $$\theta_{0}$$, $$\| \partial g(x;\theta) /\partial\theta\|$$ and $$\|g(x;\theta)\|^{3}$$ are bounded by some integrable function $$\tilde{W}(x)$$ in this neighborhood, and the rank of $$E(\partial g_{t}(\theta) /\partial\theta)$$ is d.

First, the conditional moment restrictions in (1.2) imply that $$E(g_{t}(\theta_{0}))=0$$. Further, by using the empirical likelihood method, we can obtain data-adaptive weights $$\omega_{t}$$ through
\begin{aligned} L(\theta) =\sup \Biggl\{ \prod_{t=1}^{n} \omega_{t}: \omega_{t}\geq0, \sum _{t=1}^{n}\omega_{t}=1, \sum _{t=1}^{n}\omega_{t}g_{t}(\theta _{0})=0 \Biggr\} , \end{aligned}
where $$\theta_{0}$$ is an unknown parameter. Combining the auxiliary information with the least squares method, we propose to estimate α by
$$\hat{\alpha}=\arg\min_{\alpha}\sum _{t=1}^{n}\omega _{t} \bigl(X_{t}-Z_{t}^{\tau}\alpha \bigr)^{2},$$
(2.1)
where $$Z_{t}^{\tau}=(1, X_{t-1}, \ldots, X_{t-p})$$. By introducing a Lagrange multiplier $$\lambda\in R^{r}$$, standard derivations in the empirical likelihood lead to
$$\omega_{t}(\theta_{0})=\frac{1}{n} \frac{1}{1+\lambda_{\theta_{0}}^{\tau}g_{t}(\theta_{0})},$$
(2.2)
where $$\lambda_{\theta_{0}}$$ satisfies
$$\frac{1}{n}{}\sum_{t=1}^{n} \frac{g_{t}(\theta_{0})}{1+\lambda_{\theta _{0}}^{\tau}g_{t}(\theta_{0})}=0.$$
(2.3)
Utilizing the weights given by (2.2), we obtain the estimate of α:
$$\hat{\alpha}= \Biggl(\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t} Z_{t}^{\tau}\Biggr)^{-1} \sum _{t=1}^{n} \omega_{t}({ \theta_{0}})X_{t}Z_{t}.$$
(2.4)
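Computationally, (2.2)-(2.4) amount to a small Newton solve for the Lagrange multiplier in (2.3) followed by a weighted least squares step. The sketch below is a minimal numerical illustration, not the authors' code: the function names are ours, the step-halving safeguard is a standard numerical addition the derivation itself does not need, and the $$g_{t}(\theta_{0})$$ are assumed to have been stacked into an $$n\times r$$ array.

```python
import numpy as np

def el_lambda(G, iters=100, tol=1e-10):
    """Newton solve of (2.3): (1/n) sum_t g_t / (1 + lam' g_t) = 0.

    G is an (n, r) array whose t-th row is g_t(theta_0).  Step-halving
    keeps every 1 + lam' g_t strictly positive so that the weights in
    (2.2) remain valid probabilities.
    """
    lam = np.zeros(G.shape[1])
    for _ in range(iters):
        denom = 1.0 + G @ lam
        f = (G / denom[:, None]).mean(axis=0)   # left-hand side of (2.3)
        J = -(G.T * denom**-2) @ G / len(G)     # its Jacobian in lam
        step = np.linalg.solve(J, -f)
        while ((1.0 + G @ (lam + step)) <= 0).any():
            step *= 0.5                          # damp an infeasible step
        lam = lam + step
        if np.linalg.norm(step) < tol:
            break
    return lam

def el_weights(G):
    """Data-adaptive weights (2.2): w_t = 1 / (n (1 + lam' g_t))."""
    return 1.0 / (len(G) * (1.0 + G @ el_lambda(G)))

def el_weighted_ls(Z, X, w):
    """Estimator (2.4): (sum_t w_t Z_t Z_t')^{-1} sum_t w_t X_t Z_t."""
    return np.linalg.solve((Z * w[:, None]).T @ Z, (Z * w[:, None]).T @ X)
```

By construction the resulting weights are positive, sum to one, and satisfy $$\sum_{t}\omega_{t}g_{t}(\theta_{0})=0$$, which is exactly the constraint set appearing in the definition of $$L(\theta)$$.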

In the following, we will give the asymptotic properties of $$\hat{\alpha}$$.

### Theorem 2.1

Assume that (A1) and (A2) hold. If $$\alpha^{0}$$ is the true value of α, then
\begin{aligned} \sqrt{n} \bigl(\hat{\alpha}-\alpha^{0} \bigr)\stackrel{d}{ \longrightarrow }N \bigl(0, W^{-1} \bigl(\Lambda-\Lambda_{12} \Sigma^{-1}(\theta_{0})\Lambda^{\tau}_{12} \bigr)W^{-1} \bigr), \end{aligned}
where $$W=E(Z_{t} Z_{t}^{\tau})$$, $$\Lambda=E(Z_{t} Z_{t}^{\tau}(X_{t}-Z_{t}^{\tau}\alpha ^{0})^{2})$$, and $$\Lambda_{12}=E(Z_{t}g_{t}^{\tau}(\theta_{0})(X_{t}-Z_{t}^{\tau}\alpha^{0}))$$.

Zhu and Wang proved that the asymptotic variance of the ordinary least squares estimator $$\bar{\alpha}=(\sum_{t=1}^{n}Z_{t} Z_{t}^{\tau})^{-1}\sum_{t=1}^{n} X_{t}Z_{t}$$ is $$W^{-1}\Lambda W^{-1}$$. Note that Λ and $$\Sigma^{-1}(\theta_{0})$$ are both positive definite matrices. Therefore, by Theorem 2.1, incorporating the auxiliary information $$E(g_{t}(\theta_{0})|X(t-1))=0$$ induces a variance reduction quantified by $$W^{-1}\Lambda_{12}\Sigma ^{-1}(\theta_{0})\Lambda^{\tau}_{12}W^{-1}$$.

To apply the proposed estimator (2.4), we need to further estimate the unknown parameter θ. We consider $$\hat{\theta}=\arg \max_{\theta}L(\theta)$$. Following the results in Qin and Lawless, the corresponding weights $$\{\omega_{t}\}_{t=1}^{n}$$ satisfy
$$\omega_{t}(\hat{\theta})=\frac{1}{n} \frac{1}{1+\lambda_{\hat{\theta }}^{\tau}g_{t}(\hat{\theta})},$$
(2.5)
where $$\lambda_{\hat{\theta}}$$ is the solution to
$$\frac{1}{n}\sum_{t=1}^{n} \frac{g_{t}(\hat{\theta})}{1+\lambda_{\hat {\theta}}^{\tau}g_{t}(\hat{\theta})}=0$$
and $$(\lambda_{\hat{\theta}},\hat{\theta})$$ solves
$$\frac{1}{n}\sum_{t=1}^{n} \frac{\frac{\partial g_{t}(\hat{\theta })}{\partial\theta^{\tau}}\lambda_{\hat{\theta}}}{1+\lambda_{\hat{\theta }}^{\tau}g_{t}(\hat{\theta})}=0.$$
Let
\begin{aligned} \hat{\alpha}^{1}=\arg\min_{\alpha}\sum _{t=1}^{n}\omega_{t}(\hat { \theta}) \bigl(X_{t}-Z_{t}^{\tau}\alpha \bigr)^{2}, \end{aligned}
(2.6)
where $$\{\omega_{t}(\hat{\theta})\}_{t=1}^{n}$$ is identified by (2.5). When $$r=d$$, we know that $$\omega_{t}(\hat{\theta})=\frac{1}{n}$$ and hence $$\hat{\alpha}^{1}$$ is the ordinary least squares estimator. When $$r>d$$, $$\omega_{t}(\hat{\theta})$$ is no longer equal to $$\frac{1}{n}$$ and we shall show that this scheme provides an efficiency gain over the conventional least squares estimator.
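The just-identified case $$r=d$$ can be checked numerically: when $$\hat{\theta}$$ solves the sample moment equations exactly, $$\lambda_{\hat{\theta}}=0$$ and every weight in (2.5) collapses to $$\frac{1}{n}$$. A minimal illustration, using a hypothetical scalar restriction $$g_{t}(\theta)=X_{t}-\theta$$ (so $$r=d=1$$ and $$\hat{\theta}$$ is the sample mean; the data here are artificial):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(2.0, size=100)

theta_hat = x.mean()        # solves (1/n) sum_t g_t(theta) = 0 exactly
g = x - theta_hat           # g_t(theta_hat); its sample mean is zero
lam = 0.0                   # hence the Lagrange multiplier equation holds with lambda = 0
w = 1.0 / (len(x) * (1.0 + lam * g))   # the weights (2.5) reduce to 1/n
```

With these uniform weights, the weighted least squares criterion (2.6) is exactly the ordinary one, which is the point of the remark above: an efficiency gain is only possible when $$r>d$$.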

In order to study the estimator (2.6), we define $$\Gamma(\theta _{0})=E(\frac{\partial g_{t}(\theta_{0})}{\partial\theta})$$, $$\Omega(\theta _{0})=(\Gamma(\theta_{0})\Sigma^{-1}(\theta_{0}) \Gamma^{\tau}(\theta_{0}) )^{-1}$$ and $$B=\Sigma^{-1}(\theta_{0})(I-\Gamma(\theta_{0})\Omega(\theta_{0})\Gamma ^{\tau}(\theta_{0})\Sigma^{-1}(\theta_{0}) )$$, where I is the identity matrix. The limiting distribution of $$\hat{\alpha}^{1}$$ is given in the following theorem.

### Theorem 2.2

Assume that (A1) and (A2) hold. If $$\alpha^{0}$$ is the true value of α, then
\begin{aligned} \sqrt{n} \bigl(\hat{\alpha}^{1}-\alpha^{0} \bigr)\stackrel{d}{\longrightarrow}N \bigl(0, W^{-1} \bigl(\Lambda- \Lambda_{12}B\Lambda^{\tau}_{12} \bigr)W^{-1} \bigr). \end{aligned}
(2.7)

The matrix B is non-negative definite. Hence the asymptotic variance of $$\hat{\alpha}^{1}$$ is no greater than that of the least squares estimator. When B is positive definite, a strict variance reduction is attained. This implies that incorporating additional auxiliary information can improve on the least squares estimator.

## 3 Simulation study

In this section we conduct simulation studies to examine the finite sample performance of the proposed method. Consider the following first-order Poisson autoregressive model:
$$\left \{ \textstyle\begin{array}{@{}l} X_{t}|\mathscr{F}_{t-1}: \mathcal{P}(\lambda_{t}); \quad \forall t\in\mathbb{Z},\\ \lambda_{t}=\alpha_{0}+ \alpha_{1}X_{t-1}. \end{array}\displaystyle \right .$$
(3.1)
In order to compare the performance of the estimator (denoted by ALS) given by (2.6) with those of the ordinary least squares estimator (LS), the weighted least squares estimator (WLS), and the maximum likelihood estimator (MLE), we compute the mean square errors of the four estimators. We use the vector $$g_{t}(\alpha_{0}, \alpha_{1})=(1, X_{t-1}, X^{2}_{t-1})^{\tau}(X_{t}-\alpha_{0}-\alpha_{1} X_{t-1})$$ as the conditional moment restriction in (1.2). Specifically, for a particular pair $$(\alpha_{0}, \alpha_{1})^{\tau}$$, we generate realizations from (3.1) with $$n=100, 300\mbox{ and }1{,}000$$. Further, based on 1,000 repetitions, we compute the mean square errors of the above four estimators. The simulation results for $$\alpha_{0}=1$$ are summarized in Table 1. Table 2 presents the simulation results for $$\alpha_{0}=2$$.
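A reduced version of this experiment can be scripted as follows. This is only a sketch of the LS versus ALS comparison, not the authors' exact simulation code: for simplicity it plugs the LS estimate into $$g_{t}$$ in place of the EL maximizer $$\hat{\theta}$$, uses 100 repetitions with $$n=300$$ and $$(\alpha_{0},\alpha_{1})=(1,0.4)$$, and omits the WLS and MLE columns. All function names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(alpha0, alpha1, n, burn=200):
    # Model (3.1): X_t | F_{t-1} ~ Poisson(alpha0 + alpha1 * X_{t-1})
    x = np.zeros(n + burn, dtype=np.int64)
    for t in range(1, n + burn):
        x[t] = rng.poisson(alpha0 + alpha1 * x[t - 1])
    return x[burn:]

def el_lambda(G, iters=100, tol=1e-10):
    # Newton solve of the multiplier equation (2.3), with step-halving
    # so that every 1 + lam' g_t stays strictly positive.
    lam = np.zeros(G.shape[1])
    for _ in range(iters):
        denom = 1.0 + G @ lam
        f = (G / denom[:, None]).mean(axis=0)
        J = -(G.T * denom**-2) @ G / len(G)
        step = np.linalg.solve(J, -f)
        while ((1.0 + G @ (lam + step)) <= 0).any():
            step *= 0.5
        lam = lam + step
        if np.linalg.norm(step) < tol:
            break
    return lam

true = np.array([1.0, 0.4])
reps, n = 100, 300
mse_ls = mse_als = 0.0
for _ in range(reps):
    x = simulate(true[0], true[1], n)
    X, lagged = x[1:], x[:-1]
    Z = np.column_stack([np.ones(n - 1), lagged])
    ls = np.linalg.solve(Z.T @ Z, Z.T @ X)            # ordinary LS
    resid = X - Z @ ls
    # g_t = (1, X_{t-1}, X_{t-1}^2)' (X_t - a0 - a1 X_{t-1}), LS plug-in
    G = np.column_stack([np.ones(n - 1), lagged, lagged**2]) * resid[:, None]
    w = 1.0 / (len(G) * (1.0 + G @ el_lambda(G)))     # weights (2.5)
    als = np.linalg.solve((Z * w[:, None]).T @ Z, (Z * w[:, None]).T @ X)
    mse_ls += ((ls - true) ** 2).sum() / reps
    mse_als += ((als - true) ** 2).sum() / reps
```

Comparing `mse_als` with `mse_ls` over many repetitions should, by Theorem 2.2, tend to show a smaller mean square error for the ALS estimator as n grows, since here $$r=3>d=2$$.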

From Tables 1 and 2, we see that the mean square errors of the estimator (2.6) are smaller than those of the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator. This indicates that, by using the conditional moment restrictions, the estimates become more accurate, regardless of the sample size and the values of the unknown parameters.

## 4 Proofs of the main results

In order to prove Theorem 2.1, we first present several lemmas.

### Lemma 4.1

Assume that (A1) and (A2) hold. Then
$$\max_{1\leq t\leq n} \bigl\| g_{t}( \theta_{0})\bigr\| =o_{p} \bigl(n^{\frac{1}{2}} \bigr).$$
(4.1)

### Proof

Let $$\Sigma_{n}(\theta_{0})=\frac{1}{n}\sum_{t=1}^{n}(g_{t}(\theta_{0})g_{t}^{\tau}(\theta_{0}))$$ and $$g_{t}(\theta_{0})=( g_{t1}(\theta_{0}),\ldots, g_{tr}(\theta_{0}))^{\tau}$$. Note that, in order to prove (4.1), it suffices to prove that
$$\frac{1}{n}\max_{1\leq t\leq n} g^{2}_{tk}( \theta_{0})\stackrel{p}{\longrightarrow}0,\quad k=1,\ldots,r.$$
(4.2)
We denote the $$(h,l)$$th element of $$\Sigma(\theta_{0})$$ as $$\sigma_{hl}$$, $$h,l=1,2,\ldots,r$$. By the ergodic theorem, we have
$$\Sigma_{n}(\theta_{0})\stackrel{p}{ \longrightarrow}\Sigma(\theta_{0}).$$
(4.3)
For all $$1\leq k\leq r$$ and $$1\leq m\leq n$$, define the sets
$$B^{k}_{n, m}=\bigcap_{j=1}^{m}B^{k,j}_{n, m},$$
where $$B^{k,j}_{n, m}=\{\omega:|\frac{1}{n}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}(\theta _{0})-\frac{j}{m}\sigma_{kk} |\leq\frac{1}{m}\}$$, $$j=1,\ldots,m$$.
First we show that
$$\lim_{n\rightarrow\infty}P \bigl(B^{k,j}_{n, m} \bigr)=1,\quad j=1, \ldots, m.$$
(4.4)
After some algebra, we obtain
$$B^{k,j}_{n,m}= \Biggl\{ \omega:\Biggl|\frac{[n\frac{j}{m}]}{n} \frac{m}{j}\frac {1}{[n\frac{j}{m}]}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}( \theta_{0})-\sigma_{kk} \Biggr|\leq\frac{1}{m} \frac{m}{j} \Biggr\} .$$
Observe that
$$\frac{[n\frac{j}{m}]}{n}\frac{m}{j}\rightarrow1,\quad n \rightarrow \infty.$$
(4.5)
Furthermore, by (4.3), we have
$$\frac{1}{[n\frac{j}{m}]}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}( \theta _{0})\stackrel{p}{\longrightarrow}\sigma_{kk}.$$
(4.6)
Using (4.5) and (4.6), we complete the proof of (4.4).
Next we prove that
$$\lim_{n\rightarrow\infty}P \bigl(B^{k}_{n, m} \bigr)=1,\quad m=1, 2,\ldots.$$
(4.7)
After some calculation, we obtain
\begin{aligned} P \bigl(B^{k}_{n, m} \bigr) =&P \Biggl( \bigcap _{j=1}^{m}B^{k,j}_{n, m} \Biggr) \\ =&P \Biggl(\bigcap_{j=1}^{m-1}B^{k,j}_{n, m} \Biggr)+P \bigl(B^{k,m}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-1}B^{k,j}_{n, m} \Biggr)\cup B^{k,m}_{n, m} \Biggr) \\ =&\cdots \\ =&P \bigl(B^{k,1}_{n, m} \bigr)+P \bigl(B^{k,2}_{n, m} \bigr) -P \bigl(B^{k,1}_{n, m}\cup B^{k,2}_{n, m} \bigr)+\cdots \\ &{}+P \bigl(B^{k,{m-1}}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-2}B^{k,j}_{n, m} \Biggr)\cup B^{k,{m-1}}_{n, m} \Biggr) \\ &{}+P \bigl(B^{k,{m}}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-1}B^{k,j}_{n, m} \Biggr) \cup B^{k,{m}}_{n, m} \Biggr). \end{aligned}
(4.8)
For all $$2\leq i\leq m$$, (4.4) implies that
$$\lim_{n\rightarrow\infty}P \Biggl( \Biggl(\bigcap _{j=1}^{i-1}B^{k,j}_{n, m} \Biggr)\cup B^{k,{i}}_{n, m} \Biggr)=1.$$
Thus, again by (4.4), (4.7) can be proved.

Finally we prove (4.2).

Note that
$$\frac{1}{n}\max_{1\leq t\leq n} g^{2}_{tk}( \theta_{0})\leq\frac{1}{n}\sup_{s\in[0,1]}\sum _{t=[ns]+1}^{[n(s+\frac{1}{m})]}g^{2}_{tk}( \theta_{0}).$$
For a given $$s\in[0,1]$$, choose $$j\in\{1,\ldots,m\}$$ so that $$s\in[\frac{j-1}{m},\frac{j}{m}]$$. Then, for each $$s\in[0,1]$$, $$\omega\in B^{k}_{n, m}$$ implies
\begin{aligned} \frac{1}{n}\sum_{t=[ns]+1}^{[n(s+\frac{1}{m})]}g^{2}_{tk}( \theta _{0}) \leq&\frac{1}{n}\sum_{t=[n(j-1)/m]+1}^{[n(j+1)/m]}g^{2}_{tk}( \theta_{0}) \\ =& \Biggl(\frac{1}{n}\sum_{t=1}^{[n(j+1)/m]}g^{2}_{tk}( \theta _{0})-\frac{j+1}{m}\sigma_{kk} \Biggr) \\ &{}- \Biggl(\frac{1}{n}\sum_{t=1}^{[n(j-1)/m]}g^{2}_{tk}( \theta _{0})-\frac{j-1}{m}\sigma_{kk} \Biggr)+ \frac{2}{m}\sigma_{kk} \\ \leq&\frac{2}{m}+\frac{2}{m}\sigma_{kk} \\ =&(1+\sigma_{kk})\frac{2}{m}. \end{aligned}
(4.9)
Therefore, for all $$m\geq1$$,
$$\lim_{n\rightarrow\infty} P \biggl\{ \frac{1}{n}\max _{1\leq t\leq n} g^{2}_{tk}(\theta_{0})\leq \frac{2}{m}(1+\sigma_{kk}) \biggr\} \geq \lim_{n\rightarrow\infty}P \bigl(B^{k}_{n, m} \bigr)=1,$$
showing (4.2). The proof of Lemma 4.1 is thus completed. □

### Lemma 4.2

Assume that (A1) and (A2) hold. Then
$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \bigl( Z^{\tau}_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr), g_{t}^{\tau}( \theta_{0}) \bigr)^{\tau}\stackrel{d}{\longrightarrow}N(0, M),$$
where
$$M=\left ( \textstyle\begin{array}{@{}c@{\quad}c@{}} \Lambda& \Lambda_{12}\\ \Lambda^{\tau}_{12} & \Sigma \end{array}\displaystyle \right ).$$
(4.10)

### Proof

By the Cramer-Wold device, it suffices to show that, for all $$c\in R^{(p+1+r)}\setminus(0,\ldots,0)$$,
$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n}c^{\tau}\bigl( Z^{\tau}_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr), g_{t}^{\tau}( \theta_{0}) \bigr)^{\tau}\stackrel {d}{\longrightarrow}N \bigl(0,c^{\tau}Mc \bigr).$$
For simplicity, we write $$c^{\tau}( Z^{\tau}_{t}(X_{t}-Z^{\tau}_{t}\alpha^{0}), g_{t}^{\tau}(\theta_{0}) )^{\tau}$$ as $$G_{t,c}(\theta_{0})$$. Further, let $$\xi_{nt}=\frac{1}{\sqrt {n}}G_{t,c}(\theta_{0})$$ and $$\mathscr{F}_{nt}=\sigma(\xi_{ns}, 1\leq s\leq t)$$. Then $$\{\sum_{t=1}^{n}\xi_{nt},\mathscr{F}_{nt}, 1\leq t\leq n, n\geq1\}$$ is a zero-mean, square integrable martingale array. By a martingale central limit theorem, it suffices to show that
\begin{aligned}& \max_{1\leq t\leq n}|\xi_{nt}|\stackrel{p}{ \longrightarrow}0, \end{aligned}
(4.11)
\begin{aligned}& \sum_{t=1}^{n} \xi^{2}_{nt} \stackrel{p}{\longrightarrow}c^{\tau}Mc, \end{aligned}
(4.12)
\begin{aligned}& E \Bigl(\max_{1\leq t\leq n}\xi^{2}_{nt} \Bigr) \mbox{ is bounded in } n, \end{aligned}
(4.13)
and the fields are nested:
$$\mathscr{F}_{nt}\subseteq\mathscr{F}_{(n+1)t} \quad\mbox{for } 1\leq t\leq n, n\geq1.$$
(4.14)
Note that (4.14) is obvious. In what follows, we first consider (4.11). By a simple calculation, we have, for all $$\varepsilon>0$$,
\begin{aligned} &P \Bigl\{ \max_{1\leq t\leq n}| \xi_{nt}|> \varepsilon \Bigr\} \\ &\quad\leq\sum_{t=1}^{n}P\bigl\{ | \xi_{nt}|>\varepsilon\bigr\} \\ &\quad= \sum_{t=1}^{n}P \biggl\{ \biggl| \frac{1}{\sqrt{n}}G_{t,c}(\theta _{0})\biggr|>\varepsilon \biggr\} \\ &\quad=n P \bigl\{ \bigl|G_{t,c}(\theta_{0})\bigr|>\sqrt{n} \varepsilon \bigr\} \\ &\quad=n\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt{n}\varepsilon \bigr)\, d P \\ &\quad\leq n\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt{n}\varepsilon \bigr)\frac{(G_{t,c}(\theta_{0}))^{2}}{ (\sqrt{n}\varepsilon)^{2}}\,d P \\ &\quad=\frac{1}{\varepsilon^{2}}\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt {n}\varepsilon \bigr) \bigl(G_{t,c}( \theta_{0}) \bigr)^{2}\,d P. \end{aligned}
(4.15)
Now, since $$E(G_{t,c}(\theta_{0}))^{2}<\infty$$, the Lebesgue dominated convergence theorem shows that (4.15) converges to 0 as $$n\rightarrow\infty$$. This settles (4.11).
Next consider (4.12). By the ergodic theorem, we have
\begin{aligned} \sum_{t=1}^{n} \xi^{2}_{nt} =& \sum_{t=1}^{n} \biggl( \frac{1}{\sqrt{n}}G_{t,c}(\theta_{0}) \biggr)^{2} \\ \stackrel{\mathrm{a.s.}}{\longrightarrow}&E \bigl(G_{t,c}( \theta_{0}) \bigr)^{2} \\ =&c^{\tau}M c. \end{aligned}
Hence (4.12) is proved.
Finally, consider (4.13). Note that $$\{( \frac{1}{\sqrt{n}}G_{t,c}(\theta_{0}) )^{2},t\geq1\}$$ is a stationary sequence. Then we have
\begin{aligned} E \Bigl(\max_{1\leq t\leq n}\xi^{2}_{nt} \Bigr) =&E \biggl(\max_{1\leq t\leq n} \biggl( \frac{1}{\sqrt{n}}G_{t,c}( \theta_{0}) \biggr)^{2} \biggr) \\ \leq&\frac{1}{n}E \Biggl( \sum_{t=1}^{n} \bigl(G_{t,c}(\theta _{0}) \bigr)^{2} \Biggr) \\ =&\frac{1}{n}\sum_{t=1}^{n}E \bigl(G_{t,c}(\theta_{0}) \bigr)^{2} \\ =&c^{\tau}Mc. \end{aligned}
(4.16)
This proves (4.13), and the proof of Lemma 4.2 is complete. □

### Lemma 4.3

Assume that (A1) and (A2) hold. Then
$$\lambda_{\theta_{0}}=O_{p} \bigl(n^{-\frac{1}{2}} \bigr).$$

### Proof

Let $$\lambda_{\theta_{0}}=\zeta\beta_{0}$$, where $$\beta_{0}$$ is a unit vector with $$\|\beta_{0}\|=1$$ and $$\zeta=\|\lambda_{\theta_{0}}\|$$. Then (2.3) implies that
\begin{aligned} 0 =&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} \frac{g_{t}(\theta _{0})}{1+\zeta\beta^{\tau}_{0}g_{t}(\theta_{0})} \\ =&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n}g_{t}( \theta_{0})-\frac {\zeta}{n}\sum_{t=1}^{n} \frac{(\beta^{\tau}_{0}g_{t}(\theta _{0}))^{2}}{1+\zeta\beta^{\tau}_{0}g_{t}(\theta_{0})} \\ \leq&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n}g_{t}( \theta _{0})-\frac{\zeta}{1+\zeta\max_{1\leq t \leq n}\|g_{t}(\theta_{0})\| }\beta^{\tau}_{0} \Sigma_{n}(\theta_{0})\beta_{0}. \end{aligned}
This implies that
\begin{aligned} \zeta \Biggl(\beta^{\tau}_{0}\Sigma_{n}( \theta_{0})\beta_{0}- \max_{1\leq t \leq n}\bigl\| g_{t}( \theta_{0})\bigr\| \frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0}) \Biggr) \leq\frac{\beta_{0}^{\tau}}{n}\sum _{t=1}^{n} g_{t}( \theta_{0}). \end{aligned}
(4.17)
Note that
\begin{aligned} \Biggl|\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0})\Biggr|\leq\Biggl\| \frac{1}{n} \sum_{t=1}^{n} g_{t}( \theta_{0})\Biggr\| =O_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}
(4.18)
By using Lemma 4.1 and (4.18), we can obtain
\begin{aligned} \max_{1\leq t \leq n}\bigl\| g_{t}( \theta_{0})\bigr\| \frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0})=o_{p}(1). \end{aligned}
(4.19)
By (4.3), we have
\begin{aligned} \beta_{0}^{\tau}\Sigma_{n}( \theta_{0})\beta_{0}\stackrel{p}{\longrightarrow}\beta _{0}^{\tau}\Sigma(\theta_{0})\beta_{0}. \end{aligned}
(4.20)
By this, together with (4.17)-(4.19), we can prove Lemma 4.3. □

### Proof of Theorem 2.1

By (2.3), we have
\begin{aligned} \lambda_{\theta_{0}}= \bigl(\Sigma_{n}( \theta_{0}) \bigr)^{-1}\frac{1}{n}\sum _{t=1}^{n} g_{t}(\theta_{0})+ \bigl(\Sigma_{n}(\theta_{0}) \bigr)^{-1}R_{n}( \theta_{0}), \end{aligned}
(4.21)
where
$$R_{n}(\theta_{0})=\frac{1}{n}\sum _{t=1}^{n} g_{t}( \theta_{0})\frac {(\lambda^{\tau}_{\theta_{0}}g_{t}(\theta_{0}))^{2}}{1+\lambda^{\tau}_{\theta _{0}}g_{t}(\theta_{0})}.$$
By Lemmas 4.1-4.3, we know that
\begin{aligned} R_{n}(\theta_{0})=o_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}
(4.22)
This implies that, uniformly in t,
\begin{aligned} \omega_{t}(\theta_{0}) =&\frac{1}{n} \frac{1}{1+\lambda_{\theta _{0}}^{\tau}g_{t}(\theta_{0})} \\ =&\frac{1}{n} \bigl( 1- \lambda_{\theta_{0}}^{\tau}g_{t}(\theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr). \end{aligned}
(4.23)
Moreover, note that
$$\sqrt{n} \bigl(\hat{\alpha}-\alpha^{0} \bigr)=T_{n}^{-1}S_{n},$$
(4.24)
where $$T_{n}=\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}Z^{\tau}_{t}$$ and $$S_{n}=\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}(X_{t}-Z^{\tau}_{t}\alpha^{0})$$.
First, we consider $$T_{n}$$. By (4.23), we have
\begin{aligned} T_{n} =&\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}Z^{\tau}_{t} \\ =&\frac{1}{n}\sum_{t=1}^{n} \bigl( 1- \lambda_{\theta_{0}}^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr)Z_{t}Z^{\tau}_{t} \\ =&\frac{1}{n}\sum_{t=1}^{n}Z_{t}Z^{\tau}_{t} -\frac{1}{n}\sum_{t=1}^{n} \bigl(Z_{t}Z^{\tau}_{t} \bigr)\otimes \bigl( \bigl( \bigl(\Sigma_{n}(\theta _{0}) \bigr)^{-1}R_{n}( \theta_{0}) \bigr)^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr) \\ &{}-\frac{1}{n}\sum_{t=1}^{n} \bigl(Z_{t}Z^{\tau}_{t} \bigr)\otimes \Biggl( \Biggl( \bigl(\Sigma _{n}(\theta_{0}) \bigr)^{-1} \frac{1}{n}\sum_{t=1}^{n} g_{t}(\theta_{0}) \Biggr)^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \Biggr) \\ \doteq&U_{1}-U_{2}-U_{3}. \end{aligned}
Then by (4.3) and (4.22), conditions (A1) and (A2), Lemma 4.2, and the ergodic theorem, we can prove that
$$U_{2}=o_{p}(1).$$
(4.25)
Similarly, we can obtain
$$U_{3}\stackrel{\mathrm{a.s.}}{\longrightarrow}0.$$
(4.26)
This, together with (4.25), yields
$$T_{n}\stackrel{p}{\longrightarrow}W.$$
(4.27)
Next consider $$S_{n}$$. By (4.23), we have
\begin{aligned} S_{n} =&\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\theta _{0}})Z_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr) \\ =&\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \bigl( 1- \lambda_{\theta _{0}}^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr)Z_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr). \end{aligned}
Thus, combining with (4.21), (4.22), we can obtain
\begin{aligned} S_{n} =&\frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr) \\ &{}-\frac{1}{n}\sum_{t=1}^{n} \bigl(X_{t}-Z^{\tau}_{t}\alpha ^{0} \bigr)Z_{t}g_{t}^{\tau}(\theta_{0}) \bigl( \Sigma_{n}(\theta_{0}) \bigr)^{-1} \frac{1}{\sqrt{n}}\sum_{t=1}^{n} g_{t}(\theta_{0})+o_{p}(1). \end{aligned}
By the ergodic theorem, we have
$$\frac{1}{n}\sum_{t=1}^{n} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr)Z_{t}g_{t}^{\tau}(\theta _{0}) \stackrel{\mathrm{a.s.}}{\longrightarrow} \Lambda_{12}.$$
This, together with (4.3) and Lemma 4.2, proves that
\begin{aligned} S_{n}\stackrel{d}{\longrightarrow}N \bigl(0,\Lambda- \Lambda_{12}\Sigma ^{-1}(\theta_{0})\Lambda^{\tau}_{12} \bigr), \end{aligned}
(4.28)
which, combining with (4.27), proves Theorem 2.1. □

### Proof of Theorem 2.2

Similar to the proofs of Lemma 1 and Theorem 1 in Qin and Lawless, we can prove that
\begin{aligned} \lambda_{\hat{\theta}}=B\frac{1}{n}\sum _{t=1}^{n}g_{t}(\theta _{0})+o_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}
(4.29)
Moreover, note that
$$\sqrt{n} \bigl(\hat{\alpha}^{1}-\alpha^{0} \bigr)=\tilde{T}_{n}^{-1}\tilde{S}_{n},$$
(4.30)
where
$$\tilde{T}_{n}=\sum_{t=1}^{n} \omega_{t}({\hat{\theta}})Z_{t}Z^{\tau}_{t}$$
and
$$\tilde{S}_{n}=\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\hat{\theta }})Z_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr).$$
By an argument similar to the proof of Theorem 2.1, we can prove that
$$\tilde{T}_{n}\stackrel{p}{\longrightarrow}W$$
(4.31)
and
$$\tilde{S}_{n}\stackrel{d}{\longrightarrow}N \bigl(0, \Lambda-\Lambda_{12}B\Lambda ^{\tau}_{12} \bigr),$$
(4.32)
showing (2.7). The proof of Theorem 2.2 is thus completed. □

## Declarations

### Acknowledgements

This work is supported by National Natural Science Foundation of China (Nos. 11271155, 11001105, 11071126, 10926156, 11071269), Specialized Research Fund for the Doctoral Program of Higher Education (Nos. 20110061110003, 20090061120037), Scientific Research Fund of Jilin University (Nos. 201100011, 200903278), the Science and Technology Development Program of Jilin Province (201201082), Jilin Province Social Science Fund (2012B115), and Jilin Province Natural Science Foundation (20101596, 20130101066JC).

## Authors’ Affiliations

(1)
College of Mathematics, Jilin University, Changchun, 130012, China
(2)
Public Foreign Languages Department, Jilin Normal University, Siping, 136000, China
(3)
College of Mathematics, Jilin Normal University, Siping, 136000, China

## References 