
Empirical likelihood-based inference in Poisson autoregressive model with conditional moment restrictions


In this paper, we employ the empirical likelihood method to estimate the unknown parameters in a Poisson autoregressive model in the presence of auxiliary information. We show that the proposed approach yields more efficient estimators than the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator. Simulation studies are also conducted to investigate the finite-sample performance of the proposed method.


Integer-valued time series occur in many different situations. They arise, for example, as the number of births at a hospital in successive months, the number of bases in DNA sequences, the number of road accidents, and the number of cases of a disease in a certain area in successive months. Consequently, in recent years there has been growing interest in studying integer-valued time series [1–4]. In order to model the number of cases of campylobacteriosis infections from January 1990 to the end of October 2000 in the north of the Province of Québec, Ferland et al. [5] proposed the following Poisson autoregressive model:

$$ \left \{ \textstyle\begin{array}{@{}l} X_{t}|\mathscr{F}_{t-1}: \mathcal{P}(\lambda_{t}); \quad \forall t\in\mathbb{Z},\\ \lambda_{t}=\alpha_{0}+\sum_{i=1}^{p}\alpha_{i}X_{t-i}, \end{array}\displaystyle \right . $$

where \(\mathscr{F}_{t-1}\) is the σ-field generated by \((X_{t-1},X_{t-2},\ldots)\), \(\alpha_{0}>0\), \(\alpha_{i}\geq0\) (\(i=1, 2, \ldots , p\)), and \(\alpha=(\alpha_{0}, \alpha_{1}, \ldots, \alpha_{p})\) is an unknown parameter vector.
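
As a minimal sketch of how realizations of model (1.1) can be generated, the following Python function draws each \(X_{t}\) from a Poisson distribution whose mean is the linear recursion above. The function name `simulate_poisson_ar`, the burn-in length, the zero starting values, and the use of NumPy are illustrative assumptions and are not part of the original paper. The check \(\alpha_{1}+\cdots+\alpha_{p}<1\) is the restriction imposed later on the parameter space, not a requirement of (1.1) itself.

```python
import numpy as np

def simulate_poisson_ar(alpha, n, burn_in=200, seed=0):
    """Draw X_1..X_n with X_t | F_{t-1} ~ Poisson(alpha_0 + sum_i alpha_i X_{t-i})."""
    alpha = np.asarray(alpha, dtype=float)
    p = len(alpha) - 1
    if alpha[0] <= 0 or np.any(alpha[1:] < 0) or alpha[1:].sum() >= 1:
        raise ValueError("need alpha_0 > 0, alpha_i >= 0 and sum(alpha_i) < 1")
    rng = np.random.default_rng(seed)
    total = n + burn_in + p
    x = np.zeros(total, dtype=np.int64)      # assumed start: p zeros, washed out by burn-in
    for t in range(p, total):
        lags = x[t - p:t][::-1]              # (X_{t-1}, ..., X_{t-p})
        lam = alpha[0] + alpha[1:] @ lags    # conditional mean lambda_t
        x[t] = rng.poisson(lam)
    return x[-n:]                            # keep the last n observations
```

For example, `simulate_poisson_ar([1.0, 0.5], 1000)` returns 1,000 counts whose long-run mean is \(\alpha_{0}/(1-\alpha_{1})=2\) under stationarity.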

For model (1.1), Zhu and Wang [6] gave a condition for ergodicity and a necessary and sufficient condition for the existence of moments. They also established the asymptotic properties of the maximum likelihood estimator and the least squares estimators. The problem of interest here is to estimate the unknown parameter in model (1.1) by using the empirical likelihood method when auxiliary information is available. In practice, some auxiliary information can often be obtained, for example that the unknown distribution is symmetric or that the variance of the population is a function of the mean. By making full use of the auxiliary information, we can increase the precision of statistical inference. Here, we assume that we have some auxiliary information that can be represented as the following conditional moment restrictions:

$$ E \bigl(g(X_{t},\ldots,X_{t-p}; \theta_{0})|X(t-1) \bigr)=0 $$

for each \(t=0, 1, 2,\ldots\), where the unknown parameter vector \(\theta _{0}\in R^{d}\), \(X(t-1)=(X_{t-1}, \ldots, X_{t-p})\), and \(g(x;\theta)\in R^{r}\) is some function with \(r\geq d\). To simplify the notation, we further denote \(g(X_{t},\ldots ,X_{t-p};\theta) \) by \(g_{t}(\theta)\). We note here that θ can be different from α, and that the notation \(g_{t}(\theta)\) covers a broad class of information that can be formulated from knowledge of the probability distribution of \(X_{t}\), e.g., moments and their generalizations [7]. By using the conditional moment restrictions, we can expect to increase the efficiency of the resulting estimator [8–10].

Empirical likelihood (EL) was introduced by Owen [11, 12] as an alternative to the bootstrap for constructing confidence regions. The method defines an EL ratio function to construct confidence regions. An important feature of the empirical likelihood method is that the shape and orientation of the confidence region are determined automatically by the data. These attractive properties have motivated various authors to extend the empirical likelihood methodology to other situations. To make use of auxiliary information, some statisticians have also developed statistical inference methods within the empirical likelihood framework [13–15]. In this paper, we further generalize these methods to statistical inference for time series models. Specifically, based on the empirical likelihood method, we consider the parameter estimation problem for the Poisson autoregressive model with conditional moment restrictions. Our approach yields more efficient estimates than the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator, which do not utilize the conditional moment restrictions. A comparison based on mean square errors is also made by simulation. Our simulation indicates that the use of auxiliary information provides improved inferences.

The rest of this paper is organized as follows. In Section 2, we introduce the methodology and the main results. Simulation results are reported in Section 3. Section 4 provides the proofs of the main results.

The symbols ‘\(\stackrel{d}{\longrightarrow}\)’ and ‘\(\stackrel{p}{\longrightarrow}\)’ denote convergence in distribution and convergence in probability, respectively. Convergence ‘almost surely’ is written as ‘a.s.’. Furthermore, ‘\(M^{\tau}_{k\times p}\)’ denotes the transpose of the \(k\times p\) matrix \(M_{k\times p}\), \(A\otimes B\) denotes the Kronecker product of matrices A and B, and \(\|\cdot\|\) denotes the Euclidean norm of a matrix or vector.

Methodology and main results

In this section, we will first discuss how to apply the empirical likelihood method [16, 17] to estimate the unknown parameter α when auxiliary information is available.

Before we state our main results, the following assumptions will be made:


(A1) The parameter space ϒ is compact with \(\Upsilon=\{\alpha: \delta\leq\alpha_{0}\leq M, 0\leq \alpha_{1}+\cdots +\alpha_{p}\leq M^{\ast}<1, \alpha_{i}\geq0, i=1, 2, \ldots, p \}\), where δ, M, and \(M^{\ast}\) are finite positive constants, and the true parameter value \(\alpha^{0}\) is an interior point of ϒ.

Remark 1

It is shown by Corollary 1 in Ferland et al. [5] and Theorem 1 and Theorem 2 in Zhu and Wang [6] that, under (A1), \(\{X_{t}, t\geq1\}\) is stationary and ergodic, and \(E(X_{t}^{m})<\infty\) for any fixed positive integer m.


(A2) There exists \(\theta_{0}\) such that \(E(g_{t}(\theta_{0}))=0\), the matrix \(\Sigma(\theta)=E(g_{t}(\theta) g_{t}^{\tau}(\theta))\) is positive definite at \(\theta_{0}\), \(\partial g(x;\theta) /\partial\theta\) is continuous in a neighborhood of the true value \(\theta_{0}\), \(\| \partial g(x;\theta) /\partial\theta\|\) and \(\|g(x;\theta)\|^{3}\) are bounded by some integrable function \(\tilde{W}(x)\) in this neighborhood, and the rank of \(E(\partial g_{t}(\theta) /\partial\theta)\) is d.

First, the conditional moment restrictions in (1.2) imply that \(E(g_{t}(\theta_{0}))=0\). Further, by using the empirical likelihood method, we can obtain data adaptive weights \(\omega_{t}\) through

$$\begin{aligned} L(\theta) =\sup \Biggl\{ \prod_{t=1}^{n} \omega_{t}: \omega_{t}\geq0, \sum _{t=1}^{n}\omega_{t}=1, \sum _{t=1}^{n}\omega_{t}g_{t}(\theta _{0})=0 \Biggr\} , \end{aligned}$$

where \(\theta_{0}\) is an unknown parameter. Combining the auxiliary information with the least squares method, we propose to estimate α by

$$ \hat{\alpha}=\arg\min_{\alpha}\sum _{t=1}^{n}\omega _{t} \bigl(X_{t}-Z_{t}^{\tau}\alpha \bigr)^{2}, $$

where \(Z_{t}^{\tau}=(1, X_{t-1}, \ldots, X_{t-p})\). By introducing a Lagrange multiplier \(\lambda\in R^{r}\), standard derivations in the empirical likelihood lead to

$$ \omega_{t}(\theta_{0})=\frac{1}{n} \frac{1}{1+\lambda_{\theta_{0}}^{\tau}g_{t}(\theta_{0})}, $$

where \(\lambda_{\theta_{0}}\) satisfies

$$ \frac{1}{n}{}\sum_{t=1}^{n} \frac{g_{t}(\theta_{0})}{1+\lambda_{\theta _{0}}^{\tau}g_{t}(\theta_{0})}=0. $$

Utilizing the weights given by (2.2), we obtain the estimate of α:

$$ \hat{\alpha}= \Biggl(\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t} Z_{t}^{\tau}\Biggr)^{-1} \sum _{t=1}^{n} \omega_{t}({ \theta_{0}})X_{t}Z_{t}. $$
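
The computation in (2.2)-(2.4) can be sketched in a few lines of Python. The code below is only an illustration under stated assumptions: the names `el_weights` and `weighted_ls`, the damped Newton iteration with step halving used to solve (2.3), the tolerances, and the convention of summing over \(t=p+1,\ldots,n\) (so that all lags are observed) are choices made here and are not taken from the paper. The input `g` is assumed to hold the rows \(g_{t}(\theta_{0})\) and to have the zero vector inside its convex hull, otherwise (2.3) has no solution.

```python
import numpy as np

def el_weights(g, max_iter=100, tol=1e-10):
    """Return (omega, lam) for an (n, r) array g whose rows are g_t(theta_0)."""
    g = np.asarray(g, dtype=float)
    n, r = g.shape
    lam = np.zeros(r)

    def objective(l):
        # dual objective sum_t log(1 + l' g_t); -inf if a weight would leave (0, 1)
        d = 1.0 + g @ l
        return np.sum(np.log(d)) if np.all(d > 1.0 / n) else -np.inf

    for _ in range(max_iter):
        d = 1.0 + g @ lam
        score = (g / d[:, None]).mean(axis=0)        # left-hand side of (2.3)
        if np.linalg.norm(score) < tol:
            break
        info = (g / d[:, None] ** 2).T @ g / n       # minus the Hessian of the objective, over n
        step = np.linalg.solve(info, score)          # Newton ascent direction
        scale, f_old = 1.0, objective(lam)
        while objective(lam + scale * step) < f_old and scale > 1e-12:
            scale /= 2.0                             # halve until feasible and non-decreasing
        lam = lam + scale * step
    omega = 1.0 / (n * (1.0 + g @ lam))              # weights (2.2)
    return omega, lam

def weighted_ls(x, p, omega):
    """Estimator (2.4) with rows Z_t = (1, X_{t-1}, ..., X_{t-p}), t = p+1, ..., n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    Z = np.column_stack([np.ones(n - p)] + [x[p - i:n - i] for i in range(1, p + 1)])
    y = x[p:]
    A = Z.T @ (omega[:, None] * Z)                   # sum_t omega_t Z_t Z_t'
    b = Z.T @ (omega * y)                            # sum_t omega_t X_t Z_t
    return np.linalg.solve(A, b)
```

With uniform weights \(\omega_{t}=1/(n-p)\), `weighted_ls` reduces to the ordinary least squares estimator, which gives a quick sanity check of an implementation.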

In the following, we will give the asymptotic properties of \(\hat{\alpha}\).

Theorem 2.1

Assume that (A1) and (A2) hold. If \(\alpha^{0}\) is the true value of α, then

$$\begin{aligned} \sqrt{n} \bigl(\hat{\alpha}-\alpha^{0} \bigr)\stackrel{d}{ \longrightarrow }N \bigl(0, W^{-1} \bigl(\Lambda-\Lambda_{12} \Sigma^{-1}(\theta_{0})\Lambda^{\tau}_{12} \bigr)W^{-1} \bigr), \end{aligned}$$

where \(W=E(Z_{t} Z_{t}^{\tau})\), \(\Lambda=E(Z_{t} Z_{t}^{\tau}(X_{t}-Z_{t}^{\tau}\alpha ^{0})^{2})\), and \(\Lambda_{12}=E(Z_{t}g_{t}^{\tau}(\theta_{0})(X_{t}-Z_{t}^{\tau}\alpha^{0}))\).

Zhu and Wang [6] prove that the asymptotic variance of the ordinary least squares estimator \(\bar{\alpha}=(\sum_{t=1}^{n}Z_{t} Z_{t}^{\tau})^{-1}\sum_{t=1}^{n} X_{t}Z_{t} \) is \(W^{-1}\Lambda W^{-1}\). Note that Λ and \(\Sigma^{-1}(\theta_{0})\) are both positive definite matrices. Therefore, by Theorem 2.1, we find that a variance reduction quantified by \(W^{-1}\Lambda_{12}\Sigma ^{-1}(\theta_{0})\Lambda^{\tau}_{12}W^{-1}\) is induced by incorporating the auxiliary information \(E(g_{t}(\theta_{0})|X(t-1))=0\).
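
To see why the correction term can only reduce the variance, the following short check may help; it is added here as a reading aid and is not part of the original argument. For any fixed vector \(a\in R^{p+1}\), put \(b=\Lambda^{\tau}_{12}W^{-1}a\in R^{r}\). Since W is symmetric and \(\Sigma^{-1}(\theta_{0})\) is positive definite,

$$ a^{\tau}W^{-1}\Lambda_{12}\Sigma^{-1}(\theta_{0})\Lambda^{\tau}_{12}W^{-1}a =b^{\tau}\Sigma^{-1}(\theta_{0})b\geq0, $$

with equality only if \(b=0\). Hence the asymptotic variance in Theorem 2.1 never exceeds \(W^{-1}\Lambda W^{-1}\), and it is strictly smaller in the direction a whenever \(\Lambda^{\tau}_{12}W^{-1}a\neq0\).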

To apply the proposed estimator (2.4), we need to further estimate the unknown parameter θ. We consider \(\hat{\theta}=\arg \max_{\theta}L(\theta)\). Following the results in Qin and Lawless [16], the corresponding weights \(\{\omega_{t}\}_{t=1}^{n}\) satisfy

$$ \omega_{t}(\hat{\theta})=\frac{1}{n} \frac{1}{1+\lambda_{\hat{\theta }}^{\tau}g_{t}(\hat{\theta})}, $$

where \(\lambda_{\hat{\theta}}\) is the solution to

$$\frac{1}{n}\sum_{t=1}^{n} \frac{g_{t}(\hat{\theta})}{1+\lambda_{\hat {\theta}}^{\tau}g_{t}(\hat{\theta})}=0 $$

and \((\lambda_{\hat{\theta}},\hat{\theta})\) solves

$$\frac{1}{n}\sum_{t=1}^{n} \frac{\frac{\partial g_{t}( {\theta })}{\partial\theta^{\tau}}\lambda_{\hat{\theta}}}{1+\lambda_{\hat{\theta }}^{\tau}g_{t}(\hat{\theta})}=0. $$


We then propose to estimate α by

$$\begin{aligned} \hat{\alpha}^{1}=\arg\min_{\alpha}\sum _{t=1}^{n}\omega_{t}(\hat { \theta}) \bigl(X_{t}-Z_{t}^{\tau}\alpha \bigr)^{2}, \end{aligned}$$

where \(\{\omega_{t}(\hat{\theta})\}_{t=1}^{n}\) is identified by (2.5). When \(r=d\), we know that \(\omega_{t}(\hat{\theta})=\frac{1}{n}\) and hence \(\hat{\alpha}^{1}\) is the ordinary least squares estimator. When \(r>d\), \(\omega_{t}(\hat{\theta})\) is no longer equal to \(\frac{1}{n}\) and we shall show that this scheme provides an efficiency gain over the conventional least squares estimator.
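
The claim for \(r=d\) can be checked directly. The following short argument is a sketch added for the reader; it assumes that the just-identified system \(\sum_{t=1}^{n}g_{t}(\theta)=0\) has a solution \(\hat{\theta}\). In that case

$$ \frac{1}{n}\sum_{t=1}^{n}g_{t}(\hat{\theta})=0 \quad\Longrightarrow\quad \lambda_{\hat{\theta}}=0 \mbox{ solves } \frac{1}{n}\sum_{t=1}^{n}\frac{g_{t}(\hat{\theta})}{1+\lambda_{\hat{\theta}}^{\tau}g_{t}(\hat{\theta})}=0, \quad\mbox{so}\quad \omega_{t}(\hat{\theta})=\frac{1}{n}\cdot\frac{1}{1+0}=\frac{1}{n}. $$

These uniform weights attain the largest possible value \(n^{-n}\) of \(\prod_{t=1}^{n}\omega_{t}\) under \(\sum_{t=1}^{n}\omega_{t}=1\), so \(\hat{\theta}\) maximizes \(L(\theta)\) and the weighted criterion reduces to the unweighted least squares criterion.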

In order to study the estimator (2.6), we define \(\Gamma(\theta _{0})=E(\frac{\partial g_{t}(\theta_{0})}{\partial\theta})\), \(\Omega(\theta _{0})=(\Gamma(\theta_{0})\Sigma^{-1}(\theta_{0}) \Gamma^{\tau}(\theta_{0}) )^{-1}\) and \(B=\Sigma^{-1}(\theta_{0})(I-\Gamma(\theta_{0})\Omega(\theta_{0})\Gamma ^{\tau}(\theta_{0})\Sigma^{-1}(\theta_{0}) )\), where I is the identity matrix. The limiting distribution of \(\hat{\alpha}^{1}\) is given in the following theorem.

Theorem 2.2

Assume that (A1) and (A2) hold. If \(\alpha^{0}\) is the true value of α, then

$$\begin{aligned} \sqrt{n} \bigl(\hat{\alpha}^{1}-\alpha^{0} \bigr)\stackrel{d}{\longrightarrow}N \bigl(0, W^{-1} \bigl(\Lambda- \Lambda_{12}B\Lambda^{\tau}_{12} \bigr)W^{-1} \bigr). \end{aligned}$$

The matrix B is non-negative definite. Hence the asymptotic variance of \(\hat{\alpha}^{1}\) is no greater than that of the least squares estimator. When B is positive definite, variance reduction is attained. This implies that incorporating more auxiliary information can improve on the least squares estimator.

Simulation study

In this section, we conduct simulation studies to examine the finite-sample performance of the proposed method. Consider the following first-order Poisson autoregressive model:

$$ \left \{ \textstyle\begin{array}{@{}l} X_{t}|\mathscr{F}_{t-1}: \mathcal{P}(\lambda_{t}); \quad \forall t\in\mathbb{Z},\\ \lambda_{t}=\alpha_{0}+ \alpha_{1}X_{t-1}. \end{array}\displaystyle \right . $$

In order to compare the performance of the estimator given by (2.6) (denoted by ALS) with that of the ordinary least squares estimator (LS), the weighted least squares estimator (WLS), and the maximum likelihood estimator (MLE), we compute the mean square errors of the four methods. We use the vector \(g_{t}(\alpha_{0}, \alpha_{1})=(1, X_{t-1}, X^{2}_{t-1})^{\tau}(X_{t}-\alpha_{0}-\alpha_{1} X_{t-1})\) as the conditional moment restrictions in (1.2). Specifically, for a particular pair \((\alpha_{0}, \alpha_{1})^{\tau}\), we generate realizations from (3.1) with \(n=100, 300\mbox{ and }1{,}000\). Then, based on 1,000 repetitions, we compute the mean square errors of the four estimators. The simulation results for \(\alpha_{0}=1\) are summarized in Table 1. Table 2 presents the simulation results for \(\alpha_{0}=2\).
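
The Monte Carlo protocol described above can be sketched as follows. This is only a skeleton under stated assumptions: it covers the LS baseline and the construction of the rows \(g_{t}(\alpha_{0},\alpha_{1})\), while the ALS, WLS, and MLE steps are not implemented. The function names, the burn-in length, the zero starting value, and the value \(\alpha_{1}=0.3\) in the usage note are hypothetical choices, not settings reported in the paper.

```python
import numpy as np

def simulate_par1(a0, a1, n, rng, burn_in=200):
    """Generate X_1..X_n from model (3.1), discarding a burn-in stretch."""
    x_prev, out = 0, np.empty(n, dtype=np.int64)
    for t in range(-burn_in, n):
        x_prev = rng.poisson(a0 + a1 * x_prev)
        if t >= 0:
            out[t] = x_prev
    return out

def moment_functions(x, a0, a1):
    """Rows g_t = (1, X_{t-1}, X_{t-1}^2) * (X_t - a0 - a1 X_{t-1})."""
    x = np.asarray(x, dtype=float)
    lag, resid = x[:-1], x[1:] - a0 - a1 * x[:-1]
    return np.column_stack([np.ones_like(lag), lag, lag ** 2]) * resid[:, None]

def ls_estimate(x):
    """Ordinary least squares estimate of (alpha_0, alpha_1)."""
    x = np.asarray(x, dtype=float)
    Z = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    return np.linalg.lstsq(Z, x[1:], rcond=None)[0]

def mse_ls(a0, a1, n, reps=1000, seed=0):
    """Monte Carlo mean square error of the LS estimator, per component."""
    rng = np.random.default_rng(seed)
    est = np.array([ls_estimate(simulate_par1(a0, a1, n, rng)) for _ in range(reps)])
    return ((est - np.array([a0, a1])) ** 2).mean(axis=0)
```

A call such as `mse_ls(1.0, 0.3, 300)` returns the two mean square errors for \(\alpha_{0}\) and \(\alpha_{1}\) at \(n=300\). Any other estimator can be compared by replacing `ls_estimate` inside the loop, with `moment_functions` supplying the restrictions that an ALS implementation would use.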

Table 1 Simulation results when \(\pmb{\alpha_{0}=1}\)
Table 2 Simulation results when \(\pmb{\alpha_{0}=2}\)

From Tables 1 and 2, we see that the mean square errors obtained by the estimator (2.6) are smaller than those of the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator. This indicates that using the conditional moment restrictions makes the estimates more accurate, regardless of the sample size and the parameter values considered.

Proofs of the main results

In order to prove Theorem 2.1, we first present several lemmas.

Lemma 4.1

Assume that (A1) and (A2) hold. Then

$$ \max_{1\leq t\leq n} \bigl\| g_{t}( \theta_{0})\bigr\| =o_{p} \bigl(n^{\frac{1}{2}} \bigr). $$


Proof

Let \(\Sigma_{n}(\theta_{0})=\frac{1}{n}\sum_{t=1}^{n}(g_{t}(\theta_{0})g_{t}^{\tau}(\theta_{0}))\) and \(g_{t}(\theta_{0})=( g_{t1}(\theta_{0}),\ldots, g_{tr}(\theta_{0}))^{\tau}\). Since \(\|g_{t}(\theta_{0})\|^{2}=\sum_{k=1}^{r}g^{2}_{tk}(\theta_{0})\), in order to prove (4.1) we need only prove that

$$ \frac{1}{n}\max_{1\leq t\leq n} g^{2}_{tk}( \theta_{0})\stackrel{p}{\longrightarrow}0,\quad k=1,\ldots,r. $$

We denote the \((h,l)\)th element of \(\Sigma(\theta_{0})\) as \(\sigma_{hl}\), \(h,l=1,2,\ldots,r\). By the ergodic theorem, we have

$$ \Sigma_{n}(\theta_{0})\stackrel{p}{ \longrightarrow}\Sigma(\theta_{0}). $$

For all \(1\leq k\leq r\) and \(1\leq m\leq n\), define the sets

$$B^{k}_{n, m}=\bigcap_{j=1}^{m}B^{k,j}_{n, m}, $$

where \(B^{k,j}_{n, m}=\{\omega:|\frac{1}{n}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}(\theta _{0})-\frac{j}{m}\sigma_{kk} |\leq\frac{1}{m}\}\), \(j=1,\ldots,m\).

First we show that

$$ \lim_{n\rightarrow\infty}P \bigl(B^{k,j}_{n, m} \bigr)=1,\quad j=1, \ldots, m. $$

After some algebra, we obtain

$$B^{k,j}_{n,m}= \Biggl\{ \omega:\Biggl|\frac{[n\frac{j}{m}]}{n} \frac{m}{j}\frac {1}{[n\frac{j}{m}]}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}( \theta_{0})-\sigma_{kk} \Biggr|\leq\frac{1}{m} \frac{m}{j} \Biggr\} . $$

Observe that

$$ \frac{[n\frac{j}{m}]}{n}\frac{m}{j}\rightarrow1,\quad n \rightarrow \infty. $$

Furthermore, by (4.3), we have

$$ \frac{1}{[n\frac{j}{m}]}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}( \theta _{0})\stackrel{p}{\longrightarrow}\sigma_{kk}. $$

Using (4.5) and (4.6), we complete the proof of (4.4).

Next we prove that

$$ \lim_{n\rightarrow\infty}P \bigl(B^{k}_{n, m} \bigr)=1,\quad m=1, 2,\ldots. $$

After some calculation, we obtain

$$\begin{aligned} P \bigl(B^{k}_{n, m} \bigr) =&P \Biggl( \bigcap _{j=1}^{m}B^{k,j}_{n, m} \Biggr) \\ =&P \Biggl(\bigcap_{j=1}^{m-1}B^{k,j}_{n, m} \Biggr)+P \bigl(B^{k,m}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-1}B^{k,j}_{n, m} \Biggr)\cup B^{k,m}_{n, m} \Biggr) \\ =&\cdots \\ =&P \bigl(B^{k,1}_{n, m} \bigr)+P \bigl(B^{k,2}_{n, m} \bigr) -P \bigl(B^{k,1}_{n, m}\cup B^{k,2}_{n, m} \bigr)+\cdots \\ &{}+P \bigl(B^{k,{m-1}}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-2}B^{k,j}_{n, m} \Biggr)\cup B^{k,{m-1}}_{n, m} \Biggr) \\ &{}+P \bigl(B^{k,{m}}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-1}B^{k,j}_{n, m} \Biggr) \cup B^{k,{m}}_{n, m} \Biggr). \end{aligned}$$

For all \(2\leq i\leq m\), (4.4) implies that

$$\lim_{n\rightarrow\infty}P \Biggl( \Biggl(\bigcap _{j=1}^{i-1}B^{k,j}_{n, m} \Biggr)\cup B^{k,{i}}_{n, m} \Biggr)=1. $$

Thus, again by (4.4), (4.7) can be proved.

Finally we prove (4.2).

Note that

$$\frac{1}{n}\max_{1\leq t\leq n} g^{2}_{tk}( \theta_{0})\leq\frac{1}{n}\sup_{s\in[0,1]}\sum _{t=[ns]+1}^{[n(s+\frac{1}{m})]}g^{2}_{tk}( \theta_{0}). $$

For given \(s\in[0,1]\) choose \(j\in\{1,\ldots,m\}\) so that \(s\in[\frac{j-1}{m},\frac{j}{m}]\). Then, for each \(s\in[0,1]\), \(\omega\in B^{k}_{n, m}\) implies

$$\begin{aligned} \frac{1}{n}\sum_{t=[ns]+1}^{[n(s+\frac{1}{m})]}g^{2}_{tk}( \theta _{0}) \leq&\frac{1}{n}\sum_{t=[n(j-1)/m]+1}^{[n(j+1)/m]}g^{2}_{tk}( \theta_{0}) \\ =& \Biggl(\frac{1}{n}\sum_{t=1}^{[n(j+1)/m]}g^{2}_{tk}( \theta _{0})-\frac{j+1}{m}\sigma_{kk} \Biggr) \\ &{}- \Biggl(\frac{1}{n}\sum_{t=1}^{[n(j-1)/m]}g^{2}_{tk}( \theta _{0})-\frac{j-1}{m}\sigma_{kk} \Biggr)+ \frac{2}{m}\sigma_{kk} \\ \leq&\frac{2}{m}+\frac{2}{m}\sigma_{kk} \\ =&(1+\sigma_{kk})\frac{2}{m}. \end{aligned}$$

Therefore, for all \(m\geq1\),

$$\lim_{n\rightarrow\infty} P \biggl\{ \frac{1}{n}\max _{1\leq t\leq n} g^{2}_{tk}(\theta_{0})\leq \frac{2}{m}(1+\sigma_{kk}) \biggr\} \geq \lim_{n\rightarrow\infty}P \bigl(B^{k}_{n, m} \bigr)=1, $$

showing (4.2). The proof of Lemma 4.1 is thus completed. □

Lemma 4.2

Assume that (A1) and (A2) hold. Then

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \bigl( Z^{\tau}_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr), g_{t}^{\tau}( \theta_{0}) \bigr)^{\tau}\stackrel{d}{\longrightarrow}N(0, M), $$


where

$$ M=\left ( \textstyle\begin{array}{@{}c@{\quad}c@{}} \Lambda& \Lambda_{12}\\ \Lambda^{\tau}_{12} & \Sigma \end{array}\displaystyle \right ). $$


Proof

By the Cramér-Wold device, it suffices to show that, for all \(c\in R^{p+1+r}\setminus\{(0,\ldots,0)\}\),

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n}c^{\tau}\bigl( Z^{\tau}_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr), g_{t}^{\tau}( \theta_{0}) \bigr)^{\tau}\stackrel {d}{\longrightarrow}N \bigl(0,c^{\tau}Mc \bigr). $$

For simplicity, we write \(G_{t,c}(\theta_{0})\) for \(c^{\tau}( Z^{\tau}_{t}(X_{t}-Z^{\tau}_{t}\alpha^{0}), g_{t}^{\tau}(\theta_{0}) )^{\tau}\). Further, let \(\xi_{nt}=\frac{1}{\sqrt {n}}G_{t,c}(\theta_{0})\) and \(\mathscr{F}_{nt}=\sigma(\xi_{nr}, 1\leq r \leq t)\). Then \(\{\sum_{t=1}^{n}\xi_{nt},\mathscr{F}_{nt}, 1\leq t\leq n, n\geq1\} \) is a zero-mean, square integrable martingale array. By a martingale central limit theorem [18], it suffices to show that

$$\begin{aligned}& \max_{1\leq t\leq n}|\xi_{nt}|\stackrel{p}{ \longrightarrow}0, \end{aligned}$$
$$\begin{aligned}& \sum_{t=1}^{n} \xi^{2}_{nt} \stackrel{p}{\longrightarrow}c^{\tau}Mc, \end{aligned}$$
$$\begin{aligned}& E \Bigl(\max_{1\leq t\leq n}\xi^{2}_{nt} \Bigr) \mbox{ is bounded in } n , \end{aligned}$$

and the fields are nested:

$$ \mathscr{F}_{nt}\subseteq\mathscr{F}_{(n+1)t} \quad\mbox{for } 1\leq t\leq n, n\geq1. $$

Note that (4.14) is obvious. In what follows, we first consider (4.11). By a simple calculation, we have, for all \(\varepsilon>0\),

$$\begin{aligned} &P \Bigl\{ \max_{1\leq t\leq n}| \xi_{nt}|> \varepsilon \Bigr\} \\ &\quad\leq\sum_{t=1}^{n}P\bigl\{ | \xi_{nt}|>\varepsilon\bigr\} \\ &\quad= \sum_{t=1}^{n}P \biggl\{ \biggl| \frac{1}{\sqrt{n}}G_{t,c}(\theta _{0})\biggr|>\varepsilon \biggr\} \\ &\quad=n P \bigl\{ \bigl|G_{t,c}(\theta_{0})\bigr|>\sqrt{n} \varepsilon \bigr\} \\ &\quad=n\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt{n}\varepsilon \bigr)\, d P \\ &\quad\leq n\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt{n}\varepsilon \bigr)\frac{(G_{t,c}(\theta_{0}))^{2}}{ (\sqrt{n}\varepsilon)^{2}}\,d P \\ &\quad=\frac{1}{\varepsilon^{2}}\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt {n}\varepsilon \bigr) \bigl(G_{t,c}( \theta_{0}) \bigr)^{2}\,d P. \end{aligned}$$

Now, by the Lebesgue dominated convergence theorem, we immediately find that (4.15) converges to 0 as \(n\rightarrow\infty\). This settles (4.11).

Next consider (4.12). By the ergodic theorem, we have

$$\begin{aligned} \sum_{t=1}^{n} \xi^{2}_{nt} =& \sum_{t=1}^{n} \biggl( \frac{1}{\sqrt{n}}G_{t,c}(\theta_{0}) \biggr)^{2} \\ \stackrel{\mathrm{a.s.}}{\longrightarrow}&E \bigl(G_{t,c}( \theta_{0}) \bigr)^{2} \\ =&c^{\tau}M c. \end{aligned}$$

Hence (4.12) is proved.

Finally, consider (4.13). Note that \(\{( \frac{1}{\sqrt{n}}G_{t,c}(\theta_{0}) )^{2},t\geq1\}\) is a stationary sequence. Then we have

$$\begin{aligned} E \Bigl(\max_{1\leq t\leq n}\xi^{2}_{nt} \Bigr) =&E \biggl(\max_{1\leq t\leq n} \biggl( \frac{1}{\sqrt{n}}G_{t,c}( \theta_{0}) \biggr)^{2} \biggr) \\ \leq&\frac{1}{n}E \Biggl( \sum_{t=1}^{n} \bigl(G_{t,c}(\theta _{0}) \bigr)^{2} \Biggr) \\ =&\frac{1}{n}\sum_{t=1}^{n}E \bigl(G_{t,c}(\theta_{0}) \bigr)^{2} \\ =&c^{\tau}Mc. \end{aligned}$$

This proves (4.13), which completes the proof of Lemma 4.2. □

Lemma 4.3

Assume that (A1) and (A2) hold. Then

$$\lambda_{\theta_{0}}=O_{p} \bigl(n^{-\frac{1}{2}} \bigr). $$


Proof

Let \(\lambda_{\theta_{0}}=\zeta\beta_{0}\), where \(\beta_{0}\) is a unit vector (\(\|\beta_{0}\|=1\)) and \(\zeta=\|\lambda_{\theta_{0}}\|\). Then (2.3) implies that

$$\begin{aligned} 0 =&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} \frac{g_{t}(\theta _{0})}{1+\zeta\beta^{\tau}_{0}g_{t}(\theta_{0})} \\ =&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n}g_{t}( \theta_{0})-\frac {\zeta}{n}\sum_{t=1}^{n} \frac{(\beta^{\tau}_{0}g_{t}(\theta _{0}))^{2}}{1+\zeta\beta^{\tau}_{0}g_{t}(\theta_{0})} \\ \leq&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n}g_{t}( \theta _{0})-\frac{\zeta}{1+\zeta\max_{1\leq t \leq n}\|g_{t}(\theta_{0})\| }\beta^{\tau}_{0} \Sigma_{n}(\theta_{0})\beta_{0}. \end{aligned}$$

This implies that

$$\begin{aligned} \zeta \Biggl(\beta^{\tau}_{0}\Sigma_{n}(\theta_{0})\beta_{0}- \max_{1\leq t \leq n}\bigl\| g_{t}(\theta_{0})\bigr\| \frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0}) \Biggr) \leq\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0}). \end{aligned}$$

Note that

$$\begin{aligned} \Biggl|\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0})\Biggr|\leq\Biggl\| \frac{1}{n} \sum_{t=1}^{n} g_{t}(\theta_{0})\Biggr\| =O_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}$$

By using Lemma 4.1 and (4.18), we can obtain

$$\begin{aligned} \max_{1\leq t \leq n}\bigl\| g_{t}(\theta_{0})\bigr\| \frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0})=o_{p}(1). \end{aligned}$$

By (4.3), we have

$$\begin{aligned} \beta_{0}^{\tau}\Sigma_{n}( \theta_{0})\beta_{0}\stackrel{p}{\longrightarrow}\beta _{0}^{\tau}\Sigma(\theta_{0})\beta_{0}. \end{aligned}$$

By this, together with (4.17)-(4.19), we can prove Lemma 4.3. □

Proof of Theorem 2.1

By (2.3), we have

$$\begin{aligned} \lambda_{\theta_{0}}= \bigl(\Sigma_{n}( \theta_{0}) \bigr)^{-1}\frac{1}{n}\sum _{t=1}^{n} g_{t}(\theta_{0})+ \bigl(\Sigma_{n}(\theta_{0}) \bigr)^{-1}R_{n}( \theta_{0}), \end{aligned}$$


where

$$R_{n}(\theta_{0})=\frac{1}{n}\sum _{t=1}^{n} g_{t}( \theta_{0})\frac {(\lambda^{\tau}_{\theta_{0}}g_{t}(\theta_{0}))^{2}}{1+\lambda^{\tau}_{\theta _{0}}g_{t}(\theta_{0})}. $$

By Lemmas 4.1-4.3, we know that

$$\begin{aligned} R_{n}(\theta_{0})=o_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}$$

This implies that, uniformly in t,

$$\begin{aligned} \omega_{t}(\theta_{0}) =&\frac{1}{n} \frac{1}{1+\lambda_{\theta _{0}}^{\tau}g_{t}(\theta_{0})} \\ =&\frac{1}{n} \bigl( 1- \lambda_{\theta_{0}}^{\tau}g_{t}(\theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr). \end{aligned}$$

Moreover, note that

$$\sqrt{n} \bigl(\hat{\alpha}-\alpha^{0} \bigr)=T_{n}^{-1}S_{n}, $$

where \(T_{n}=\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}Z^{\tau}_{t} \) and \(S_{n}=\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}(X_{t}-Z^{\tau}_{t}\alpha^{0})\).

First, we consider \(T_{n}\). By (4.23), we have

$$\begin{aligned} T_{n} =&\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}Z^{\tau}_{t} \\ =&\frac{1}{n}\sum_{t=1}^{n} \bigl( 1- \lambda_{\theta_{0}}^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr)Z_{t}Z^{\tau}_{t} \\ =&\frac{1}{n}\sum_{t=1}^{n}Z_{t}Z^{\tau}_{t} -\frac{1}{n}\sum_{t=1}^{n} \bigl(Z_{t}Z^{\tau}_{t} \bigr)\otimes \bigl( \bigl( \bigl(\Sigma_{n}(\theta _{0}) \bigr)^{-1}R_{n}( \theta_{0}) \bigr)^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr) \\ &{}-\frac{1}{n}\sum_{t=1}^{n} \bigl(Z_{t}Z^{\tau}_{t} \bigr)\otimes \Biggl( \Biggl( \bigl(\Sigma _{n}(\theta_{0}) \bigr)^{-1} \frac{1}{n}\sum_{t=1}^{n} g_{t}(\theta_{0}) \Biggr)^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \Biggr) \\ \doteq&U_{1}-U_{2}-U_{3}. \end{aligned}$$

Then by (4.3) and (4.22), conditions (A1) and (A2), Lemma 4.2, and the ergodic theorem, we can prove that

$$ U_{2}=o_{p}(1). $$

Similarly, we can obtain

$$ U_{3}\stackrel{\mathrm{a.s.}}{\longrightarrow}0. $$

This, together with (4.25), yields

$$ T_{n}\stackrel{p}{\longrightarrow}W. $$

Next consider \(S_{n}\). By (4.23), we have

$$\begin{aligned} S_{n} =&\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\theta _{0}})Z_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr) \\ =&\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \bigl( 1- \lambda_{\theta _{0}}^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr)Z_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr). \end{aligned}$$

Thus, combining with (4.21), (4.22), we can obtain

$$\begin{aligned} S_{n} =&\frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr) \\ &{}-\frac{1}{n}\sum_{t=1}^{n} \bigl(X_{t}-Z^{\tau}_{t}\alpha ^{0} \bigr)Z_{t}g_{t}^{\tau}(\theta_{0}) \bigl( \Sigma_{n}(\theta_{0}) \bigr)^{-1} \frac{1}{\sqrt{n}}\sum_{t=1}^{n} g_{t}(\theta_{0})+o_{p}(1). \end{aligned}$$

By the ergodic theorem, we have

$$\frac{1}{n}\sum_{t=1}^{n} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr)Z_{t}g_{t}^{\tau}(\theta _{0}) \stackrel{\mathrm{a.s.}}{\longrightarrow} \Lambda_{12}. $$

This, together with (4.3) and Lemma 4.2, proves that

$$\begin{aligned} S_{n}\stackrel{d}{\longrightarrow}N \bigl(0,\Lambda- \Lambda_{12}\Sigma ^{-1}\Lambda^{\tau}_{12} \bigr), \end{aligned}$$

which, combined with (4.27), proves Theorem 2.1. □

Proof of Theorem 2.2

Similar to the proof of Lemma 1 and Theorem 1 in Qin and Lawless [16], we can prove that

$$\begin{aligned} \lambda_{\hat{\theta}}=B\frac{1}{n}\sum _{t=1}^{n}g_{t}(\theta _{0})+o_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}$$

Moreover, note that

$$ \sqrt{n} \bigl(\hat{\alpha}^{1}-\alpha^{0} \bigr)=\tilde{T}_{n}^{-1}\tilde{S}_{n}, $$


where

$$\tilde{T}_{n}=\sum_{t=1}^{n} \omega_{t}({\hat{\theta}})Z_{t}Z^{\tau}_{t} $$


and

$$\tilde{S}_{n}=\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\hat{\theta }})Z_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr). $$

By an argument similar to the proof of Theorem 2.1, we can prove that

$$\tilde{T}_{n}\stackrel{p}{\longrightarrow}W $$


and

$$ \tilde{S}_{n}\stackrel{d}{\longrightarrow}N \bigl(0, \Lambda-\Lambda_{12}B\Lambda ^{\tau}_{12} \bigr), $$

showing (2.7). The proof of Theorem 2.2 is thus completed. □


References

  1. Alzaid, AA, Al-Osh, M: An integer-valued pth-order autoregressive structure (INAR(p)) process. J. Appl. Probab. 27, 314-324 (1990)
  2. Davis, RA, Dunsmuir, WTM, Streett, SB: Observation-driven models for Poisson counts. Biometrika 90, 777-790 (2003)
  3. Zheng, H, Basawa, IV, Datta, S: Inference for pth-order random coefficient integer-valued autoregressive processes. J. Time Ser. Anal. 27, 411-440 (2006)
  4. Weiß, CH: Thinning operations for modeling time series of counts - a survey. AStA Adv. Stat. Anal. 92, 319-341 (2008)
  5. Ferland, R, Latour, A, Oraichi, D: Integer-valued GARCH process. J. Time Ser. Anal. 27, 923-942 (2006)
  6. Zhu, F, Wang, D: Estimation and testing for a Poisson autoregressive model. Metrika 73, 211-230 (2011)
  7. Hansen, LP: Large sample properties of generalized method of moments estimators. Econometrica 50, 1029-1054 (1982)
  8. Isaki, CT: Variance estimation using auxiliary information. J. Am. Stat. Assoc. 78, 117-123 (1983)
  9. Kuk, AYC, Mak, TK: Median estimation in the presence of auxiliary information. J. R. Stat. Soc. B 51, 261-269 (1989)
  10. Rao, JNK, Kovar, JG, Mantel, HJ: On estimating distribution functions and quantiles from survey data using auxiliary information. Biometrika 77, 365-375 (1990)
  11. Owen, AB: Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237-249 (1988)
  12. Owen, AB: Empirical likelihood and confidence region. Ann. Stat. 18, 90-120 (1990)
  13. Chen, J, Qin, J: Empirical likelihood estimation for finite populations and the effective usage of auxiliary information. Biometrika 80, 107-116 (1993)
  14. Zhang, B: M-Estimation and quantile estimation in the presence of auxiliary information. J. Stat. Plan. Inference 44, 77-94 (1995)
  15. Tang, CY, Leng, C: An empirical likelihood approach to quantile regression with auxiliary information. Stat. Probab. Lett. 82, 29-36 (2012)
  16. Qin, J, Lawless, J: Empirical likelihood and general estimating equations. Ann. Stat. 22, 300-325 (1994)
  17. Owen, AB: Empirical Likelihood. Chapman & Hall/CRC, London (2001)
  18. Hall, P, Heyde, CC: Martingale Limit Theory and Its Application. Academic Press, New York (1980)


This work is supported by National Natural Science Foundation of China (Nos. 11271155, 11001105, 11071126, 10926156, 11071269), Specialized Research Fund for the Doctoral Program of Higher Education (Nos. 20110061110003, 20090061120037), Scientific Research Fund of Jilin University (Nos. 201100011, 200903278), the Science and Technology Development Program of Jilin Province (201201082), Jilin Province Social Science Fund (2012B115), and Jilin Province Natural Science Foundation (20101596, 20130101066JC).

Author information



Corresponding author

Correspondence to Dehui Wang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

The authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Peng, C., Wang, D. & Zhao, Z. Empirical likelihood-based inference in Poisson autoregressive model with conditional moment restrictions. J Inequal Appl 2015, 218 (2015).


  • 62M10
  • 62M05


  • conditional moment restriction
  • empirical likelihood
  • least square estimator
  • asymptotic distribution
  • Poisson autoregressive model