# Empirical likelihood-based inference in Poisson autoregressive model with conditional moment restrictions

## Abstract

In this paper, we employ the empirical likelihood method to estimate the unknown parameters in the Poisson autoregressive model in the presence of auxiliary information. It is shown that, compared with the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator, the proposed approach yields more efficient estimators. Simulation studies are also conducted to investigate the finite sample performance of the proposed method.

## 1 Introduction

Integer-valued time series occur in many different situations. They arise, for example, as the number of births at a hospital in successive months, the number of bases of DNA sequences, the number of road accidents, and the number of disease cases in a certain area in successive months. Therefore, in recent years, there has been growing interest in studying integer-valued time series [1–4]. In order to model the number of cases of campylobacterosis infections from January 1990 to the end of October 2000 in the north of the Province of Québec, Ferland et al. [5] proposed the following Poisson autoregressive model:

$$\left \{ \textstyle\begin{array}{@{}l} X_{t}|\mathscr{F}_{t-1}: \mathcal{P}(\lambda_{t}); \quad \forall t\in\mathbb{Z},\\ \lambda_{t}=\alpha_{0}+\sum_{i=1}^{p}\alpha_{i}X_{t-i}, \end{array}\displaystyle \right .$$
(1.1)

where $$\mathscr{F}_{t-1}$$ is the σ-field generated by $$(X_{t-1},X_{t-2},\ldots)$$, $$\alpha_{0}>0$$, $$\alpha_{i}\geq0$$ ($$i=1, 2, \ldots , p$$), and $$\alpha=(\alpha_{0}, \alpha_{1}, \ldots, \alpha_{p})$$ is an unknown parameter vector.
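Model (1.1) is straightforward to simulate, which is useful for checking its behavior numerically under the stationarity condition in (A1). The following sketch (in Python with NumPy; the function name and the burn-in length are our illustrative choices, not part of the paper) draws a sample path by generating each $$X_{t}$$ from a Poisson distribution with intensity $$\lambda_{t}$$:

```python
import numpy as np

def simulate_pa(alpha, n, burn_in=500, rng=None):
    """Simulate the Poisson autoregressive model (1.1):
    X_t | F_{t-1} ~ Poisson(lambda_t), lambda_t = alpha_0 + sum_i alpha_i X_{t-i}.
    `alpha` = (alpha_0, alpha_1, ..., alpha_p); the first `burn_in` draws are
    discarded so the returned series is approximately stationary."""
    rng = np.random.default_rng(rng)
    alpha = np.asarray(alpha, dtype=float)
    p = len(alpha) - 1
    x = np.zeros(n + burn_in + p)
    for t in range(p, len(x)):
        # alpha_i is paired with X_{t-i}, hence the reversed slice
        lam = alpha[0] + alpha[1:] @ x[t - p:t][::-1]
        x[t] = rng.poisson(lam)
    return x[-n:]
```

For instance, with $$\alpha=(1, 0.4)$$ the simulated series should have a sample mean close to the stationary mean $$\alpha_{0}/(1-\alpha_{1})\approx1.67$$.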

For model (1.1), Zhu and Wang [6] gave a condition for ergodicity and a necessary and sufficient condition for the existence of moments. They also established the asymptotics for the maximum likelihood estimator and the least squares estimator. The problem of interest here is to estimate the unknown parameter in model (1.1) by the empirical likelihood method when auxiliary information is available. In practice, auxiliary information can often be obtained, such as the knowledge that the underlying distribution is symmetric or that the population variance is a function of the mean. By making full use of the auxiliary information, we can increase the precision of statistical inference. Here, we assume that the auxiliary information can be represented as the following conditional moment restrictions:

$$E \bigl(g(X_{t},\ldots,X_{t-p}; \theta_{0})|X(t-1) \bigr)=0$$
(1.2)

for each $$t=0, 1, 2,\ldots$$, where the unknown parameter vector $$\theta _{0}\in R^{d}$$, $$X(t-1)=(X_{t-1}, \ldots, X_{t-p})$$, and $$g(x;\theta)\in R^{r}$$ is some function with $$r\geq d$$. In order to simplify the notation, we further denote $$g(X_{t},\ldots ,X_{t-p};\theta)$$ by $$g_{t}(\theta)$$. We note here that θ can be different from α, and the notation $$g_{t}(\theta)$$ covers a broad class of information that can be formulated from knowledge of the probability distribution of $$X_{t}$$, e.g. the moments and their generalizations [7]. By using the conditional moment restrictions, as we expect, we can increase the efficiency of the resulting estimator [8–10].

The empirical likelihood (EL) method was introduced by Owen [11, 12] as an alternative to the bootstrap for constructing confidence regions. The method defines an EL ratio function to construct confidence regions. An important feature of the empirical likelihood method is its automatic determination of the shape and orientation of the confidence region by the data. These attractive properties have motivated various authors to extend the empirical likelihood methodology to other situations. To use auxiliary information, some statisticians have also developed statistical inference methods within the framework of the empirical likelihood method [13–15]. In this paper, we further generalize these methods to the statistical inference of time series models. Specifically, based on the empirical likelihood method, we consider the parameter estimation problem for the Poisson autoregressive model with conditional moment restrictions. Our approach yields more efficient estimates than the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator, which do not utilize the conditional moment restrictions. A comparison based on the mean square errors is also made by simulation. Our simulation indicates that the use of auxiliary information provides improved inferences.

The rest of this paper is organized as follows. In Section 2, we introduce the methodology and the main results. Simulation results are reported in Section 3. Section 4 provides the proofs of the main results.

The symbols '$$\stackrel{d}{\longrightarrow}$$' and '$$\stackrel{p}{\longrightarrow}$$' denote convergence in distribution and convergence in probability, respectively. Convergence 'almost surely' is written as 'a.s.'. Furthermore, '$$M^{\tau}_{k\times p}$$' denotes the transpose of the $$k\times p$$ matrix $$M_{k\times p}$$, $$A\otimes B$$ denotes the Kronecker product of matrices A and B, and $$\|\cdot\|$$ denotes the Euclidean norm of a matrix or vector.

## 2 Methodology and main results

In this section, we will first discuss how to apply the empirical likelihood method [16, 17] to estimate the unknown parameter α when auxiliary information is available.

Before we state our main results, the following assumptions will be made:

(A1):

The parameter space ϒ is compact with $$\Upsilon=\{\alpha: \delta\leq\alpha_{0}\leq M, 0\leq \alpha_{1}+\cdots +\alpha_{p}\leq M^{\ast}<1, \alpha_{i}\geq0, i=1, 2, \ldots, p \}$$, where δ, M, and $$M^{\ast}$$ are finite positive constants, and the true parameter value $$\alpha^{0}$$ is an interior point of ϒ.

### Remark 1

It is shown by Corollary 1 in Ferland et al. [5] and Theorems 1 and 2 in Zhu and Wang [6] that, under (A1), $$\{X_{t}, t\geq1\}$$ is stationary and ergodic, and $$E(X_{t}^{m})<\infty$$ for any fixed positive integer m.

(A2):

There exists $$\theta_{0}$$ such that $$E(g_{t}(\theta_{0}))=0$$, the matrix $$\Sigma(\theta)=E(g_{t}(\theta) g_{t}^{\tau}(\theta))$$ is positive definite at $$\theta_{0}$$, $$\partial g(x;\theta) /\partial\theta$$ is continuous in a neighborhood of the true value $$\theta_{0}$$, $$\| \partial g(x;\theta) /\partial\theta\|$$ and $$\|g(x;\theta)\|^{3}$$ are bounded by some integrable function $$\tilde{W}(x)$$ in this neighborhood, and the rank of $$E(\partial g_{t}(\theta) /\partial\theta)$$ is d.

First, the conditional moment restrictions in (1.2) imply that $$E(g_{t}(\theta_{0}))=0$$. Further, by using the empirical likelihood method, we can obtain data-adaptive weights $$\omega_{t}$$ through

\begin{aligned} L(\theta) =\sup \Biggl\{ \prod_{t=1}^{n} \omega_{t}: \omega_{t}\geq0, \sum _{t=1}^{n}\omega_{t}=1, \sum _{t=1}^{n}\omega_{t}g_{t}(\theta _{0})=0 \Biggr\} , \end{aligned}

where $$\theta_{0}$$ is an unknown parameter. Combining the auxiliary information with the least squares method, we propose to estimate α by

$$\hat{\alpha}=\arg\min_{\alpha}\sum _{t=1}^{n}\omega _{t} \bigl(X_{t}-Z_{t}^{\tau}\alpha \bigr)^{2},$$
(2.1)

where $$Z_{t}^{\tau}=(1, X_{t-1}, \ldots, X_{t-p})$$. By introducing a Lagrange multiplier $$\lambda\in R^{r}$$, a standard empirical likelihood derivation leads to

$$\omega_{t}(\theta_{0})=\frac{1}{n} \frac{1}{1+\lambda_{\theta_{0}}^{\tau}g_{t}(\theta_{0})},$$
(2.2)

where $$\lambda_{\theta_{0}}$$ satisfies

$$\frac{1}{n}{}\sum_{t=1}^{n} \frac{g_{t}(\theta_{0})}{1+\lambda_{\theta _{0}}^{\tau}g_{t}(\theta_{0})}=0.$$
(2.3)

Utilizing the weights given by (2.2), we obtain the estimate of α:

$$\hat{\alpha}= \Biggl(\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t} Z_{t}^{\tau}\Biggr)^{-1} \sum _{t=1}^{n} \omega_{t}({ \theta_{0}})X_{t}Z_{t}.$$
(2.4)
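As a computational note, $$\lambda_{\theta_{0}}$$ in (2.3) has no closed form and is usually found numerically; Newton's method is a standard choice. The sketch below (Python/NumPy; the function names are ours, and we assume the values $$g_{t}(\theta_{0})$$ are supplied as the rows of a matrix) solves (2.3), forms the weights (2.2), and evaluates the weighted least squares estimate (2.4):

```python
import numpy as np

def el_weights(G, iters=50, tol=1e-10):
    """Solve (2.3) for the Lagrange multiplier lambda by Newton's method and
    return the empirical likelihood weights (2.2).
    G is the n x r matrix whose rows are g_t(theta_0)."""
    n, r = G.shape
    lam = np.zeros(r)
    for _ in range(iters):
        denom = 1.0 + G @ lam                      # 1 + lambda' g_t
        grad = (G / denom[:, None]).mean(axis=0)   # left-hand side of (2.3)
        if np.linalg.norm(grad) < tol:
            break
        # Jacobian of the left-hand side of (2.3) with respect to lambda
        hess = -(G[:, :, None] * G[:, None, :] / (denom**2)[:, None, None]).mean(axis=0)
        lam -= np.linalg.solve(hess, grad)         # Newton step
    w = 1.0 / (n * (1.0 + G @ lam))                # weights (2.2)
    return w, lam

def alpha_hat(X, Z, w):
    """Weighted least squares estimate (2.4); Z has rows Z_t' = (1, X_{t-1}, ..., X_{t-p})."""
    A = (Z * w[:, None]).T @ Z
    b = (Z * w[:, None]).T @ X
    return np.linalg.solve(A, b)
```

At convergence the weights are positive, sum to one, and satisfy $$\sum_{t}\omega_{t}g_{t}(\theta_{0})=0$$, which is exactly the constraint in the definition of $$L(\theta)$$.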

In the following, we will give the asymptotic properties of $$\hat{\alpha}$$.

### Theorem 2.1

Assume that (A1) and (A2) hold. If $$\alpha^{0}$$ is the true value of α, then

\begin{aligned} \sqrt{n} \bigl(\hat{\alpha}-\alpha^{0} \bigr)\stackrel{d}{ \longrightarrow }N \bigl(0, W^{-1} \bigl(\Lambda-\Lambda_{12} \Sigma^{-1}(\theta_{0})\Lambda^{\tau}_{12} \bigr)W^{-1} \bigr), \end{aligned}

where $$W=E(Z_{t} Z_{t}^{\tau})$$, $$\Lambda=E(Z_{t} Z_{t}^{\tau}(X_{t}-Z_{t}^{\tau}\alpha ^{0})^{2})$$, and $$\Lambda_{12}=E(Z_{t}g_{t}^{\tau}(\theta_{0})(X_{t}-Z_{t}^{\tau}\alpha^{0}))$$.

Zhu and Wang [6] proved that the asymptotic variance of the ordinary least squares estimator $$\bar{\alpha}=(\sum_{t=1}^{n}Z_{t} Z_{t}^{\tau})^{-1}\sum_{t=1}^{n} X_{t}Z_{t}$$ is $$W^{-1}\Lambda W^{-1}$$. Note that Λ and $$\Sigma^{-1}(\theta_{0})$$ are both positive definite matrices. Therefore, by Theorem 2.1, we find that a variance reduction quantified by $$W^{-1}\Lambda_{12}\Sigma ^{-1}(\theta_{0})\Lambda^{\tau}_{12}W^{-1}$$ is induced by incorporating the auxiliary information $$E(g_{t}(\theta_{0})|X(t-1))=0$$.

To apply the proposed estimator (2.4), we need to further estimate the unknown parameter θ. We consider $$\hat{\theta}=\arg \max_{\theta}L(\theta)$$. Following the results in Qin and Lawless [16], the corresponding weights $$\{\omega_{t}\}_{t=1}^{n}$$ satisfy

$$\omega_{t}(\hat{\theta})=\frac{1}{n} \frac{1}{1+\lambda_{\hat{\theta }}^{\tau}g_{t}(\hat{\theta})},$$
(2.5)

where $$\lambda_{\hat{\theta}}$$ is the solution to

$$\frac{1}{n}\sum_{t=1}^{n} \frac{g_{t}(\hat{\theta})}{1+\lambda_{\hat {\theta}}^{\tau}g_{t}(\hat{\theta})}=0$$

and $$(\lambda_{\hat{\theta}},\hat{\theta})$$ solves

$$\frac{1}{n}\sum_{t=1}^{n} \frac{ (\frac{\partial g_{t}(\hat{\theta })}{\partial\theta} )^{\tau}\lambda_{\hat{\theta}}}{1+\lambda_{\hat{\theta }}^{\tau}g_{t}(\hat{\theta})}=0.$$

Let

\begin{aligned} \hat{\alpha}^{1}=\arg\min_{\alpha}\sum _{t=1}^{n}\omega_{t}(\hat { \theta}) \bigl(X_{t}-Z_{t}^{\tau}\alpha \bigr)^{2}, \end{aligned}
(2.6)

where $$\{\omega_{t}(\hat{\theta})\}_{t=1}^{n}$$ is identified by (2.5). When $$r=d$$, we know that $$\omega_{t}(\hat{\theta})=\frac{1}{n}$$ and hence $$\hat{\alpha}^{1}$$ is the ordinary least squares estimator. When $$r>d$$, $$\omega_{t}(\hat{\theta})$$ is no longer equal to $$\frac{1}{n}$$ and we shall show that this scheme provides an efficiency gain over the conventional least squares estimator.

In order to study the estimator (2.6), we define $$\Gamma(\theta _{0})=E(\frac{\partial g_{t}(\theta_{0})}{\partial\theta})$$, $$\Omega(\theta _{0})=(\Gamma(\theta_{0})\Sigma^{-1}(\theta_{0}) \Gamma^{\tau}(\theta_{0}) )^{-1}$$ and $$B=\Sigma^{-1}(\theta_{0})(I-\Gamma(\theta_{0})\Omega(\theta_{0})\Gamma ^{\tau}(\theta_{0})\Sigma^{-1}(\theta_{0}) )$$, where I is the identity matrix. The limiting distribution of $$\hat{\alpha}^{1}$$ is given in the following theorem.
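For concreteness, $$\Omega(\theta_{0})$$ and B can be computed numerically once estimates of $$\Gamma(\theta_{0})$$ and $$\Sigma(\theta_{0})$$ are available. The small NumPy sketch below adopts the convention that Γ is an $$r\times d$$ matrix, so that $$\Omega=(\Gamma^{\tau}\Sigma^{-1}\Gamma)^{-1}$$; the function name and inputs are illustrative, not from the paper:

```python
import numpy as np

def b_matrix(Gamma, Sigma):
    """Compute Omega(theta_0) and B from Section 2, taking
    Gamma = E(dg_t/dtheta) as an r x d matrix and
    Sigma = E(g_t g_t') as an r x r positive definite matrix."""
    Sinv = np.linalg.inv(Sigma)
    Omega = np.linalg.inv(Gamma.T @ Sinv @ Gamma)     # (Gamma' Sigma^{-1} Gamma)^{-1}
    B = Sinv @ (np.eye(Sigma.shape[0]) - Gamma @ Omega @ Gamma.T @ Sinv)
    return Omega, B
```

By construction $$B\Gamma=0$$ and B is symmetric non-negative definite; when $$r=d$$, Γ is square and B reduces to the zero matrix, matching the remark below (2.6) that no efficiency gain is available in that case.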

### Theorem 2.2

Assume that (A1) and (A2) hold. If $$\alpha^{0}$$ is the true value of α, then

\begin{aligned} \sqrt{n} \bigl(\hat{\alpha}^{1}-\alpha^{0} \bigr)\stackrel{d}{\longrightarrow}N \bigl(0, W^{-1} \bigl(\Lambda- \Lambda_{12}B\Lambda^{\tau}_{12} \bigr)W^{-1} \bigr). \end{aligned}
(2.7)

The matrix B is non-negative definite. Hence the asymptotic variance of $$\hat{\alpha}^{1}$$ is no greater than that of the least squares estimator. When B is positive definite, a variance reduction is attained. This implies that incorporating more auxiliary information can improve on the least squares estimator.

## 3 Simulation study

In this section we conduct simulation studies to investigate the finite sample performance of the proposed method. Consider the following first-order Poisson autoregressive model:

$$\left \{ \textstyle\begin{array}{@{}l} X_{t}|\mathscr{F}_{t-1}: \mathcal{P}(\lambda_{t}); \quad \forall t\in\mathbb{Z},\\ \lambda_{t}=\alpha_{0}+ \alpha_{1}X_{t-1}. \end{array}\displaystyle \right .$$
(3.1)

In order to compare the performance of the estimator (denoted by ALS) given by (2.6) with those of the ordinary least squares estimator (LS), the weighted least squares estimator (WLS), and the maximum likelihood estimator (MLE), we compute the mean square errors of these four estimators. We use the vector $$g_{t}(\alpha_{0}, \alpha_{1})=(1, X_{t-1}, X^{2}_{t-1})^{\tau}(X_{t}-\alpha_{0}-\alpha_{1} X_{t-1})$$ as the conditional moment restrictions in (1.2). Specifically, for a particular pair $$(\alpha_{0}, \alpha_{1})^{\tau}$$, we generate realizations from (3.1) with $$n=100, 300\mbox{ and }1{,}000$$. Further, based on 1,000 repetitions, we compute the mean square errors of the above four estimators. The simulation results for $$\alpha_{0}=1$$ are summarized in Table 1. Table 2 presents the simulation results for $$\alpha_{0}=2$$.
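As a rough illustration of this design (not the authors' code), the following self-contained Python sketch compares the mean square errors of the LS and ALS estimators for model (3.1). For simplicity it forms the EL weights at the LS plug-in value of θ rather than maximizing $$L(\theta)$$, and it omits the WLS and MLE columns; all function names and tuning constants are ours:

```python
import numpy as np

def mse_comparison(a0=1.0, a1=0.5, n=300, reps=200, seed=0):
    """Monte Carlo sketch of Section 3: MSEs of the ordinary LS estimator and
    the auxiliary-information (ALS) estimator for model (3.1), using
    g_t = (1, X_{t-1}, X_{t-1}^2)'(X_t - a0 - a1*X_{t-1})."""
    rng = np.random.default_rng(seed)
    err_ls, err_als = [], []
    for _ in range(reps):
        # simulate model (3.1), discarding a burn-in of 200 observations
        x = np.zeros(n + 201)
        for t in range(1, len(x)):
            x[t] = rng.poisson(a0 + a1 * x[t - 1])
        x = x[-(n + 1):]
        X, Z = x[1:], np.column_stack([np.ones(n), x[:-1]])
        ls = np.linalg.solve(Z.T @ Z, Z.T @ X)          # ordinary least squares
        resid = X - Z @ ls                              # plug-in residuals
        G = np.column_stack([np.ones(n), x[:-1], x[:-1] ** 2]) * resid[:, None]
        lam = np.zeros(3)                               # Newton solve of (2.3)
        for _ in range(25):
            d = 1.0 + G @ lam
            grad = (G / d[:, None]).mean(axis=0)
            hess = -(G[:, :, None] * G[:, None, :] / (d ** 2)[:, None, None]).mean(axis=0)
            lam -= np.linalg.solve(hess, grad)
        w = 1.0 / (n * (1.0 + G @ lam))                 # weights (2.2)
        als = np.linalg.solve((Z * w[:, None]).T @ Z, (Z * w[:, None]).T @ X)
        err_ls.append((ls - [a0, a1]) ** 2)
        err_als.append((als - [a0, a1]) ** 2)
    return np.mean(err_ls, axis=0), np.mean(err_als, axis=0)
```

In runs of this sketch the ALS mean square errors are typically no larger than the LS ones, in line with the pattern in Tables 1 and 2, although with the plug-in simplification the gain can be smaller than that reported for the full procedure.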

From Tables 1 and 2, we see that the mean square errors obtained by the estimator (2.6) are less than those of the maximum likelihood estimator, the least squares estimator, and the weighted least squares estimator. This indicates that using the conditional moment restrictions yields more accurate estimates, regardless of the sample size and the parameter values.

## 4 Proofs of the main results

In order to prove Theorem 2.1, we first present several lemmas.

### Lemma 4.1

Assume that (A1) and (A2) hold. Then

$$\max_{1\leq t\leq n} \bigl\| g_{t}( \theta_{0})\bigr\| =o_{p} \bigl(n^{\frac{1}{2}} \bigr).$$
(4.1)

### Proof

Let $$\Sigma_{n}(\theta_{0})=\frac{1}{n}\sum_{t=1}^{n}(g_{t}(\theta_{0})g_{t}^{\tau}(\theta_{0}))$$ and $$g_{t}(\theta_{0})=( g_{t1}(\theta_{0}),\ldots, g_{tr}(\theta_{0}))^{\tau}$$. Note that in order to prove (4.1), it suffices to prove that

$$\frac{1}{n}\max_{1\leq t\leq n} g^{2}_{tk}( \theta_{0})\stackrel{p}{\longrightarrow}0,\quad k=1,\ldots,r.$$
(4.2)

We denote the $$(h,l)$$th element of $$\Sigma(\theta_{0})$$ as $$\sigma_{hl}$$, $$h,l=1,2,\ldots,r$$. By the ergodic theorem, we have

$$\Sigma_{n}(\theta_{0})\stackrel{p}{ \longrightarrow}\Sigma(\theta_{0}).$$
(4.3)

For all $$1\leq k\leq r$$ and $$1\leq m\leq n$$, define the sets

$$B^{k}_{n, m}=\bigcap_{j=1}^{m}B^{k,j}_{n, m},$$

where $$B^{k,j}_{n, m}=\{\omega:|\frac{1}{n}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}(\theta _{0})-\frac{j}{m}\sigma_{kk} |\leq\frac{1}{m}\}$$, $$j=1,\ldots,m$$.

First we show that

$$\lim_{n\rightarrow\infty}P \bigl(B^{k,j}_{n, m} \bigr)=1,\quad j=1, \ldots, m.$$
(4.4)

After some algebra, we obtain

$$B^{k,j}_{n,m}= \Biggl\{ \omega:\Biggl|\frac{[n\frac{j}{m}]}{n} \frac{m}{j}\frac {1}{[n\frac{j}{m}]}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}( \theta_{0})-\sigma_{kk} \Biggr|\leq\frac{1}{m} \frac{m}{j} \Biggr\} .$$

Observe that

$$\frac{[n\frac{j}{m}]}{n}\frac{m}{j}\rightarrow1,\quad n \rightarrow \infty.$$
(4.5)

Furthermore, by (4.3), we have

$$\frac{1}{[n\frac{j}{m}]}\sum_{t=1}^{[n(j/m)]}g^{2}_{tk}( \theta _{0})\stackrel{p}{\longrightarrow}\sigma_{kk}.$$
(4.6)

Using (4.5) and (4.6), we complete the proof of (4.4).

Next we prove that

$$\lim_{n\rightarrow\infty}P \bigl(B^{k}_{n, m} \bigr)=1,\quad m=1, 2,\ldots.$$
(4.7)

After some calculation, we obtain

\begin{aligned} P \bigl(B^{k}_{n, m} \bigr) =&P \Biggl( \bigcap _{j=1}^{m}B^{k,j}_{n, m} \Biggr) \\ =&P \Biggl(\bigcap_{j=1}^{m-1}B^{k,j}_{n, m} \Biggr)+P \bigl(B^{k,m}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-1}B^{k,j}_{n, m} \Biggr)\cup B^{k,m}_{n, m} \Biggr) \\ =&\cdots \\ =&P \bigl(B^{k,1}_{n, m} \bigr)+P \bigl(B^{k,2}_{n, m} \bigr) -P \bigl(B^{k,1}_{n, m}\cup B^{k,2}_{n, m} \bigr)+\cdots \\ &{}+P \bigl(B^{k,{m-1}}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-2}B^{k,j}_{n, m} \Biggr)\cup B^{k,{m-1}}_{n, m} \Biggr) \\ &{}+P \bigl(B^{k,{m}}_{n, m} \bigr)-P \Biggl( \Biggl(\bigcap _{j=1}^{m-1}B^{k,j}_{n, m} \Biggr) \cup B^{k,{m}}_{n, m} \Biggr). \end{aligned}
(4.8)

For all $$2\leq i\leq m$$, (4.4) implies that

$$\lim_{n\rightarrow\infty}P \Biggl( \Biggl(\bigcap _{j=1}^{i-1}B^{k,j}_{n, m} \Biggr)\cup B^{k,{i}}_{n, m} \Biggr)=1.$$

Thus, again by (4.4), (4.7) can be proved.

Finally we prove (4.2).

Note that

$$\frac{1}{n}\max_{1\leq t\leq n} g^{2}_{tk}( \theta_{0})\leq\frac{1}{n}\sup_{s\in[0,1]}\sum _{t=[ns]+1}^{[n(s+\frac{1}{m})]}g^{2}_{tk}( \theta_{0}).$$

For given $$s\in[0,1]$$ choose $$j\in\{1,\ldots,m\}$$ so that $$s\in[\frac{j-1}{m},\frac{j}{m}]$$. Then, for each $$s\in[0,1]$$, $$\omega\in B^{k}_{n, m}$$ implies

\begin{aligned} \frac{1}{n}\sum_{t=[ns]+1}^{[n(s+\frac{1}{m})]}g^{2}_{tk}( \theta _{0}) \leq&\frac{1}{n}\sum_{t=[n(j-1)/m]+1}^{[n(j+1)/m]}g^{2}_{tk}( \theta_{0}) \\ =& \Biggl(\frac{1}{n}\sum_{t=1}^{[n(j+1)/m]}g^{2}_{tk}( \theta _{0})-\frac{j+1}{m}\sigma_{kk} \Biggr) \\ &{}- \Biggl(\frac{1}{n}\sum_{t=1}^{[n(j-1)/m]}g^{2}_{tk}( \theta _{0})-\frac{j-1}{m}\sigma_{kk} \Biggr)+ \frac{2}{m}\sigma_{kk} \\ \leq&\frac{2}{m}+\frac{2}{m}\sigma_{kk} \\ =&(1+\sigma_{kk})\frac{2}{m}. \end{aligned}
(4.9)

Therefore, for all $$m\geq1$$,

$$\lim_{n\rightarrow\infty} P \biggl\{ \frac{1}{n}\max _{1\leq t\leq n} g^{2}_{tk}(\theta_{0})\leq \frac{2}{m}(1+\sigma_{kk}) \biggr\} \geq \lim_{n\rightarrow\infty}P \bigl(B^{k}_{n, m} \bigr)=1,$$

showing (4.2). The proof of Lemma 4.1 is thus completed. □

### Lemma 4.2

Assume that (A1) and (A2) hold. Then

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \bigl( Z^{\tau}_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr), g_{t}^{\tau}( \theta_{0}) \bigr)^{\tau}\stackrel{d}{\longrightarrow}N(0, M),$$

where

$$M=\left ( \textstyle\begin{array}{@{}c@{\quad}c@{}} \Lambda& \Lambda_{12}\\ \Lambda^{\tau}_{12} & \Sigma \end{array}\displaystyle \right ).$$
(4.10)

### Proof

By the Cramér-Wold device, it suffices to show that, for all $$c\in R^{(p+1+r)}\setminus\{0\}$$,

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n}c^{\tau}\bigl( Z^{\tau}_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr), g_{t}^{\tau}( \theta_{0}) \bigr)^{\tau}\stackrel {d}{\longrightarrow}N \bigl(0,c^{\tau}Mc \bigr).$$

For simplicity, we write $$c^{\tau}( Z^{\tau}_{t}(X_{t}-Z^{\tau}_{t}\alpha^{0}), g_{t}^{\tau}(\theta_{0}) )^{\tau}$$ as $$G_{t,c}(\theta_{0})$$. Further, let $$\xi_{nt}=\frac{1}{\sqrt {n}}G_{t,c}(\theta_{0})$$ and $$\mathscr{F}_{nt}=\sigma(\xi_{ns}, 1\leq s \leq t)$$. Then $$\{\sum_{t=1}^{n}\xi_{nt},\mathscr{F}_{nt}, 1\leq t\leq n, n\geq1\}$$ is a zero-mean, square integrable martingale array. By a martingale central limit theorem [18], it suffices to show that

\begin{aligned}& \max_{1\leq t\leq n}|\xi_{nt}|\stackrel{p}{ \longrightarrow}0, \end{aligned}
(4.11)
\begin{aligned}& \sum_{t=1}^{n} \xi^{2}_{nt} \stackrel{p}{\longrightarrow}c^{\tau}Mc, \end{aligned}
(4.12)
\begin{aligned}& E \Bigl(\max_{1\leq t\leq n}\xi^{2}_{nt} \Bigr) \mbox{ is bounded in } n , \end{aligned}
(4.13)

and the fields are nested:

$$\mathscr{F}_{nt}\subseteq\mathscr{F}_{(n+1)t} \quad\mbox{for } 1\leq t\leq n, n\geq1.$$
(4.14)

Note that (4.14) is obvious. In what follows, we first consider (4.11). By a simple calculation, we have, for all $$\varepsilon>0$$,

\begin{aligned} &P \Bigl\{ \max_{1\leq t\leq n}| \xi_{nt}|> \varepsilon \Bigr\} \\ &\quad\leq\sum_{t=1}^{n}P\bigl\{ | \xi_{nt}|>\varepsilon\bigr\} \\ &\quad= \sum_{t=1}^{n}P \biggl\{ \biggl| \frac{1}{\sqrt{n}}G_{t,c}(\theta _{0})\biggr|>\varepsilon \biggr\} \\ &\quad=n P \bigl\{ \bigl|G_{t,c}(\theta_{0})\bigr|>\sqrt{n} \varepsilon \bigr\} \\ &\quad=n\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt{n}\varepsilon \bigr)\, d P \\ &\quad\leq n\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt{n}\varepsilon \bigr)\frac{(G_{t,c}(\theta_{0}))^{2}}{ (\sqrt{n}\varepsilon)^{2}}\,d P \\ &\quad=\frac{1}{\varepsilon^{2}}\int_{\Omega}I \bigl(\bigl|G_{t,c}( \theta_{0})\bigr|>\sqrt {n}\varepsilon \bigr) \bigl(G_{t,c}( \theta_{0}) \bigr)^{2}\,d P. \end{aligned}
(4.15)

Now by the Lebesgue dominated convergence theorem, we immediately find that (4.15) converges to 0 as $$n\rightarrow\infty$$. This settles (4.11).

Next consider (4.12). By the ergodic theorem, we have

\begin{aligned} \sum_{t=1}^{n} \xi^{2}_{nt} =& \sum_{t=1}^{n} \biggl( \frac{1}{\sqrt{n}}G_{t,c}(\theta_{0}) \biggr)^{2} \\ \stackrel{\mathrm{a.s.}}{\longrightarrow}&E \bigl(G_{t,c}( \theta_{0}) \bigr)^{2} \\ =&c^{\tau}M c. \end{aligned}

Hence (4.12) is proved.

Finally, consider (4.13). Note that $$\{( \frac{1}{\sqrt{n}}G_{t,c}(\theta_{0}) )^{2},t\geq1\}$$ is a stationary sequence. Then we have

\begin{aligned} E \Bigl(\max_{1\leq t\leq n}\xi^{2}_{nt} \Bigr) =&E \biggl(\max_{1\leq t\leq n} \biggl( \frac{1}{\sqrt{n}}G_{t,c}( \theta_{0}) \biggr)^{2} \biggr) \\ \leq&\frac{1}{n}E \Biggl( \sum_{t=1}^{n} \bigl(G_{t,c}(\theta _{0}) \bigr)^{2} \Biggr) \\ =&\frac{1}{n}\sum_{t=1}^{n}E \bigl(G_{t,c}(\theta_{0}) \bigr)^{2} \\ =&c^{\tau}Mc. \end{aligned}
(4.16)

This proves (4.13). Thus, we complete the proof of Lemma 4.2. □

### Lemma 4.3

Assume that (A1) and (A2) hold. Then

$$\lambda_{\theta_{0}}=O_{p} \bigl(n^{-\frac{1}{2}} \bigr).$$

### Proof

Let $$\lambda_{\theta_{0}}=\zeta\beta_{0}$$, where $$\beta_{0}$$ is a unit vector with $$\|\beta_{0}\|=1$$ and $$\zeta=\|\lambda_{\theta_{0}}\|$$. Then (2.3) implies that

\begin{aligned} 0 =&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} \frac{g_{t}(\theta _{0})}{1+\zeta\beta^{\tau}_{0}g_{t}(\theta_{0})} \\ =&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n}g_{t}( \theta_{0})-\frac {\zeta}{n}\sum_{t=1}^{n} \frac{(\beta^{\tau}_{0}g_{t}(\theta _{0}))^{2}}{1+\zeta\beta^{\tau}_{0}g_{t}(\theta_{0})} \\ \leq&\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n}g_{t}( \theta _{0})-\frac{\zeta}{1+\zeta\max_{1\leq t \leq n}\|g_{t}(\theta_{0})\| }\beta^{\tau}_{0} \Sigma_{n}(\theta_{0})\beta_{0}. \end{aligned}

This implies that

\begin{aligned} \zeta \Biggl(\beta^{\tau}_{0}\Sigma_{n}( \theta_{0})\beta_{0}- \max_{1\leq t \leq n}\bigl\| g_{t}( \theta_{0})\bigr\| \frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0}) \Biggr) \leq\frac{\beta_{0}^{\tau}}{n}\sum _{t=1}^{n} g_{t}( \theta_{0}). \end{aligned}
(4.17)

Note that

\begin{aligned} \Biggl|\frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0})\Biggr|\leq\Biggl\| \frac{1}{n} \sum_{t=1}^{n} g_{t}( \theta_{0})\Biggr\| =O_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}
(4.18)

By using LemmaÂ 4.1 and (4.18), we can obtain

\begin{aligned} \max_{1\leq t \leq n}\bigl\| g_{t}( \theta_{0})\bigr\| \frac{\beta_{0}^{\tau}}{n}\sum_{t=1}^{n} g_{t}(\theta_{0})=o_{p}(1). \end{aligned}
(4.19)

By (4.3), we have

\begin{aligned} \beta_{0}^{\tau}\Sigma_{n}( \theta_{0})\beta_{0}\stackrel{p}{\longrightarrow}\beta _{0}^{\tau}\Sigma(\theta_{0})\beta_{0}. \end{aligned}
(4.20)

By this, together with (4.17)-(4.19), we can prove Lemma 4.3. □

### Proof of TheoremÂ 2.1

By (2.3), we have

\begin{aligned} \lambda_{\theta_{0}}= \bigl(\Sigma_{n}( \theta_{0}) \bigr)^{-1}\frac{1}{n}\sum _{t=1}^{n} g_{t}(\theta_{0})+ \bigl(\Sigma_{n}(\theta_{0}) \bigr)^{-1}R_{n}( \theta_{0}), \end{aligned}
(4.21)

where

$$R_{n}(\theta_{0})=\frac{1}{n}\sum _{t=1}^{n} g_{t}( \theta_{0})\frac {(\lambda^{\tau}_{\theta_{0}}g_{t}(\theta_{0}))^{2}}{1+\lambda^{\tau}_{\theta _{0}}g_{t}(\theta_{0})}.$$

By Lemmas 4.1-4.3, we know that

\begin{aligned} R_{n}(\theta_{0})=o_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}
(4.22)

This implies, uniformly in t,

\begin{aligned} \omega_{t}(\theta_{0}) =&\frac{1}{n} \frac{1}{1+\lambda_{\theta _{0}}^{\tau}g_{t}(\theta_{0})} \\ =&\frac{1}{n} \bigl( 1- \lambda_{\theta_{0}}^{\tau}g_{t}(\theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr). \end{aligned}
(4.23)

Moreover, note that

$$\sqrt{n} \bigl(\hat{\alpha}-\alpha^{0} \bigr)=T_{n}^{-1}S_{n},$$
(4.24)

where $$T_{n}=\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}Z^{\tau}_{t}$$ and $$S_{n}=\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}(X_{t}-Z^{\tau}_{t}\alpha^{0})$$.

First, we consider $$T_{n}$$. By (4.23), we have

\begin{aligned} T_{n} =&\sum_{t=1}^{n} \omega_{t}({\theta_{0}})Z_{t}Z^{\tau}_{t} \\ =&\frac{1}{n}\sum_{t=1}^{n} \bigl( 1- \lambda_{\theta_{0}}^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr)Z_{t}Z^{\tau}_{t} \\ =&\frac{1}{n}\sum_{t=1}^{n}Z_{t}Z^{\tau}_{t} -\frac{1}{n}\sum_{t=1}^{n} \bigl(Z_{t}Z^{\tau}_{t} \bigr)\otimes \bigl( \bigl( \bigl(\Sigma_{n}(\theta _{0}) \bigr)^{-1}R_{n}( \theta_{0}) \bigr)^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr) \\ &{}-\frac{1}{n}\sum_{t=1}^{n} \bigl(Z_{t}Z^{\tau}_{t} \bigr)\otimes \Biggl( \Biggl( \bigl(\Sigma _{n}(\theta_{0}) \bigr)^{-1} \frac{1}{n}\sum_{t=1}^{n} g_{t}(\theta_{0}) \Biggr)^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \Biggr) \\ \doteq&U_{1}-U_{2}-U_{3}. \end{aligned}

Then by (4.3) and (4.22), conditions (A1) and (A2), LemmaÂ 4.2, and the ergodic theorem, we can prove that

$$U_{2}=o_{p}(1).$$
(4.25)

Similarly, we can obtain

$$U_{3}\stackrel{\mathrm{a.s.}}{\longrightarrow}0.$$
(4.26)

This, together with (4.25), yields

$$T_{n}\stackrel{p}{\longrightarrow}W.$$
(4.27)

Next consider $$S_{n}$$. By (4.23), we have

\begin{aligned} S_{n} =&\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\theta _{0}})Z_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr) \\ =&\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \bigl( 1- \lambda_{\theta _{0}}^{\tau}g_{t}( \theta_{0}) \bigl(1+o_{p}(1) \bigr) \bigr)Z_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr). \end{aligned}

Thus, combining with (4.21), (4.22), we can obtain

\begin{aligned} S_{n} =&\frac{1}{\sqrt{n}}\sum _{t=1}^{n}Z_{t} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr) \\ &{}-\frac{1}{n}\sum_{t=1}^{n} \bigl(X_{t}-Z^{\tau}_{t}\alpha ^{0} \bigr)Z_{t}g_{t}^{\tau}(\theta_{0}) \bigl( \Sigma_{n}(\theta_{0}) \bigr)^{-1} \frac{1}{\sqrt{n}}\sum_{t=1}^{n} g_{t}(\theta_{0})+o_{p}(1). \end{aligned}

By the ergodic theorem, we have

$$\frac{1}{n}\sum_{t=1}^{n} \bigl(X_{t}-Z^{\tau}_{t}\alpha^{0} \bigr)Z_{t}g_{t}^{\tau}(\theta _{0}) \stackrel{\mathrm{a.s.}}{\longrightarrow} \Lambda_{12}.$$

This, together with (4.3) and LemmaÂ 4.2, proves that

\begin{aligned} S_{n}\stackrel{d}{\longrightarrow}N \bigl(0,\Lambda- \Lambda_{12}\Sigma ^{-1}(\theta_{0})\Lambda^{\tau}_{12} \bigr), \end{aligned}
(4.28)

which, combined with (4.27), proves Theorem 2.1. □

### Proof of TheoremÂ 2.2

Similar to the proofs of Lemma 1 and Theorem 1 in Qin and Lawless [16], we can prove that

\begin{aligned} \lambda_{\hat{\theta}}=B\frac{1}{n}\sum _{t=1}^{n}g_{t}(\theta _{0})+o_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}
(4.29)

Moreover, note that

$$\sqrt{n} \bigl(\hat{\alpha}^{1}-\alpha^{0} \bigr)=\tilde{T}_{n}^{-1}\tilde{S}_{n},$$
(4.30)

where

$$\tilde{T}_{n}=\sum_{t=1}^{n} \omega_{t}({\hat{\theta}})Z_{t}Z^{\tau}_{t}$$

and

$$\tilde{S}_{n}=\sqrt{n}\sum_{t=1}^{n} \omega_{t}({\hat{\theta }})Z_{t} \bigl(X_{t}-Z^{\tau}_{t} \alpha^{0} \bigr).$$

By an argument similar to the proof of TheoremÂ 2.1, we can prove that

$$\tilde{T}_{n}\stackrel{p}{\longrightarrow}W$$
(4.31)

and

$$\tilde{S}_{n}\stackrel{d}{\longrightarrow}N \bigl(0, \Lambda-\Lambda_{12}B\Lambda ^{\tau}_{12} \bigr),$$
(4.32)

showing (2.7). The proof of Theorem 2.2 is thus completed. □

## References

1. Alzaid, AA, Al-Osh, M: An integer-valued pth-order autoregressive structure (INAR(p)) process. J. Appl. Probab. 27, 314-324 (1990)

2. Davis, RA, Dunsmuir, WTM, Streett, SB: Observation-driven models for Poisson counts. Biometrika 90, 777-790 (2003)

3. Zheng, H, Basawa, IV, Datta, S: Inference for pth-order random coefficient integer-valued autoregressive processes. J. Time Ser. Anal. 27, 411-440 (2006)

4. Weiß, CH: Thinning operations for modeling time series of counts - a survey. AStA Adv. Stat. Anal. 92, 319-341 (2008)

5. Ferland, R, Latour, A, Oraichi, D: Integer-valued GARCH process. J. Time Ser. Anal. 27, 923-942 (2006)

6. Zhu, F, Wang, D: Estimation and testing for a Poisson autoregressive model. Metrika 73, 211-230 (2011)

7. Hansen, LP: Large sample properties of generalized method of moments estimators. Econometrica 50, 1029-1054 (1982)

8. Isaki, CT: Variance estimation using auxiliary information. J. Am. Stat. Assoc. 78, 117-123 (1983)

9. Kuk, AYC, Mak, TK: Median estimation in the presence of auxiliary information. J. R. Stat. Soc. B 51, 261-269 (1989)

10. Rao, JNK, Kovar, JG, Mantel, HJ: On estimating distribution functions and quantiles from survey data using auxiliary information. Biometrika 77, 365-375 (1990)

11. Owen, AB: Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237-249 (1988)

12. Owen, AB: Empirical likelihood ratio confidence regions. Ann. Stat. 18, 90-120 (1990)

13. Chen, J, Qin, J: Empirical likelihood estimation for finite populations and the effective usage of auxiliary information. Biometrika 80, 107-116 (1993)

14. Zhang, B: M-Estimation and quantile estimation in the presence of auxiliary information. J. Stat. Plan. Inference 44, 77-94 (1995)

15. Tang, CY, Leng, C: An empirical likelihood approach to quantile regression with auxiliary information. Stat. Probab. Lett. 82, 29-36 (2012)

16. Qin, J, Lawless, J: Empirical likelihood and general estimating equations. Ann. Stat. 22, 300-325 (1994)

17. Owen, AB: Empirical Likelihood. Chapman & Hall/CRC, London (2001)

18. Hall, P, Heyde, CC: Martingale Limit Theory and Its Application. Academic Press, New York (1980)

## Acknowledgements

This work is supported by National Natural Science Foundation of China (Nos. 11271155, 11001105, 11071126, 10926156, 11071269), Specialized Research Fund for the Doctoral Program of Higher Education (Nos. 20110061110003, 20090061120037), Scientific Research Fund of Jilin University (Nos. 201100011, 200903278), the Science and Technology Development Program of Jilin Province (201201082), Jilin Province Social Science Fund (2012B115), and Jilin Province Natural Science Foundation (20101596, 20130101066JC).

## Author information


### Corresponding author

Correspondence to Dehui Wang.

### Competing interests

The authors declare that they have no competing interests.

### Authors' contributions

The authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.


Peng, C., Wang, D. & Zhao, Z. Empirical likelihood-based inference in Poisson autoregressive model with conditional moment restrictions. J Inequal Appl 2015, 218 (2015). https://doi.org/10.1186/s13660-015-0725-1