# Statistical inference for the new INAR(2) models with random coefficient

## Abstract

In this paper, we investigate a random coefficient INAR(2) process, which may be used to model count data such as the number of traded stocks, the number of infected people, or the number of birds in some area. We show that this process is stationary and ergodic under some mild conditions. Adopting the two-step conditional least-squares estimation method, we give consistent estimators of the unknown parameters. Furthermore, the asymptotic distributions of the estimators are obtained, and a simulation study is conducted to evaluate the developed approach.

## Introduction

Integer-valued time series data have been studied extensively in the past three decades because of their many applications in different fields. The integer-valued autoregressive (INAR) models defined through the thinning operator are the most popular models for describing such count data and have been extensively investigated by McKenzie [1], Al-Osh and Alzaid [2], and Alzaid and Al-Osh [3], among others.

The classical INAR(1) model is defined as

\begin{aligned} X_{t}=\alpha \circ X_{t-1}+\epsilon _{t},\quad t\in \mathbf{Z}, \end{aligned}
(1.1)

where $$\alpha \in (0,1)$$ is a constant, $$\alpha \circ X_{t-1}=\sum_{i=1}^{X_{t-1}}B_{i}$$, $$\{B_{i}\}$$ is an i.i.d. Bernoulli random sequence with $$P(B_{i}=1)=1-P(B_{i}=0)=\alpha$$ that is independent of $$\{X_{t-1}\}$$, and $$\{\epsilon _{t}\}$$ is a sequence of i.i.d. nonnegative integer-valued random variables with mean λ and variance $$\sigma ^{2}_{\epsilon }$$ that is independent of $$\{X_{t-1}\}$$. Zheng et al. [4] extend the INAR(1) model to the random coefficient INAR(1) model, i.e., they suppose that α is a random variable with cumulative distribution function (CDF) $$P_{\phi }$$ on $$(0, 1)$$. Since then, many authors have considered INAR models with random coefficients. For example, Zheng et al. [5] consider statistical inference for the INAR(p) model with random coefficients; Zhang et al. [6, 7] investigate the INAR(1) and INAR(p) models using the empirical likelihood method; Zhang and Wang [8] obtain some inference for the random coefficient INAR(1) process based on frequency domain analysis; Nedényi and Pap [9] establish the iterated scaling limits for the aggregation of random coefficient INAR(1) processes; Ding and Wang [10] suppose that the random coefficient incorporates explanatory variables; and Nastić and Ristić [11] introduce some geometric mixed INAR models.

In this paper we will study the new INAR(2) model with random coefficient (NINAR(2)) which is defined as follows:

\begin{aligned} X_{t}= \textstyle\begin{cases} \alpha _{1}\circ X_{t-1}+\epsilon _{t} & \text{with probability } p_{1}; \\ \alpha _{2}\circ X_{t-2}+\epsilon _{t} & \text{with probability } p_{2}; \\ \epsilon _{t} & \text{with probability } 1-p_{1}-p_{2}, \end{cases}\displaystyle \end{aligned}
(1.2)

where $$\alpha _{1},\alpha _{2}\in (0,1)$$ are constants and $$\{\epsilon _{t} \}$$ is a sequence of i.i.d. nonnegative integer-valued random variables with mean λ and variance $$\sigma ^{2}_{\epsilon }$$ that is independent of $$\{X_{t-1}\}$$. It is easy to check that $$\{X_{t}\}$$ can be rewritten as

\begin{aligned} X_{t}=\theta _{1}\circ X_{t-1}+\theta _{2}\circ X_{t-2}+\epsilon _{t}, \end{aligned}

where the random vectors $$(\theta _{1},\theta _{2})$$ are i.i.d. across t, with joint distribution given by

\begin{aligned} \begin{aligned} &P(\theta _{1}=\alpha _{1},\theta _{2}=\alpha _{2})=0,\qquad P(\theta _{1}=\alpha _{1},\theta _{2}=0)=p_{1}, \\ &P(\theta _{1}=0,\theta _{2}=\alpha _{2})=p_{2},\qquad P(\theta _{1}=0,\theta _{2}=0)=1-p_{1}-p_{2}. \end{aligned} \end{aligned}
(1.3)

Here $$p_{1} + p_{2} < 1$$. Since the thinning coefficients $$(\theta _{1},\theta _{2})$$ are random, we call this model the INAR(2) model with random coefficient.
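To make definition (1.2) concrete, the following Python sketch simulates a path of the process. It is only an illustration under stated assumptions: the function name `simulate_ninar2`, the Poisson choice for $$\epsilon _{t}$$, the zero start-up values, and the burn-in length are choices made here, whereas the text only requires i.i.d. nonnegative integer-valued innovations with mean λ.

```python
import numpy as np

def simulate_ninar2(n, alpha1, alpha2, p1, p2, lam, burn_in=500, seed=None):
    """Simulate n observations from model (1.2).

    Assumption (not from the paper): innovations are Poisson(lam).
    """
    if p1 < 0 or p2 < 0 or p1 + p2 >= 1:
        raise ValueError("need p1, p2 >= 0 and p1 + p2 < 1")
    rng = np.random.default_rng(seed)
    total = n + burn_in + 2                      # two zero start-up values
    x = np.zeros(total, dtype=np.int64)
    eps = rng.poisson(lam, size=total)
    # regime 0: thin X_{t-1}; regime 1: thin X_{t-2}; regime 2: innovation only
    regime = rng.choice(3, size=total, p=[p1, p2, 1.0 - p1 - p2])
    for t in range(2, total):
        if regime[t] == 0:
            x[t] = rng.binomial(x[t - 1], alpha1) + eps[t]   # alpha1 o X_{t-1}
        elif regime[t] == 1:
            x[t] = rng.binomial(x[t - 2], alpha2) + eps[t]   # alpha2 o X_{t-2}
        else:
            x[t] = eps[t]
    return x[-n:]                                # drop start-up and burn-in

x = simulate_ninar2(20000, alpha1=0.6, alpha2=0.8, p1=0.4, p2=0.5, lam=1.0, seed=0)
print(x.mean())
```

For a stationary path, taking expectations in (1.2) gives $$EX_{t}=\lambda /(1-p_{1}\alpha _{1}-p_{2}\alpha _{2})$$, so the sample mean printed above should be roughly $$1/0.36\approx 2.78$$ for these parameter values.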

Lawrance and Lewis [12] investigate NEAR(2) models, which are nonlinear autoregressive time series in exponential variables, and Dewald and Lewis [13] study the new Laplace second-order autoregressive time series model, i.e., the NLAR(2) model. Later, Karlsen and Tjøstheim [14] give consistent estimates of the unknown parameters of the NEAR(2) and NLAR(2) models. Inspired by these studies, we consider the NINAR(2) model defined by (1.2). There are many results on inference for the INAR(p) model with random coefficients; see [5] and [7] for more details. In general, one assumes that all the random coefficients are independent random variables. In this paper, we allow dependence between the random coefficients, so the model can be applied when such dependence is present. Furthermore, the advantage of the proposed model over the general random coefficient INAR(2) model is that the unknown parameters can be estimated directly, so it can be applied easily.

The outline of this paper is as follows. In Sect. 2, we investigate the stationary and ergodic properties, present the estimations of parameters and give their asymptotic properties. In Sect. 3, we present some simulation results. In Sect. 4, we make conclusions. All proofs are postponed to Sect. 5.

## Main results

Stationarity and ergodicity are important for time series; therefore we first consider the existence of a unique stationary and ergodic solution of the random coefficient INAR(2) model given in (1.2).

### Theorem 2.1

Suppose that $$\alpha _{1}, \alpha _{2} \in (0,1)$$ and $$p_{1}, p_{2} \in (0,1)$$ with $$p_{1}+p_{2}<1$$. Then there exists a strictly stationary and ergodic integer-valued process satisfying model (1.2).

Using the stationarity and ergodicity of the process, we can estimate the unknown parameters. We use two-step conditional least-squares (CLS) estimation. Let $$\mathcal{F}_{t}$$ be the σ-field generated by $$\{X_{s}, s \leq t\}$$. Note that $$E[X_{t}|\mathcal{F}_{t-1}]=p_{1}\alpha _{1}X_{t-1}+p _{2}\alpha _{2}X_{t-2}+\lambda =\beta _{1}X_{t-1}+\beta _{2}X_{t-2}+ \lambda$$ with $$\beta _{1}=p_{1}\alpha _{1}, \beta _{2}=p_{2}\alpha _{2}$$. Let

\begin{aligned} S(\eta )=\sum_{t=3}^{n} (X_{t}-\beta _{1}X_{t-1}-\beta _{2}X_{t-2}- \lambda )^{2}, \end{aligned}
(2.1)

denote the CLS criterion function, where $$\eta =(\beta _{1},\beta _{2}, \lambda )^{T}$$. Then the CLS estimator of η is given by

$${\hat{\eta }}_{\mathrm{CLS}}=\arg \min_{\eta } S(\eta ).$$

Let $$E_{n}(\eta )=(E_{n1}(\eta ),E_{n2}(\eta ),E_{n3}(\eta ))^{T}=-\frac{1}{2}\,\partial S(\eta )/\partial \eta$$ denote the rescaled gradient of the criterion function. By solving the equations $$E_{n}(\eta )=0$$, i.e.,

\begin{aligned} \begin{aligned} &E_{n1}(\eta )=-\frac{1}{2}\frac{\partial S(\eta )}{\partial \beta _{1}}=\sum _{t=3} ^{n} (X_{t}-\beta _{1}X_{t-1}-\beta _{2}X_{t-2}-\lambda ) X _{t-1}=0, \\ &E_{n2}(\eta )=-\frac{1}{2}\frac{\partial S(\eta )}{\partial \beta _{2}}=\sum_{t=3} ^{n} (X_{t}-\beta _{1}X_{t-1}-\beta _{2}X_{t-2}-\lambda )X_{t-2}=0, \\ &E_{n3}(\eta )=-\frac{1}{2}\frac{\partial S(\eta )}{\partial \lambda }=\sum_{t=3} ^{n} (X_{t}-\beta _{1}X_{t-1}-\beta _{2}X_{t-2}-\lambda )=0, \end{aligned} \end{aligned}
(2.2)

we obtain the estimator of η, which is as follows:

\begin{aligned} \hat{\eta }=M^{-1}b, \end{aligned}

where $$b=(\sum_{t=3}^{n}X_{t}X_{t-1}, \sum_{t=3}^{n}X_{t}X_{t-2}, \sum_{t=3}^{n}X_{t})^{T}$$ and

$M=\left(\begin{array}{ccc}{\sum }_{t=3}^{n}{X}_{t-1}^{2}& {\sum }_{t=3}^{n}{X}_{t-1}{X}_{t-2}& {\sum }_{t=3}^{n}{X}_{t-1}\\ {\sum }_{t=3}^{n}{X}_{t-1}{X}_{t-2}& {\sum }_{t=3}^{n}{X}_{t-2}^{2}& {\sum }_{t=3}^{n}{X}_{t-2}\\ {\sum }_{t=3}^{n}{X}_{t-1}& {\sum }_{t=3}^{n}{X}_{t-2}& n-2\end{array}\right).$
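The first-step estimator is an ordinary least-squares fit of $$X_{t}$$ on $$(X_{t-1},X_{t-2},1)$$. The sketch below builds M and b exactly as displayed above and solves the linear system; the function name `cls_first_step` and the short example series are hypothetical, introduced only for illustration.

```python
import numpy as np

def cls_first_step(x):
    """Return eta_hat = (beta1_hat, beta2_hat, lambda_hat) = M^{-1} b."""
    x = np.asarray(x, dtype=float)
    y = x[2:]                                                  # X_t, t = 3..n
    # Rows are (X_{t-1}, X_{t-2}, 1) for t = 3..n
    D = np.column_stack([x[1:-1], x[:-2], np.ones(len(y))])
    M = D.T @ D                                                # 3 x 3 matrix M
    b = D.T @ y                                                # vector b
    return np.linalg.solve(M, b)

# Hypothetical count series used only to exercise the function
x = np.array([2, 3, 1, 4, 2, 0, 3, 5, 2, 1, 3, 2, 4, 1, 0, 2])
eta_hat = cls_first_step(x)
print(eta_hat)
```

Nothing in this step constrains the estimates to the parameter space, so for short series the fitted $$\hat{\beta }_{1}, \hat{\beta }_{2}$$ may fall outside $$(0,1)$$.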

In order to estimate the parameters $$\alpha _{1},\alpha _{2},p_{1},p _{2}$$, we consider the conditional least-square estimation of the process $$V_{t}=(X_{t}-E(X_{t}|\mathcal{F}_{t-1}))^{2}$$. It is easy to verify that

\begin{aligned} &E(V_{t}|\mathcal{F}_{t-1}) \\ &\quad=E\bigl(X_{t}^{2}|\mathcal{F}_{t-1}\bigr)- \bigl(E(X_{t}|\mathcal{F}_{t-1})\bigr)^{2} \\ &\quad =\bigl(\alpha _{1}\beta _{1}-\beta _{1}^{2} \bigr)X_{t-1}^{2}+\bigl(\alpha _{2}\beta _{2}-\beta _{2}^{2}\bigr)X_{t-2}^{2}+( \beta _{1}-\alpha _{1}\beta _{1})X_{t-1} +(\beta _{2}-\alpha _{2}\beta _{2})X_{t-2} \\ &\qquad{} -2\beta _{1}\beta _{2}X_{t-1}X_{t-2}+ \sigma _{\epsilon }^{2}. \end{aligned}
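The verification can be sketched by conditioning on which branch of (1.2) is active and using the fact that, given $$X_{t-i}$$, the thinning $$\alpha _{i}\circ X_{t-i}$$ is binomial with mean $$\alpha _{i}X_{t-i}$$ and variance $$\alpha _{i}(1-\alpha _{i})X_{t-i}$$, independent of $$\epsilon _{t}$$:

$$\begin{aligned} E\bigl(X_{t}^{2}|\mathcal{F}_{t-1}\bigr) &= p_{1}\bigl[\alpha _{1}(1-\alpha _{1})X_{t-1}+\alpha _{1}^{2}X_{t-1}^{2}+2\lambda \alpha _{1}X_{t-1}\bigr] \\ &\quad + p_{2}\bigl[\alpha _{2}(1-\alpha _{2})X_{t-2}+\alpha _{2}^{2}X_{t-2}^{2}+2\lambda \alpha _{2}X_{t-2}\bigr]+\sigma _{\epsilon }^{2}+\lambda ^{2}, \\ \bigl(E(X_{t}|\mathcal{F}_{t-1})\bigr)^{2} &= \beta _{1}^{2}X_{t-1}^{2}+\beta _{2}^{2}X_{t-2}^{2}+2\beta _{1}\beta _{2}X_{t-1}X_{t-2}+2\lambda \beta _{1}X_{t-1}+2\lambda \beta _{2}X_{t-2}+\lambda ^{2}. \end{aligned}$$

Subtracting the second line from the first and using $$p_{i}\alpha _{i}=\beta _{i}$$, the terms in λ cancel and the stated expression follows.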

Then the CLS (conditional least-square) criterion function for $$\theta =(\alpha _{1}\beta _{1}-\beta _{1}^{2},\alpha _{2}\beta _{2}-\beta _{2}^{2}, \beta _{1}-\alpha _{1}\beta _{1},\beta _{2}-\alpha _{2}\beta _{2},2 \beta _{1}\beta _{2},\sigma _{\epsilon }^{2})^{T}$$ is given by

\begin{aligned} S(\theta ) =\sum_{t=3}^{n} \bigl(V_{t}-E(V_{t}|\mathcal{F}_{t-1}) \bigr)^{2}. \end{aligned}

The CLS estimator of θ is given by

$${\hat{\theta }}_{\mathrm{CLS}}=\arg \min_{\theta } S(\theta ).$$

Let $$Z_{t}=(X_{t-1}^{2},X_{t-2}^{2},X_{t-1},X_{t-2},-X_{t-1}X_{t-2},1)^{T}$$, thus by solving the equations $$\partial S(\theta )/\partial \theta =0$$, we obtain the estimator of θ, which is as follows:

\begin{aligned} \hat{\theta }(\eta )= \Biggl(\frac{1}{n-2}\sum_{t=3}^{n}Z_{t}Z_{t}^{T} \Biggr)^{-1} \Biggl(\frac{1}{n-2}\sum_{t=3}^{n}V_{t}Z_{t} \Biggr). \end{aligned}

Let $$\bar{\theta }(\hat{\eta })$$ be the estimator $$\hat{\theta }( \eta )$$ with η replaced by η̂. Define $$\hat{\eta } _{1}, \hat{\eta }_{2}$$ as components of η̂ and $$\bar{ \theta }_{1}(\hat{\eta }), \bar{\theta }_{2}(\hat{\eta })$$ as components of $$\bar{\theta }(\hat{\eta })$$. We obtain the estimators of $$\alpha _{1},\alpha _{2},p_{1},p_{2}$$ as follows:

\begin{aligned} \hat{\alpha }_{1}=\frac{\bar{\theta }_{1}(\hat{\eta })+\hat{\eta } _{1}^{2}}{\hat{\eta }_{1}},\qquad \hat{\alpha }_{2}= \frac{\bar{\theta } _{2}(\hat{\eta })+\hat{\eta }_{2}^{2}}{\hat{\eta }_{2}},\qquad \hat{p}_{1}=\frac{ \hat{\eta }^{2}_{1}}{\bar{\theta }_{1}(\hat{\eta })+\hat{\eta }_{1} ^{2}},\qquad \hat{p}_{2}=\frac{\hat{\eta }^{2}_{2}}{\bar{\theta }_{2}( \hat{\eta })+\hat{\eta }_{2}^{2}}. \end{aligned}
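The two steps can be chained as in the following sketch. It uses `numpy.linalg.lstsq` for both regressions, which is numerically equivalent to solving the normal equations above; the function name `two_step_cls` and the placeholder data are assumptions made here, not part of the paper.

```python
import numpy as np

def two_step_cls(x):
    """Two-step CLS sketch: eta_hat, theta_bar(eta_hat), then alpha_hat and p_hat."""
    x = np.asarray(x, dtype=float)
    y, x1, x2 = x[2:], x[1:-1], x[:-2]           # X_t, X_{t-1}, X_{t-2}
    one = np.ones_like(y)
    # Step 1: regress X_t on D_t = (X_{t-1}, X_{t-2}, 1)
    D = np.column_stack([x1, x2, one])
    eta = np.linalg.lstsq(D, y, rcond=None)[0]
    # Step 2: regress V_t(eta_hat) on Z_t
    V = (y - D @ eta) ** 2
    Z = np.column_stack([x1**2, x2**2, x1, x2, -x1 * x2, one])
    theta = np.linalg.lstsq(Z, V, rcond=None)[0]
    # Plug-in estimators of (alpha_1, alpha_2) and (p_1, p_2)
    denom = theta[:2] + eta[:2] ** 2
    alpha = denom / eta[:2]
    p = eta[:2] ** 2 / denom
    return {"eta": eta, "theta": theta, "alpha": alpha, "p": p}

# Placeholder counts (i.i.d. Poisson, NOT a NINAR(2) path); they only exercise the code
rng = np.random.default_rng(1)
x = rng.poisson(3.0, size=400)
fit = two_step_cls(x)
print(fit["alpha"], fit["p"])
```

By construction $$\hat{\alpha }_{i}\hat{p}_{i}=\hat{\eta }_{i}$$, which gives a quick sanity check on any implementation. The plug-in formulas divide by $$\hat{\eta }_{i}$$, so they are unstable when $$\hat{\beta }_{i}$$ is close to zero.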

Regarding the consistency and asymptotic properties of the estimators, we have the following theorems.

### Theorem 2.2

Assume that the process $$\{X_{t}\}$$ is a stationary ergodic process and $$E|X_{t}|^{8}<\infty$$. Then $$(\sqrt{n}(\bar{\theta }( \hat{\eta })-\theta ),\sqrt{n}(\hat{\eta }-\eta ))$$ converges in distribution to a normal distribution with mean zero and covariance matrix

$$\varOmega =(\omega _{ij})_{9\times 9}= \begin{pmatrix} \varGamma ^{-1}W\varGamma ^{-1} & \varGamma ^{-1}\varPi V^{-1} \\ V^{-1}\varPi ^{T}\varGamma ^{-1} & V^{-1}\varSigma V^{-1} \end{pmatrix},$$

where $$V=\lim_{n\rightarrow \infty }(1/n)M, \varSigma =E(X_{t}-\eta ^{T}D _{t})^{2}D_{t}D_{t}^{T}, \varGamma =EZ_{t}Z_{t}^{T}$$, $$W=E((V_{t}-Z_{t} ^{T}\theta )^{2}Z_{t}Z_{t}^{T}), \varPi =E((V_{t}-Z_{t}^{T}\theta )(X_{t}- \eta ^{T}D_{t})Z_{t}D_{t}^{T})$$, $$D_{t}=(X_{t-1},X_{t-2},1)^{T}$$.

### Theorem 2.3

Assume that the process $$\{X_{t}\}$$ is a stationary ergodic process and $$E|X_{t}|^{8}<\infty$$. Then the estimators $$\hat{\alpha }_{1}, \hat{\alpha }_{2},\hat{p}_{1},\hat{p}_{2}$$ are consistent, and $$\sqrt{n}(\hat{\alpha }_{i}-\alpha _{i})$$ and $$\sqrt{n}(\hat{p}_{i}-p_{i})$$, $$i=1,2$$, are asymptotically normal with mean zero and variances given by (5.2) and (5.3), respectively.

If $$\alpha _{2}=0$$, model (1.2) reduces to a specific first-order INAR model with random coefficient, for which Zhao and Hu [15] give estimators of the unknown parameters by the least-squares method. It is therefore of interest to test the following hypotheses:

$$H_{0}: \alpha _{2}=0\quad \text{vs.} \quad H_{1}: \alpha _{2}>0.$$

Based on the asymptotic normality of $$\hat{\alpha }_{2}$$ given by Theorem 2.3, we know that $$\sqrt{n}(\hat{\alpha }_{2}-\alpha _{2})/\sqrt{\operatorname{Var}(\hat{\alpha }_{2})}$$ converges weakly to $$N(0,1)$$, where $$\operatorname{Var}(\hat{\alpha }_{2})$$ denotes the asymptotic variance given by (5.2). Therefore, we need to find an estimator of $$\operatorname{Var}(\hat{\alpha }_{2})$$. By Theorem 2.3, we know that

$$\operatorname{Var}(\hat{\alpha }_{2})=\frac{\beta _{2}^{2}\omega _{22}+(\beta _{2}^{2}- \theta _{2})^{2}\omega _{88}+2\beta _{2}(\beta _{2}^{2}-\theta _{2})\omega _{28}}{\beta _{2}^{4}}.$$

In order to test this hypothesis, we need to estimate the unknown parameters in the above variance. By Theorem 2.2, $$\beta _{2}$$ and $$\theta _{2}$$ can be estimated by $$\hat{\beta }_{2}=\hat{\eta }_{2}$$ and $$\hat{\theta }_{2}=\bar{\theta }_{2}(\hat{\eta })$$, respectively. By the stationarity and ergodicity of the process, we can use the estimators

\begin{aligned} \hat{V}=\frac{1}{n-2}M,\qquad \hat{\varGamma }=\frac{1}{n-2}\sum _{t=3}^{n} Z_{t}Z_{t}^{T} \end{aligned}
(2.3)

to estimate $$V, \varGamma$$. We use the following estimators to estimate $$\varSigma , W, \varPi$$, respectively:

\begin{aligned} &\hat{\varSigma } =\frac{1}{n-2}\sum_{t=3}^{n} \bigl(X_{t}-\hat{\eta }^{T}D _{t} \bigr)^{2}D_{t}D_{t}^{T},\\ & \hat{W}= \frac{1}{n-2}\sum_{t=3}^{n} \bigl(V_{t}(\hat{\eta })-Z_{t}^{T}\bar{ \theta }(\hat{\eta })\bigr)^{2}Z_{t}Z_{t}^{T}, \\ &\hat{\varPi } =\frac{1}{n-2}\sum_{t=3}^{n} \bigl(X_{t}-\hat{\eta }^{T}D_{t}\bigr) \bigl(V _{t}(\hat{\eta })-Z_{t}^{T}\bar{\theta }(\hat{\eta })\bigr)Z_{t}D_{t} ^{T}. \end{aligned}
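As a rough illustration, the plug-in estimate of the 9 × 9 covariance matrix of Theorem 2.2 can be assembled from these pieces as follows. The function name `sandwich_covariance` and the placeholder data are assumptions made here; the lower-left block is taken as the transpose of the upper-right block so that the result is symmetric.

```python
import numpy as np

def sandwich_covariance(x):
    """Plug-in estimate of the 9 x 9 asymptotic covariance of (theta_bar, eta_hat)."""
    x = np.asarray(x, dtype=float)
    y, x1, x2 = x[2:], x[1:-1], x[:-2]
    one = np.ones_like(y)
    m = len(y)                                        # m = n - 2 terms in each sum
    D = np.column_stack([x1, x2, one])                # rows D_t^T
    Z = np.column_stack([x1**2, x2**2, x1, x2, -x1 * x2, one])   # rows Z_t^T
    eta = np.linalg.lstsq(D, y, rcond=None)[0]
    u = y - D @ eta                                   # X_t - eta_hat^T D_t
    theta = np.linalg.lstsq(Z, u**2, rcond=None)[0]
    e = u**2 - Z @ theta                              # V_t(eta_hat) - Z_t^T theta_bar
    V_hat = D.T @ D / m
    Gamma_hat = Z.T @ Z / m
    Sigma_hat = (D * (u**2)[:, None]).T @ D / m
    W_hat = (Z * (e**2)[:, None]).T @ Z / m
    Pi_hat = (Z * (e * u)[:, None]).T @ D / m         # 6 x 3
    Vi, Gi = np.linalg.inv(V_hat), np.linalg.inv(Gamma_hat)
    top = np.hstack([Gi @ W_hat @ Gi, Gi @ Pi_hat @ Vi])
    bottom = np.hstack([Vi @ Pi_hat.T @ Gi, Vi @ Sigma_hat @ Vi])
    return np.vstack([top, bottom])

# Placeholder counts (i.i.d. Poisson, NOT a NINAR(2) path); they only exercise the code
rng = np.random.default_rng(2)
x = rng.poisson(3.0, size=500)
omega = sandwich_covariance(x)
print(omega.shape)
```

The entries $$\hat{\omega }_{22}, \hat{\omega }_{88}, \hat{\omega }_{28}$$ used below correspond to `omega[1, 1]`, `omega[7, 7]` and `omega[1, 7]` in zero-based indexing.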

### Corollary 2.4

Under the condition of Theorem 2.2, we conclude that

$$\hat{\varSigma }\stackrel{P}{\longrightarrow }\varSigma ,\qquad \hat{W} \stackrel{P}{ \longrightarrow }W, \qquad \hat{\varPi }\stackrel{P}{\longrightarrow }\varPi.$$

Thus we use the following statistic to test $$H_{0}$$:

\begin{aligned} \frac{\sqrt{n}(\hat{\alpha }_{2}-\alpha _{2})}{\sqrt{\varUpsilon }}, \quad\text{where } \varUpsilon =\frac{\hat{\beta }_{2}^{2}\hat{\omega }_{22}+( \hat{\beta }_{2}^{2}-\hat{\theta }_{2})^{2}\hat{\omega }_{88} + 2\hat{\beta }_{2}(\hat{\beta }_{2}^{2}-\hat{\theta }_{2})\hat{\omega } _{28}}{\hat{\beta }_{2}^{4}}. \end{aligned}

Next we consider one-step conditional expectation prediction for this process. Since $$E[X_{t}|\mathcal{F}_{t-1}]=\beta _{1}X_{t-1}+\beta _{2}X_{t-2}+\lambda$$, we can use

$$\hat{X}_{t-1}=\hat{\beta }_{1}X_{t-1}+\hat{\beta }_{2}X_{t-2}+ \hat{\lambda }$$

as the prediction value of $$X_{t}$$. From the asymptotic normality given in Theorem 2.2, we know $$\sqrt{n}(\hat{\eta }-\eta )\stackrel{d}{ \longrightarrow }N(0,V^{-1}\varSigma V^{-1})$$. Thus we have

$$\sqrt{n}\bigl(\hat{X}_{t-1}-E[X_{t}|\mathcal{F}_{t-1}] \bigr)|X_{t-1},X_{t-2}\stackrel{d}{ \longrightarrow }N \bigl(0,A_{t}^{T} V^{-1}\varSigma V^{-1} A_{t}\bigr),$$

where $$A_{t}^{T}=(X_{t-1},X_{t-2},1)$$. Then, by (2.3), we can obtain the confidence interval for the prediction value of $$X_{t}$$,

$$\biggl[\hat{X}_{t-1}-u_{\frac{\nu }{2}}\sqrt{\frac{A_{t}^{T} \hat{V}^{-1}\hat{\varSigma } \hat{V}^{-1} A_{t}}{n}},\ \hat{X}_{t-1}+u_{\frac{\nu }{2}}\sqrt{\frac{A _{t}^{T} \hat{V}^{-1}\hat{\varSigma } \hat{V}^{-1} A_{t}}{n}} \biggr],$$

where $$u_{\nu /2}$$ is the upper $$\nu /2$$-quantile of the standard normal distribution.
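A minimal sketch of this interval for the next, unobserved time point is given below. The function name `predict_with_interval`, the use of the last two observations as $$A_{t}$$, and the example series are assumptions made here. Note that the interval quantifies the estimation uncertainty of the conditional mean $$E[X_{t}|\mathcal{F}_{t-1}]$$, not the variability of $$X_{t}$$ itself.

```python
import numpy as np
from statistics import NormalDist

def predict_with_interval(x, nu=0.05):
    """One-step conditional-mean prediction with a (1 - nu) normal interval."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    y = x[2:]
    D = np.column_stack([x[1:-1], x[:-2], np.ones(len(y))])
    m = len(y)                                       # n - 2
    eta = np.linalg.solve(D.T @ D, D.T @ y)          # first-step CLS estimate
    u = y - D @ eta
    V_hat = D.T @ D / m
    Sigma_hat = (D * (u**2)[:, None]).T @ D / m
    Vi = np.linalg.inv(V_hat)
    a = np.array([x[-1], x[-2], 1.0])                # A_t for the next time point
    point = float(a @ eta)
    q = NormalDist().inv_cdf(1.0 - nu / 2.0)         # upper nu/2 quantile
    half = q * float(np.sqrt(a @ Vi @ Sigma_hat @ Vi @ a / n))
    return point, point - half, point + half

# Hypothetical count series used only to exercise the function
x = np.array([2, 3, 1, 4, 2, 0, 3, 5, 2, 1, 3, 2, 4, 1, 0, 2, 3, 1, 2, 4])
point, lo, hi = predict_with_interval(x, nu=0.05)
print(point, lo, hi)
```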

## Simulation studies

In this section we present a simulation study.

### Empirical results for unknown parameters

We consider the following two models:

1. Model I:

$$\alpha _{1}=0.6$$, $$\alpha _{2}=0.8$$, $$p_{1}=0.4$$, $$p_{2}=0.5$$ with $$\lambda =1$$ and $$\lambda =2$$.

2. Model II:

$$\alpha _{1}=0.7$$, $$\alpha _{2}=0.5$$, $$p_{1}=0.3$$, $$p_{2}=0.6$$ with $$\lambda =1$$ and $$\lambda =2$$.

For both models, we compute the empirical bias (Bias) and the standard error (SE) of each estimator based on 500 replications for each parameter combination. The results are given in Tables 1, 2, 3 and 4, where the format (Bias, SE) is used; for example, (−0.0213, 0.0389) means that the bias is −0.0213 and the SE is 0.0389.

From the simulation results, we can see that the bias and the standard errors decrease as the sample size increases. For smaller sample sizes, the standard errors are somewhat larger; this may be because the true parameter values are small and can be masked by their own standard errors.

In order to visualize some of the distributional properties, we present box plots of the estimated parameters in Figs. 1 and 2.

### Test for parameters

In this subsection, we consider testing the following hypotheses for Models I and II given in Sect. 3.1:

$$H_{0}: \alpha _{2}=0\quad \text{vs.}\quad H_{1}: \alpha _{2}>0.$$

We report the empirical sizes for Model III and IV at a significance level 0.05 with sample size $$n = 50, 100, 200,500$$, respectively:

1. Model III:

$$\alpha _{1}=0.6$$, $$\alpha _{2}=0$$, $$p_{1}=0.2$$, $$p_{2}=0.7$$ with $$\lambda =1$$ and $$\lambda =2$$.

2. Model IV:

$$\alpha _{1}=0.3$$, $$\alpha _{2}=0$$, $$p_{1}=0.2$$, $$p_{2}=0.7$$ with $$\lambda =1$$ and $$\lambda =2$$.

The results are presented in Table 5, from which we can see that the empirical sizes are close to 0.05 as n increases.

In order to investigate the power of the test, we consider the alternative hypothesis with parameter $$\alpha _{2}=0.2,0.6,0.8$$ for Models III and IV, respectively. We report the empirical power at significance level 0.05 with sample sizes $$n=50,100,200,500$$. The simulation results are given in Tables 6–9. From these tables, we can see that the power increases monotonically as the parameter $$\alpha _{2}$$ increases.

## Conclusion

In this paper we study the new INAR(2) model with random coefficient. This model is more practical than the classical INAR(2) model with random coefficient because its parameters can be estimated by the usual estimation method, whereas for the classical INAR(2) model with random coefficient only the mean of the random coefficient can be estimated. We use the two-step conditional least-squares method to estimate the unknown parameters. We also establish the stationarity and ergodicity of the model, which guarantee that the estimation method can be applied. The asymptotic properties of the estimators are investigated, and the performance of the estimation method is illustrated by a simulation study.

## The proofs of main results

### Proof of Theorem 2.1

Define a random sequence $$\{X_{t}^{(n)}\}_{n\in \mathbf{Z}}$$ as follows:

\begin{aligned} X_{t}^{(n)}=\textstyle\begin{cases} 0, & n< 0; \\ \epsilon _{t}, & n=0; \\ \theta _{1}\circ X_{t-1}^{(n-1)}+\theta _{2}\circ X_{t-2}^{(n-1)}+\epsilon _{t}, & n>0. \end{cases}\displaystyle \end{aligned}
(5.1)

The random vector $$(\theta _{1},\theta _{2})$$ has the joint distribution given by (1.3) and is independent of $$\{\epsilon _{t}\}$$; the random sequences used in the operators $$\theta _{1}\circ$$ and $$\theta _{2}\circ$$ are the same for fixed t. We first prove that the first two moments of $$\{X_{t}^{(n)}\}$$ are finite. It is easy to verify that

\begin{aligned} EX_{t}^{(0)}=\lambda ,\qquad EX_{t}^{(1)}=( \beta _{1}+\beta _{2})\lambda +\lambda , \end{aligned}

where $$\beta _{1}=\alpha _{1}p_{1}, \beta _{2}=\alpha _{2}p_{2}$$. Using the method of induction, we conclude that

\begin{aligned} EX_{t}^{(n)}=\sum_{i=0}^{n}( \beta _{1}+\beta _{2})^{i}\lambda < \infty. \end{aligned}

We have

\begin{aligned} E\bigl(X_{t}^{(0)}\bigr)^{2}&=E\epsilon _{t}^{2}, \\ E\bigl(X_{t}^{(1)} \bigr)^{2}&=(\alpha _{1} \beta _{1}+\alpha _{2}\beta _{2})E\epsilon _{t}^{2}+E \epsilon _{t}^{2}+ \lambda \bigl[\beta _{1}(1- \alpha _{1})+\beta _{2}(1-\alpha _{2})\bigr] +2 \lambda ^{2}(\beta _{1}+\beta _{2}). \end{aligned}

For convenience, let $$A=\beta _{1}(1-\alpha _{1})+\beta _{2}(1-\alpha _{2})+2\lambda (\beta _{1}+\beta _{2})$$. Then, using induction, we conclude that

\begin{aligned} E\bigl(X_{t}^{(n)}\bigr)^{2}=E\epsilon _{t}^{2}\sum_{i=0}^{n}( \alpha _{1}\beta _{1}+\alpha _{2}\beta _{2})^{i}+ A\lambda \Biggl(\sum _{k=1}^{n-1}(\alpha _{1}\beta _{1}+\alpha _{2}\beta _{2})^{n-k}\sum _{i=0}^{k-1}(\beta _{1}+ \beta _{2})^{i} \Biggr)< \infty. \end{aligned}

The last inequality is obtained by the fact that

\begin{aligned} \alpha _{1}\beta _{1}+\alpha _{2}\beta _{2}=\alpha _{1}^{2} p_{1}+\alpha _{2}^{2} p_{2}< p_{1}+p_{2}< 1,\qquad \beta _{1}+\beta _{2}=\alpha _{1} p_{1}+\alpha _{2} p_{2}< p_{1}+p_{2}< 1. \end{aligned}

Next we consider the convergence of the sequence $$\{X_{t}^{(n)}\}$$. By the definition of the sequence $$\{X_{t}^{(n)}\}$$, we have

\begin{aligned} &E \bigl\vert X_{t}^{(n)}-X_{t}^{(n-1)} \bigr\vert \\ &\quad =E \bigl\vert \theta _{1}\circ X_{t-1}^{(n-1)}+ \theta _{2}\circ X_{t-2}^{(n-1)}-\theta _{1} \circ X_{t-1}^{(n-2)}-\theta _{2}\circ X_{t-2}^{(n-2)} \bigr\vert \\ &\quad \leq E \bigl\vert \theta _{1}\circ X_{t-1}^{(n-1)}- \theta _{1}\circ X_{t-1}^{(n-2)} \bigr\vert +E \bigl\vert \theta _{2}\circ X_{t-2}^{(n-1)}-\theta _{2}\circ X_{t-2}^{(n-2)} \bigr\vert \\ &\quad =\beta _{1}E \bigl\vert X_{t-1}^{(n-1)}-X_{t-1}^{(n-2)} \bigr\vert +\beta _{2}E \bigl\vert X_{t-2}^{(n-1)}-X _{t-2}^{(n-2)} \bigr\vert , \end{aligned}

where the last equality holds because the thinning operators $$\theta _{1}\circ$$ and $$\theta _{2}\circ$$ are the same for fixed t. Repeating this argument and noting that $$E|X_{t}^{(0)}-X_{t}^{(-1)}|=E\epsilon _{t}$$, we conclude that

\begin{aligned} E \bigl\vert X_{t}^{(n)}-X_{t}^{(n-1)} \bigr\vert \leq (\beta _{1}+\beta _{2})^{n}E \epsilon _{t}. \end{aligned}

Then, by the triangle inequality, we obtain, for any integers $$n>m$$,

\begin{aligned} E \bigl\vert X_{t}^{(n)}-X_{t}^{(m)} \bigr\vert \leq E\epsilon _{t}\sum_{j=m+1}^{n}( \beta _{1}+\beta _{2})^{j}, \end{aligned}

which tends to zero as $$n,m\rightarrow \infty$$. Note that

\begin{aligned} &E \bigl\vert X_{t}^{(n)}-X_{t}^{(n-1)} \bigr\vert ^{2} \\ &\quad=E \bigl\vert \theta _{1}\circ X_{t-1}^{(n-1)}+ \theta _{2}\circ X_{t-2}^{(n-1)}- \theta _{1} \circ X_{t-1}^{(n-2)}-\theta _{2}\circ X_{t-2}^{(n-2)} \bigr\vert ^{2} \\ &\quad = E \bigl\vert \theta _{1}\circ X_{t-1}^{(n-1)}- \theta _{1}\circ X_{t-1}^{(n-2)} \bigr\vert ^{2}+E \bigl\vert \theta _{2}\circ X_{t-2}^{(n-1)}- \theta _{2}\circ X_{t-2}^{(n-2)} \bigr\vert ^{2} \\ &\qquad{}+2E \bigl\vert \theta _{1}\circ X_{t-1}^{(n-1)}- \theta _{1}\circ X_{t-1}^{(n-2)} \bigr\vert \bigl\vert \theta _{2}\circ X_{t-2}^{(n-1)}-\theta _{2}\circ X_{t-2}^{(n-2)} \bigr\vert \\ &\quad \leq \bigl[\beta _{1}(1-\alpha _{1})+2\beta _{1}\bigr]E \bigl\vert X_{t-1}^{(n-1)}-X_{t-1} ^{(n-2)} \bigr\vert +\alpha _{1}\beta _{1}E \bigl\vert X_{t-1}^{(n-1)}-X_{t-1}^{(n-2)} \bigr\vert ^{2} \\ &\qquad{}+\bigl[\beta _{2}(1-\alpha _{2})+2\beta _{2} \bigr]E \bigl\vert X_{t-2}^{(n-1)}-X_{t-2}^{(n-2)} \bigr\vert + \alpha _{2}\beta _{2}E \bigl\vert X_{t-2}^{(n-1)}-X_{t-2}^{(n-2)} \bigr\vert ^{2} \\ &\quad \leq \cdots \\ &\quad \leq (\alpha _{1}\beta _{1}+\alpha _{2}\beta _{2})^{n}E\epsilon _{t}^{2}+BE \epsilon _{t} \sum_{k=1}^{n-1}(\alpha _{1}\beta _{1}+\alpha _{2}\beta _{2})^{k-1}( \beta _{1}+\beta _{2})^{n-k} \\ &\quad\leq (\alpha _{1}\beta _{1}+\alpha _{2}\beta _{2})^{n}E\epsilon _{t}^{2}+BE \epsilon _{t} n(\beta _{1}+\beta _{2})^{n}, \end{aligned}

where $$B=\beta _{1}(1-\alpha _{1})+2\beta _{1}+\beta _{2}(1-\alpha _{2})+2 \beta _{2}$$. Then we get, for any integers $$n>m$$,

\begin{aligned} E \bigl\vert X_{t}^{(n)}-X_{t}^{(m)} \bigr\vert ^{2}\leq \bigl[E\epsilon _{t}^{2}+BE \epsilon _{t}\bigr](n-m) \sum_{j=m+1}^{n} \bigl[(\alpha _{1}\beta _{1}+\alpha _{2}\beta _{2})^{j} +j( \beta _{1}+\beta _{2})^{j} \bigr] , \end{aligned}

which tends to zero as $$n,m\rightarrow \infty$$. This implies that $$\{X_{t}^{(n)}\}$$ is a Cauchy sequence. Let $$\{X_{t}\}$$ be the limit process of $$\{X_{t}^{(n)}\}$$, then the first two moments of $$X_{t}$$ exist. Now we verify that $$\{X_{t}\}$$ satisfies (1.2). Since $$X_{t}^{(n)} \stackrel{L_{2}}{\longrightarrow }X_{t}$$, we have, for any t,

\begin{aligned} E \bigl\vert \theta _{1}\circ X_{t}^{(n)}-\theta _{1}\circ X_{t} \bigr\vert ^{2}=\beta _{1}(1- \alpha _{1})E \bigl\vert X_{t}^{(n)}- X_{t} \bigr\vert +\alpha _{1}\beta _{1}E \bigl\vert X_{t}^{(n)}- X _{t} \bigr\vert ^{2}\rightarrow 0. \end{aligned}

Similarly, we can prove that for any t

\begin{aligned} E \bigl\vert \theta _{2}\circ X_{t}^{(n)}-\theta _{2}\circ X_{t} \bigr\vert ^{2}\rightarrow 0. \end{aligned}

By the uniqueness of the convergence in $$L_{2}$$, we conclude that $$\{X_{t}\}$$ satisfies (1.2).

Since $$X_{t}^{(0)}=\epsilon _{t}$$, the process $$\{X_{t}^{(0)}\}$$ is strictly stationary. Then, by induction and definition (5.1), for each n the process $$\{X_{t}^{(n)}\}$$ is strictly stationary, and so is its $$L_{2}$$ limit $$\{X_{t}\}$$. The ergodicity can be obtained similarly as in Zhang et al. [16]; we omit the details here. □

### Proof of Theorem 2.2

By the strict stationarity and ergodicity of the process $$\{X_{t}\}$$, adopting the standard martingale central limit theorem, we obtain

\begin{aligned} \sqrt{n}(\hat{\eta }-\eta ) = \biggl(\frac{1}{n}M \biggr)^{-1} \frac{1}{ \sqrt{n}}E_{n}(\eta )\stackrel{d}{\longrightarrow }N \bigl(0,V^{-1}\varSigma V ^{-1}\bigr), \end{aligned}

where $$V=\lim_{n\rightarrow \infty }(1/n)M, E_{n}(\eta )=(E_{n1}( \eta ), E_{n2}(\eta ), E_{n3}(\eta ))^{T}$$, $$\varSigma =(\sigma _{ij})$$ is a symmetric matrix with

\begin{aligned} &\sigma _{11}=E(X_{t}-\beta _{1}X_{t-1}- \beta _{2}X_{t-2}-\lambda )^{2} X_{t-1}^{2},\qquad \sigma _{22}=E(X_{t}-\beta _{1}X_{t-1}- \beta _{2}X_{t-2}- \lambda )^{2} X_{t-2}^{2}, \\ &\sigma _{33}=E(X_{t}-\beta _{1}X_{t-1}- \beta _{2}X_{t-2}-\lambda )^{2}, \qquad\sigma _{21}=E(X_{t}-\beta _{1}X_{t-1}-\beta _{2}X_{t-2}-\lambda )^{2} X _{t-1}X_{t-2}, \\ &\sigma _{23}=E(X_{t}-\beta _{1}X_{t-1}- \beta _{2}X_{t-2}-\lambda )^{2} X_{t-2},\qquad \sigma _{31}=E(X_{t}-\beta _{1}X_{t-1}- \beta _{2}X_{t-2}-\lambda )^{2} X_{t-1}. \end{aligned}

Note that

\begin{aligned} \sqrt{n-2}\bigl(\hat{\theta }(\eta )-\theta \bigr) = \Biggl(\frac{1}{n-2} \sum_{t=3}^{n}Z_{t}Z_{t}^{T} \Biggr)^{-1} \Biggl(\frac{1}{\sqrt{n-2}} \sum _{t=3}^{n}Z_{t}\bigl(V_{t}-Z_{t}^{T} \theta \bigr) \Biggr). \end{aligned}

By a simple calculation, we see that $$\{Z_{t}(V_{t}-Z_{t}^{T}\theta )\}$$ is a martingale difference sequence with respect to $$\{\mathcal{F}_{t}\}$$. Then, by the condition $$E|X_{t}|^{8}<\infty$$, the stationarity and ergodicity of the process $$\{X_{t}\}$$, and the martingale central limit theorem as above, we have

\begin{aligned} \sqrt{n}\bigl(\hat{\theta }(\eta )-\theta \bigr)\stackrel{d}{\longrightarrow }N \bigl(0,\varGamma ^{-1}W \varGamma ^{-1} \bigr), \end{aligned}

where $$\varGamma =E(Z_{t}Z_{t}^{T}), W=E((V_{t}-Z_{t}^{T}\theta )^{2}Z _{t}Z_{t}^{T})$$. Observe that

\begin{aligned} \sqrt{n-2}\bigl(\bar{\theta }(\hat{\eta })-\theta \bigr)=\sqrt{n-2}\bigl(\bar{ \theta }(\hat{\eta })-\hat{\theta }(\eta )\bigr) +\sqrt{n-2}\bigl( \hat{\theta }( \eta )-\theta \bigr) \end{aligned}

and

\begin{aligned} \sqrt{n}\bigl(\bar{\theta }(\hat{\eta })-\hat{\theta }(\eta )\bigr) = \Biggl( \frac{1}{n}\sum_{t=1}^{n}Z_{t}Z_{t}^{T} \Biggr)^{-1} \Biggl(\frac{1}{ \sqrt{n}} \sum _{t=1}^{n}Z_{t}\bigl(V_{t}(\hat{ \eta })-V_{t}(\eta )\bigr) \Biggr), \end{aligned}

where $$V_{t}(\hat{\eta })$$ is $$V_{t}(\eta )$$ with η replaced by η̂. By a Taylor expansion, we have

\begin{aligned} V_{t}(\hat{\eta })-V_{t}(\eta )={}& {-}2\bigl(X_{t}-E[X_{t}| \mathcal{F}_{t-1}]\bigr)\bigl[X _{t-1}(\hat{\beta }_{1}-\beta _{1})+X_{t-2}(\hat{\beta }_{2}-\beta _{2}) +(\hat{\lambda }-\lambda ) \bigr]\\ &{}+o_{P}\bigl( \Vert \hat{\eta }-\eta \Vert \bigr). \end{aligned}

Note that

\begin{aligned} &\frac{1}{\sqrt{n}} \sum_{t=1}^{n}Z_{t}X_{t-1} \bigl(X_{t}-E[X_{t}| \mathcal{F}_{t-1}]\bigr) ( \hat{\beta }_{1}-\beta _{1}) \\ &\quad=\sqrt{n}( \hat{\beta }_{1}-\beta _{1})\frac{1}{n} \sum _{t=1}^{n}Z_{t}X_{t-1}\bigl(X _{t}-E[X_{t}|\mathcal{F}_{t-1}]\bigr). \end{aligned}

Notice that $$E[Z_{t}X_{t-1}(X_{t}-E[X_{t}|\mathcal{F}_{t-1}])]=0$$, then, by the ergodicity of the process $$\{X_{t}\}$$, we conclude that

\begin{aligned} \frac{1}{n} \sum_{t=1}^{n}Z_{t}X_{t-1} \bigl(X_{t}-E[X_{t}|\mathcal{F}_{t-1}]\bigr) \rightarrow 0,\quad \text{a.s.} \end{aligned}

Combining with the fact that $$\sqrt{n}(\hat{\beta }_{1}-\beta _{1})$$ converges in distribution, we have

\begin{aligned} \frac{1}{\sqrt{n}} \sum_{t=1}^{n}Z_{t}X_{t-1} \bigl(X_{t}-E[X_{t}| \mathcal{F}_{t-1}]\bigr) ( \hat{\beta }_{1}-\beta _{1})=o_{P}(1). \end{aligned}

Similarly, we can prove that

\begin{aligned} &\frac{1}{\sqrt{n}} \sum_{t=1}^{n}Z_{t}X_{t-2} \bigl(X_{t}-E[X_{t}| \mathcal{F}_{t-1}]\bigr) ( \hat{\beta }_{2}-\beta _{2})=o_{P}(1), \\ &\frac{1}{\sqrt{n}} \sum_{t=1}^{n}Z_{t} \bigl(X_{t}-E[X_{t}|\mathcal{F} _{t-1}]\bigr) ( \hat{\lambda }-\lambda )=o_{P}(1). \end{aligned}

Therefore, we get

\begin{aligned} \frac{1}{\sqrt{n}} \sum_{t=1}^{n}Z_{t} \bigl(V_{t}(\hat{\eta })-V_{t}( \eta )\bigr)=o_{P}(1). \end{aligned}

Again, by the ergodicity of the process $$\{X_{t}\}$$, we have

\begin{aligned} \frac{1}{n}\sum_{t=1}^{n}Z_{t}Z_{t}^{T} \rightarrow \varGamma ,\quad \text{a.s.} \end{aligned}

Thus, we can obtain the conclusion that

\begin{aligned} \sqrt{n}\bigl(\bar{\theta }(\hat{\eta })-\hat{\theta }(\eta )\bigr) =o_{P}(1). \end{aligned}

Then, by the Slutsky theorem, we have

\begin{aligned} \sqrt{n}\bigl(\bar{\theta }(\hat{\eta })-\theta \bigr)=\sqrt{n}\bigl(\bar{\theta }(\hat{\eta })-\hat{\theta }(\eta )\bigr)+ \sqrt{n}\bigl(\hat{\theta }(\eta )- \theta \bigr)\stackrel{d}{\longrightarrow }N\bigl(0,\varGamma ^{-1}W \varGamma ^{-1} \bigr). \end{aligned}

Therefore, we conclude that the vector $$(\sqrt{n}(\bar{\theta }( \hat{\eta })-\theta ),\sqrt{n}(\hat{\eta }-\eta ))$$ converges to a normal distribution with mean zero and covariance matrix

$$\varOmega =(\omega _{ij})_{9\times 9}= \begin{pmatrix} \varGamma ^{-1}W\varGamma ^{-1} & \varGamma ^{-1}\varPi V^{-1} \\ V^{-1}\varPi ^{T}\varGamma ^{-1} & V^{-1}\varSigma V^{-1} \end{pmatrix},$$

where $$\varPi =E((V_{t}-Z_{t}^{T}\theta )(X_{t}-\eta ^{T} D_{t})Z_{t}D _{t}^{T})$$, $$D_{t}=(X_{t-1},X_{t-2},1)^{T}$$. □

### Proof of Theorem 2.3

The estimators $$\hat{\alpha }_{1},\hat{\alpha }_{2},\hat{p}_{1},\hat{p}_{2}$$ are smooth functions of $$(\bar{\theta }(\hat{\eta }),\hat{\eta })$$, so by Theorem 2.2 and the delta method they are consistent and asymptotically normal.
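One way to make this step explicit is to write the estimators as functions of $$(\theta _{i},\beta _{i})$$ and compute their gradients. The maps g and h below are notation introduced here for illustration; they are read off from the definitions of $$\hat{\alpha }_{i}$$ and $$\hat{p}_{i}$$ in Sect. 2.

$$\begin{aligned} &g(\theta _{i},\beta _{i})=\frac{\theta _{i}+\beta _{i}^{2}}{\beta _{i}},\qquad \frac{\partial g}{\partial \theta _{i}}=\frac{1}{\beta _{i}},\qquad \frac{\partial g}{\partial \beta _{i}}=\frac{\beta _{i}^{2}-\theta _{i}}{\beta _{i}^{2}}, \\ &h(\theta _{i},\beta _{i})=\frac{\beta _{i}^{2}}{\theta _{i}+\beta _{i}^{2}},\qquad \frac{\partial h}{\partial \theta _{i}}=-\frac{\beta _{i}^{2}}{(\theta _{i}+\beta _{i}^{2})^{2}},\qquad \frac{\partial h}{\partial \beta _{i}}=\frac{2\beta _{i}\theta _{i}}{(\theta _{i}+\beta _{i}^{2})^{2}}. \end{aligned}$$

Since $$\theta _{i}+\beta _{i}^{2}=\alpha _{i}\beta _{i}>0$$, both maps are differentiable at the true parameter. For $$f\in \{g,h\}$$, the asymptotic variance is then the quadratic form $$\nabla f^{T}\varOmega _{(i)}\nabla f$$, where $$\varOmega _{(i)}$$ denotes the 2 × 2 submatrix of Ω with row and column indices $$i$$ and $$6+i$$.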

Specifically, $$\sqrt{n}(\hat{\alpha }_{1}-\alpha _{1}), \sqrt{n}( \hat{\alpha }_{2}-\alpha _{2})$$ converge to normal distributions with mean zero and variances

\begin{aligned} \begin{aligned} &\frac{\beta _{1}^{2}\omega _{11}+(\beta _{1}^{2}-\theta _{1})^{2}\omega _{77}+2\beta _{1}(\beta _{1}^{2}-\theta _{1})\omega _{17}}{\beta _{1}^{4}}, \\ &\frac{\beta _{2}^{2}\omega _{22}+(\beta _{2}^{2}-\theta _{2})^{2}\omega _{88}+2\beta _{2}(\beta _{2}^{2}-\theta _{2})\omega _{28}}{\beta _{2}^{4}}. \end{aligned} \end{aligned}
(5.2)

Similarly, $$\sqrt{n}(\hat{p}_{1}-p_{1}), \sqrt{n}(\hat{p}_{2}-p_{2})$$ converge to normal distributions with mean zero and variances

\begin{aligned} \begin{aligned}&\frac{\beta _{1}^{4}\omega _{11}+4\beta _{1}^{2}\theta _{1}^{2}\omega _{77} -4\beta _{1}^{3}\theta _{1}\omega _{17}}{(\theta _{1}+\beta _{1}^{2})^{4}}, \\ &\frac{\beta _{2}^{4}\omega _{22}+4\beta _{2}^{2}\theta _{2}^{2}\omega _{88} -4\beta _{2}^{3}\theta _{2}\omega _{28}}{(\theta _{2}+\beta _{2}^{2})^{4}}. \end{aligned} \end{aligned}
(5.3)

□

### Proof of Corollary 2.4

Observe that

\begin{aligned} \hat{W}-W ={}&\frac{1}{n-2}\sum_{t=3}^{n} \bigl[\bigl(V_{t}(\hat{\eta })-Z _{t}^{T}\bar{ \theta }(\hat{\eta })\bigr)^{2}-\bigl(V_{t}-Z_{t}^{T} \theta \bigr)^{2} \bigr]Z_{t}Z_{t}^{T} \\ ={}&\frac{1}{n-2}\sum_{t=3}^{n} \bigl[V_{t}(\hat{\eta })^{2}-V_{t}^{2} \bigr]Z_{t}Z_{t}^{T}+\frac{1}{n-2}\sum _{t=3}^{n} \bigl[ \bigl(Z_{t}^{T} \bar{ \theta }(\hat{\eta })\bigr)^{2}- \bigl(Z_{t}^{T}\theta \bigr)^{2} \bigr]Z_{t}Z _{t}^{T} \\ &{}+ \frac{1}{n-2}\sum_{t=3}^{n} \bigl[2V_{t}Z_{t}^{T}\theta -2V_{t}( \hat{\eta })Z_{t}^{T}\bar{\theta }(\hat{\eta }) \bigr]Z_{t}Z_{t} ^{T} \\ :={}& T_{1}+T_{2}+T_{3}. \end{aligned}

Note that

\begin{aligned} T_{1}=\frac{1}{n-2}\sum_{t=3}^{n} \bigl(\hat{\eta }^{T}-\eta ^{T}\bigr)F_{t}Z _{t}Z_{t}^{T}, \quad\text{where } F_{t}=(X_{t-1},X_{t-2},1). \end{aligned}

We have

\begin{aligned} \vert T_{1} \vert \leq \bigl\Vert \hat{\eta }^{T}- \eta ^{T} \bigr\Vert \frac{1}{n-2}\sum _{t=3}^{n} \Vert F_{t} \Vert \bigl\Vert Z_{t}Z_{t}^{T} \bigr\Vert . \end{aligned}

By the ergodicity of the process $$\{X_{t}\}$$ and the fact that $$\hat{\eta } \stackrel{P}{\longrightarrow }\eta$$, we have

$$T_{1}=o_{P}(1).$$

Similarly, we have

$$T_{2}=o_{P}(1),\qquad T_{3}=o_{P}(1).$$

Similarly, we can prove that $$\hat{\varSigma }\stackrel{P}{\longrightarrow }\varSigma , \hat{\varPi }\stackrel{P}{\longrightarrow }\varPi$$. The conclusion of this corollary follows. □

## References

1. McKenzie, E.: Some simple models for discrete variate time series. Water Resour. Bull. 21, 645–650 (1985)

2. Al-Osh, M.A., Alzaid, A.A.: First order integer-valued autoregressive (INAR(1)) processes. J. Time Ser. Anal. 8, 261–275 (1987)

3. Alzaid, A.A., Al-Osh, M.A.: First order integer-valued autoregressive (INAR(1)) processes: distributional and regression properties. Stat. Neerl. 42, 53–61 (1988)

4. Zheng, H., Basawa, I.V., Datta, S.: First-order random coefficient integer-valued autoregressive processes. J. Stat. Plan. Inference 137, 212–229 (2007)

5. Zheng, H., Basawa, I.V., Datta, S.: Inference for pth-order random coefficient integer-valued autoregressive processes. J. Time Ser. Anal. 27, 411–440 (2006)

6. Zhang, H., Wang, D., Zhu, F.: Empirical likelihood for first-order random coefficient integer-valued autoregressive processes. Commun. Stat., Theory Methods 40, 492–509 (2011)

7. Zhang, H., Wang, D., Zhu, F.: Empirical likelihood inference for random coefficient INAR(p) process. J. Time Ser. Anal. 32, 195–223 (2011)

8. Zhang, H., Wang, D.: Inference for random coefficient INAR(1) process based on frequency domain analysis. Commun. Stat., Simul. Comput. 44(4), 1078–1100 (2015)

9. Nedényi, F., Pap, G.: Iterated scaling limits for aggregation of random coefficient AR(1) and INAR(1) processes. Stat. Probab. Lett. 118, 16–23 (2016)

10. Ding, X., Wang, D.: Empirical likelihood inference for INAR(1) model with explanatory variables. J. Korean Stat. Soc. 45(4), 623–632 (2016)

11. Nastić, A.S., Ristić, M.M.: Some geometric mixed integer-valued autoregressive (INAR) models. Stat. Probab. Lett. 82, 805–811 (2012)

12. Lawrance, A.J., Lewis, P.A.W.: Modelling and residual analysis of nonlinear autoregressive time series in exponential variables. J. R. Stat. Soc. B 47, 165–202 (1985)

13. Dewald, L.S., Lewis, P.A.W.: A new Laplace second-order autoregressive time-series model: NLAR(2). IEEE Trans. Inf. Theory 31, 645–651 (1985)

14. Karlsen, H., Tjøstheim, D.: Consistent estimates for the NEAR(2) and NLAR(2) time series models. J. R. Stat. Soc. B 50(2), 313–320 (1988)

15. Zhao, Z., Hu, Y.: Statistical inference for first-order random coefficient integer-valued autoregressive processes. J. Inequal. Appl. 2015, 359 (2015) https://doi.org/10.1186/s13660-015-0886-y

16. Zhang, H., Wang, D., Zhu, F.: Inference for INAR(p) processes with signed generalized power series thinning operator. J. Stat. Plan. Inference 140(3), 667–683 (2010)

## Acknowledgements

The author thanks the editor for the guidance and help.

## Funding

This research was supported by the Science and Technology Development Program of Jilin Province (Grant No. 20170101152JC).

## Author information

### Contributions

The author read and approved the final manuscript.

### Corresponding author

Correspondence to Xu Wang.

## Ethics declarations

### Competing interests

The author declares to have no competing interests. 