# Empirical likelihood inference for threshold autoregressive conditional heteroscedasticity model

## Abstract

This paper considers the parameter estimation problem of a first-order threshold autoregressive conditional heteroscedasticity model by using the empirical likelihood method. We obtain the empirical likelihood ratio statistic based on the estimating equation of the least squares estimation and construct the confidence region for the model parameters. Simulation studies indicate that the empirical likelihood method outperforms the normal approximation-based method in terms of coverage probability.

## Introduction

Consider the following first-order threshold autoregressive conditional heteroscedasticity model:

$$X_{t}=\theta _{1}X_{t-1}^{+}+ \theta _{2}X_{t-1}^{-}+\varepsilon _{t},$$
(1)

where $$X_{t}^{+}=\max (X_{t},0)$$, $$X_{t}^{-}=\min (X_{t},0)$$, $$\varepsilon _{t}=\sqrt{h_{t}}e_{t}$$, $$h_{t}=\alpha _{0}+\alpha _{1}(\varepsilon _{t-1}^{+})^{2}+\alpha _{2}( \varepsilon _{t-1}^{-})^{2}$$, $$e_{t}$$ is a sequence of independent and identically distributed random variables satisfying $$Ee_{t}=0$$ and $$\operatorname{Var}(e_{t})=1$$. $$\theta _{1}$$, $$\theta _{2}$$, $$\alpha _{0}$$, $$\alpha _{1}$$, and $$\alpha _{2}$$ are the model parameters with $$\alpha _{0}>0$$, $$0\leq \alpha _{j}<1$$, $$j=1, 2$$.
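To make the data-generating mechanism concrete, model (1) can be simulated by iterating its defining recursion. The following minimal sketch (illustrative only; not part of the original derivations) assumes i.i.d. $$N(0,1)$$ innovations $$e_{t}$$ and discards a burn-in period so that the simulated path is close to the stationary regime guaranteed under condition (A1) below:

```python
import numpy as np

def simulate_tarch(n, theta1, theta2, alpha0, alpha1, alpha2, rng=None, burn=200):
    """Simulate model (1): X_t = theta1*X_{t-1}^+ + theta2*X_{t-1}^- + eps_t,
    where eps_t = sqrt(h_t)*e_t and
    h_t = alpha0 + alpha1*(eps_{t-1}^+)^2 + alpha2*(eps_{t-1}^-)^2,
    with e_t assumed i.i.d. N(0, 1)."""
    rng = np.random.default_rng(rng)
    x, eps_prev = 0.0, 0.0
    out = np.empty(n + burn)
    for t in range(n + burn):
        h = alpha0 + alpha1 * max(eps_prev, 0.0) ** 2 + alpha2 * min(eps_prev, 0.0) ** 2
        eps = np.sqrt(h) * rng.standard_normal()
        x = theta1 * max(x, 0.0) + theta2 * min(x, 0.0) + eps
        eps_prev = eps
        out[t] = x
    return out[burn:]  # drop the burn-in portion
```

For example, `simulate_tarch(500, 0.3, -0.2, 0.5, 0.2, 0.3)` satisfies the stationarity condition $$\theta_{\max}+\sqrt{\alpha_{\max}}<1$$ of (A1).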

When $$\theta _{1}=\theta _{2}$$, model (1) reduces to the usual autoregressive model whose innovation is a conditional heteroscedasticity process. The threshold autoregressive model is a nonlinear time series model. Because it can capture nonlinear features such as asymmetry and limit cycles, it is widely used in time series modeling (see Tong ). Petruccelli and Woolford  first defined model (1) and investigated its properties and parameter estimation, but they assumed that the errors form a sequence of independent and identically distributed random variables. Brockwell et al.  and Hwang and Basawa  further generalized the model coefficients to be random variables. Hwang and Woo  first considered parameter estimation when the error sequence is a conditional heteroscedasticity process and proposed the conditional least squares method to estimate the model parameters. In this paper, we estimate the model parameters by the empirical likelihood method.

Owen  introduced the empirical likelihood method as a nonparametric analogue of the parametric likelihood. It constructs a likelihood function by placing positive probability on each of the observed data values, while making essentially no assumptions on the data-generating mechanism. Empirical likelihood has several advantages over the normal approximation method. For example, the limiting distribution of the empirical likelihood ratio statistic is a chi-squared distribution, so no asymptotic variance needs to be estimated when constructing a confidence region. Moreover, the shape of the confidence region is determined entirely by the data, since no assumptions are made on their probability distribution. For these reasons, statisticians have used the empirical likelihood method to make inference for many statistical models, such as linear regression models , generalized linear models , and partially linear models . In recent years, the empirical likelihood method has also been applied to time series models, such as the autoregressive model , the random coefficient autoregressive model , and the integer-valued autoregressive model .

In this paper, we obtain the limiting distribution of empirical log-likelihood ratio statistic and construct the confidence region for the parameters in model (1) by using the empirical likelihood method. Some simulation studies indicate that the empirical likelihood method has a higher coverage probability compared with the normal approximation-based method.

This paper is organized as follows. In Sect. 2, we present the main methods and results. Some simulation results and a real data analysis are given in Sect. 3. Section 4 is concerned with the proofs of the main results. Throughout the paper, the symbols “$$\stackrel{d}{\longrightarrow }$$” and “$$\stackrel{p}{\longrightarrow }$$” denote convergence in distribution and convergence in probability, respectively. $$O_{p}(1)$$ denotes a term which is bounded in probability, and $$o_{p}(1)$$ denotes a term which converges to zero in probability. “Almost surely” and “independent and identically distributed” are abbreviated as “a.s.” and “i.i.d.”, respectively.

## Methods and main results

For model (1), Hwang and Woo  obtained the least squares estimator of the model parameters and its limiting properties. Here we use the empirical likelihood method to estimate the model parameters. Before giving the main results, we impose the following conditions:

$$(\mathbf{A1})$$:

Probability density function $$f(\cdot )$$ of $$e_{t}$$ has its support on $$(-\infty , +\infty )$$. $$\theta _{\max }+\sqrt{\alpha _{\max }}<1$$, where $$\theta _{\max }=\max \{ \vert \theta _{1} \vert , \vert \theta _{2} \vert \}$$, $$\alpha _{\max }=\max \{\alpha _{1}, \alpha _{2}\}$$.

$$(\mathbf{A2})$$:

$$E(X_{t}^{6})<\infty$$.

According to Theorem 1 in Hwang and Woo , if $$(\mathbf{A1})$$ holds, then $$\{X_{t}, t\geq 1\}$$ is geometrically ergodic, and the sequence $$\{X_{t}\}$$ has a unique stationary distribution.

Hwang and Woo  used the least squares method to estimate the model parameters. Let $$\theta =(\theta _{1},\theta _{2})^{\tau }$$. Based on the observations $$\{X_{0}, X_{1},\ldots , X_{n}\}$$, the least squares estimator $$\theta ^{*}$$ of θ is obtained by minimizing $$Q(\theta )=\sum_{t=1}^{n}(X_{t}-E(X_{t}\mid \mathscr{F}_{t-1}))^{2} =\sum_{t=1}^{n}(X_{t}-\theta _{1}X_{t-1}^{+}-\theta _{2}X_{t-1}^{-})^{2}$$ with respect to θ. Solving

$$\frac{\partial Q(\theta )}{\partial \theta }=-2\sum_{t=1}^{n} \bigl(X_{t}-\theta _{1}X_{t-1}^{+}-\theta _{2}X_{t-1}^{-} \bigr) \begin{pmatrix} X_{t-1}^{+} \\ X_{t-1}^{-} \end{pmatrix} =0$$
(2)

for θ, we know that

$$\theta ^{*}= \begin{pmatrix} \sum_{t=1}^{n}X_{t}X_{t-1}^{+}/\sum_{t=1}^{n}(X_{t-1}^{+})^{2} \\ \sum_{t=1}^{n}X_{t}X_{t-1}^{-}/\sum_{t=1}^{n}(X_{t-1}^{-})^{2} \end{pmatrix}.$$
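Because $$X_{t-1}^{+}X_{t-1}^{-}=0$$, the two components of the least squares estimator decouple and can be computed directly from the observations; a minimal sketch:

```python
import numpy as np

def ls_estimate(x):
    """Least squares estimator (theta1*, theta2*) for model (1),
    based on observations x[0], ..., x[n]."""
    xp = np.maximum(x[:-1], 0.0)   # X_{t-1}^+
    xm = np.minimum(x[:-1], 0.0)   # X_{t-1}^-
    xt = x[1:]                     # X_t
    theta1 = np.sum(xt * xp) / np.sum(xp ** 2)
    theta2 = np.sum(xt * xm) / np.sum(xm ** 2)
    return np.array([theta1, theta2])
```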

Let $$\mathscr{X}_{t}=(X_{t-1}^{+}, X_{t-1}^{-})^{\tau }$$. Then the estimating equation (2) can be written as

\begin{aligned} \sum_{t=1}^{n} \bigl(X_{t}-\mathscr{X}_{t}^{\tau }\theta \bigr) \mathscr{X}_{t}=0. \end{aligned}
(3)

Further, let $$H_{t}(\theta )=(X_{t}-\mathscr{X}_{t}^{\tau }\theta )\mathscr{X}_{t}$$. By (3), we can obtain the following empirical likelihood ratio statistic:

\begin{aligned} L(\theta )=\max \Biggl\{ \prod _{t=1}^{n}np_{t}: \sum _{t=1}^{n}p_{t}H_{t}( \theta )=0, p_{t}\geq 0, \sum_{t=1}^{n}p_{t}=1 \Biggr\} . \end{aligned}
(4)

By using the Lagrange multiplier method, it is easy to show that

\begin{aligned} p_{t}=\frac{1}{n}\frac{1}{1+b^{\tau }(\theta )H_{t}(\theta )}, \end{aligned}
(5)

where the Lagrange multiplier $$b(\theta )$$ satisfies

\begin{aligned} \frac{1}{n}\sum_{t=1}^{n} \frac{H_{t}(\theta )}{1+b^{\tau }(\theta )H_{t}(\theta )}=0. \end{aligned}
(6)

Therefore, we have

\begin{aligned} -2\log \bigl(L(\theta ) \bigr)=2\sum _{t=1}^{n}\log \bigl(1+b^{\tau }(\theta )H_{t}( \theta ) \bigr). \end{aligned}
(7)
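In practice, the Lagrange multiplier $$b(\theta )$$ in (6) has no closed form and must be found numerically. One common approach (a sketch below; the paper does not prescribe a particular algorithm) is to maximize the concave criterion $$\sum_{t}\log (1+b^{\tau }H_{t}(\theta ))$$, whose stationarity condition is exactly (6), by damped Newton steps that keep all weights $$p_{t}$$ positive:

```python
import numpy as np

def neg2_log_el(H, iters=50, tol=1e-10):
    """Compute -2 log L(theta) from the n-by-2 matrix H whose rows are
    H_t(theta), by solving the Lagrange-multiplier equation (6) numerically."""
    n, d = H.shape
    b = np.zeros(d)
    for _ in range(iters):
        w = 1.0 + H @ b                       # 1 + b' H_t, must stay positive
        grad = (H / w[:, None]).sum(axis=0)   # n times the left side of (6)
        if np.linalg.norm(grad) < tol:
            break
        hess = -(H / w[:, None] ** 2).T @ H   # Jacobian of grad in b (neg. definite)
        step = np.linalg.solve(hess, grad)
        s = 1.0
        while np.any(1.0 + H @ (b - s * step) <= 1e-10):
            s *= 0.5                          # backtrack to keep weights feasible
        b = b - s * step
    return 2.0 * float(np.sum(np.log(1.0 + H @ b)))  # statistic (7)
```

Exactly mean-zero rows give $$b=0$$ and a statistic of zero, as the test point $$\theta$$ then satisfies the estimating equation (3).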

The following theorem indicates that the limiting distribution of $$-2\log (L(\theta ))$$ is a chi-squared distribution.

### Theorem 2.1

If $$(\mathbf{A1})$$ and $$(\mathbf{A2})$$ hold, then when $$n \rightarrow \infty$$,

$$-2\log \bigl(L(\theta ) \bigr)\stackrel{d}{\longrightarrow }\chi ^{2}{(2)},$$
(8)

where $$\chi ^{2}{(2)}$$ is the chi-squared distribution with two degrees of freedom.

Using the above theorem, we can construct the empirical likelihood ratio confidence region for the parameter θ. For $$0<\delta <1$$, the $$100(1-\delta )\%$$ asymptotic confidence region for the parameter θ is

$$\mathscr{C}\{\delta \}= \bigl\{ \theta : -2\log \bigl(L(\theta ) \bigr)\leq \chi ^{2}_{\delta }(2) \bigr\} ,$$
(9)

where $$\chi ^{2}_{\delta }(2)$$ is the upper δ quantile of chi-squared distribution with two degrees of freedom.
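Since the chi-squared distribution with two degrees of freedom has the closed-form CDF $$1-e^{-x/2}$$, the upper-δ quantile in (9) is simply $$-2\ln \delta$$; a small helper illustrates the membership test:

```python
import math

def chi2_2_upper_quantile(delta):
    """Upper-delta quantile of the chi-squared distribution with 2 df.
    The CDF is 1 - exp(-x/2), so the quantile is -2*log(delta)."""
    return -2.0 * math.log(delta)

def in_region(neg2_log_el_at_theta, delta=0.10):
    """theta belongs to the 100(1-delta)% region (9) iff
    -2 log L(theta) <= the upper-delta chi-squared quantile."""
    return neg2_log_el_at_theta <= chi2_2_upper_quantile(delta)
```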

## Simulation studies

In this section, we carry out simulation studies to compare the performance of our empirical likelihood (EL) method with that of the least squares (LS) method proposed by Hwang and Woo . We consider model (1) with the following error sequences:

Sequence I: $$\{e_{t}\}$$ is a sequence of independent and identically distributed (i.i.d.) standard normal $$N(0,1)$$ random variables.

Sequence II: $$\{e_{t}\}$$ is an independent noise sequence with ϵ-contamination distribution, and the distribution function of $$\{e_{t}\}$$ is

$$F_{e_{t}}(x)=\epsilon \Phi \biggl(\frac{x}{\sigma _{1}} \biggr)+(1-\epsilon ) \Phi \biggl( \frac{x}{\sigma _{2}} \biggr),$$

where $$\sigma _{i}>0$$ ($$i=1, 2$$), ϵ is a fixed constant satisfying $$0 <\epsilon < 1$$ and $$\Phi (x)$$ is the distribution function of the standard normal random variable.

Sequence III: $$\{e_{t}\}$$ is an i.i.d. sequence with a normal–t mixture distribution, and the distribution function of $$\{e_{t}\}$$ is

$$F_{e_{t}}(x)=\epsilon \Phi \biggl(\frac{x}{\sigma } \biggr)+(1-\epsilon )T_{k}(x),$$

where $$\sigma >0$$, $$0<\epsilon <1$$, $$T_{k}(x)$$ is the distribution function of the t distribution with k degrees of freedom, and $$\Phi (x)$$ is the distribution function of the standard normal random variable.
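For reference, the three error designs can be sampled as follows. This sketch hardcodes one parameter setting per design from the simulations below (the other settings are analogous) and, like the stated designs, does not rescale the mixtures to unit variance:

```python
import numpy as np

def sample_errors(n, kind, rng=None):
    """Draw e_t under design I (standard normal), II (eps-contaminated
    normal), or III (normal / Student-t mixture)."""
    rng = np.random.default_rng(rng)
    if kind == "I":
        return rng.standard_normal(n)
    if kind == "II":   # (eps, sigma1, sigma2) = (0.9, 1, 3)
        eps, s1, s2 = 0.9, 1.0, 3.0
        pick = rng.random(n) < eps
        return np.where(pick, s1 * rng.standard_normal(n), s2 * rng.standard_normal(n))
    if kind == "III":  # (eps, sigma, k) = (0.2, 1, 6)
        eps, sigma, k = 0.2, 1.0, 6.0
        pick = rng.random(n) < eps
        return np.where(pick, sigma * rng.standard_normal(n), rng.standard_t(k, n))
    raise ValueError(kind)
```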

We calculate the coverage probabilities of the empirical likelihood and the least square methods for different model parameters. The nominal confidence level $$1-\delta$$ is chosen to be 0.90. All simulation studies are based on 1000 repetitions, and the sample sizes considered in these simulations are $$n =100, 300$$, and 500. The simulation results for sequence I are presented in Table 1. For sequence II, we simulate $$(\epsilon , \sigma _{1}, \sigma _{2})=(0.9, 1,3)$$ and $$(\epsilon , \sigma _{1}, \sigma _{2})=(0.75, 1, \sqrt{7})$$, and the simulation results are presented in Table 2 and Table 3, respectively. For sequence III, we simulate $$(\epsilon , \sigma , k)=(0.2, 1, 6)$$ and $$(\epsilon , \sigma , k)=(0.5, 1, 3)$$, and the simulation results are presented in Table 4 and Table 5, respectively. The first figures in parentheses are the simulation results obtained by the empirical likelihood method, and the second figures are the simulation results obtained by the least square method.

From the simulation results in Tables 1–5, it can be seen that, for the different error distributions, the confidence region constructed by the empirical likelihood method has higher coverage probabilities across the different parameters, sample sizes, contamination levels, and contamination distributions. Moreover, the coverage of the empirical likelihood confidence region is closer to the nominal level 0.90. This shows that the empirical likelihood method is more robust than the least squares method.

## Real data analysis

In this section, we use our method to fit the student–teacher ratio (the number of students per teacher) in Chinese universities, provided by the website of the China National Bureau of Statistics (http://data.stats.gov.cn/easyquery.htm?cn=C01&zb=A060E01&sj=2019). The student–teacher ratio is an important index for measuring the level of universities. There are 70 available observations, denoted by $$X_{1}, X_{2},\ldots , X_{70}$$, representing the yearly student–teacher ratios in China over the period from 1949 to 2018. Let $$Y_{t} ={X_{t}}-{X_{t-1}}$$. The plots of the sample path, autocorrelation function (ACF), and partial autocorrelation function (PACF) for the series $$\{Y_{t}\}$$ are given in Figs. 1, 2, and 3, respectively. The ACF and PACF plots indicate an $$\operatorname{AR}(1)$$-like autocorrelation structure.
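The sample ACF used above to diagnose the AR(1)-like structure of the differenced series $$\{Y_{t}\}$$ can be computed without specialized packages; a minimal sketch:

```python
import numpy as np

def sample_acf(y, nlags=10):
    """Sample autocorrelations r_0, ..., r_nlags of a series,
    e.g. the differenced ratios Y_t = X_t - X_{t-1}."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = len(y)
    denom = np.sum(y * y)
    return np.array([np.sum(y[k:] * y[:n - k]) / denom for k in range(nlags + 1)])
```

A white-noise series has near-zero autocorrelations at all positive lags, while an AR(1)-like series shows a pronounced lag-one correlation, which is the pattern reported for $$\{Y_{t}\}$$.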

In what follows, based on the observations $$\{Y_{t}\}$$, we plot the empirical likelihood ratio confidence region at confidence level 0.95 (see Fig. 4). A simple calculation gives the least squares estimate $$\theta ^{*}=( 0.1616, 0.6191)$$, which is marked in Fig. 4. From Fig. 4, we can see that the least squares estimate $$\theta ^{*}$$ lies in the empirical likelihood ratio confidence region. Moreover, the empirical likelihood ratio confidence region is relatively small even though the confidence level is 0.95.

## Proofs

In order to establish Theorem 2.1, we first prove the following lemmas.

### Lemma 5.1

If $$(\mathbf{A1})$$ and $$(\mathbf{A2})$$ hold, then

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n}H_{t}( \theta ) \stackrel{d}{\longrightarrow }N(0,D) \quad \textit{as } n\rightarrow \infty ,$$
(10)

where

$$D= \begin{pmatrix} E(\varepsilon _{1}^{2}(X_{0}^{+})^{2}) & 0 \\ 0 & E(\varepsilon _{1}^{2}(X_{0}^{-})^{2}) \end{pmatrix}.$$

### Proof

Note that

\begin{aligned} \sqrt{n} \bigl(\theta ^{\ast }-\theta \bigr) =&\sqrt{n} \Biggl( \Biggl( \sum_{t=1}^{n}\mathscr{X}_{t} \mathscr{X}_{t}^{\tau } \Biggr)^{-1} \sum _{t=1}^{n}X_{t}\mathscr{X}_{t}- \theta \Biggr) \\ =&\sqrt{n} \Biggl( \Biggl(\sum_{t=1}^{n} \mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \Biggr)^{-1} \sum_{t=1}^{n}X_{t} \mathscr{X}_{t}- \Biggl(\sum_{t=1}^{n} \mathscr{X}_{t} \mathscr{X}_{t}^{\tau } \Biggr)^{-1}\sum_{t=1}^{n} \mathscr{X}_{t}\mathscr{X}_{t}^{ \tau }\theta \Biggr) \\ =&\sqrt{n} \Biggl(\sum_{t=1}^{n} \mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \Biggr)^{-1} \Biggl( \sum_{t=1}^{n}X_{t} \mathscr{X}_{t}-\sum_{t=1}^{n} \mathscr{X}_{t} \mathscr{X}_{t}^{\tau }\theta \Biggr) \\ =&\sqrt{n} \Biggl(\sum_{t=1}^{n} \mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \Biggr)^{-1} \Biggl( \sum_{t=1}^{n} \mathscr{X}_{t} \bigl(X_{t}-\mathscr{X}_{t}^{\tau } \theta \bigr) \Biggr) \\ =& \Biggl(\frac{1}{n}\sum_{t=1}^{n} \mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \Biggr)^{-1} \frac{1}{\sqrt{n}} \Biggl(\sum _{t=1}^{n} \mathscr{X}_{t} \bigl(X_{t}-\mathscr{X}_{t}^{ \tau } \theta \bigr) \Biggr). \end{aligned}

Therefore we have

\begin{aligned} \frac{1}{\sqrt{n}}\sum_{t=1}^{n}H_{t}( \theta ) =&\frac{1}{\sqrt{n}}\sum_{t=1}^{n} \bigl(X_{t}-\mathscr{X}_{t}^{\tau } \theta \bigr) \mathscr{X}_{t} \\ =&\frac{1}{n}\sum_{t=1}^{n} \mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \Biggl( \frac{1}{n}\sum_{t=1}^{n} \mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \Biggr)^{-1} \frac{1}{\sqrt{n}} \Biggl(\sum _{t=1}^{n} \mathscr{X}_{t} \bigl(X_{t}-\mathscr{X}_{t}^{ \tau } \theta \bigr) \Biggr) \\ =&\frac{1}{n}\sum_{t=1}^{n} \mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \sqrt{n} \bigl( \theta ^{\ast }-\theta \bigr). \end{aligned}
(11)

By the ergodic theorem, we have

$$\frac{1}{n}\sum_{t=1}^{n}\mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \stackrel{\mathrm{a.s.}}{\longrightarrow }E \bigl(\mathscr{X}_{t}\mathscr{X}_{t}^{\tau } \bigr) =E \begin{pmatrix} (X_{t-1}^{+})^{2} & 0 \\ 0 & (X_{t-1}^{-})^{2} \end{pmatrix} = \begin{pmatrix} E(X_{t-1}^{+})^{2} & 0 \\ 0 & E(X_{t-1}^{-})^{2} \end{pmatrix} \triangleq W.$$

According to the result of Lemma 1 in Hwang and Woo , we have

$$\sqrt{n} \bigl(\theta ^{\ast }-\theta \bigr)\stackrel{d}{ \longrightarrow }N \bigl(0,W^{-1}DW^{-1} \bigr) \quad \text{as } n \rightarrow \infty .$$
(12)

Combining with (11), we know that Lemma 5.1 holds. □

### Lemma 5.2

If $$(\mathbf{A1})$$ and $$(\mathbf{A2})$$ hold, then

$$\frac{1}{{n}}\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \stackrel{p}{\longrightarrow }D \quad \textit{as } n\rightarrow \infty .$$
(13)

### Proof

Notice that

$$\frac{1}{n}\sum_{t=1}^{n}H_{t}(\theta )H_{t}^{\tau }(\theta ) =\frac{1}{n}\sum_{t=1}^{n} \bigl(X_{t}-\mathscr{X}_{t}^{\tau }\theta \bigr)^{2}\mathscr{X}_{t}\mathscr{X}_{t}^{\tau } =\frac{1}{n}\sum_{t=1}^{n} \bigl(X_{t}-\mathscr{X}_{t}^{\tau }\theta \bigr)^{2} \begin{pmatrix} (X_{t-1}^{+})^{2} & 0 \\ 0 & (X_{t-1}^{-})^{2} \end{pmatrix} =\frac{1}{n}\sum_{t=1}^{n} \begin{pmatrix} \varepsilon _{t}^{2}(X_{t-1}^{+})^{2} & 0 \\ 0 & \varepsilon _{t}^{2}(X_{t-1}^{-})^{2} \end{pmatrix}.$$

Therefore, according to the ergodic theorem, Lemma 5.2 is established. □

### Lemma 5.3

If $$(\mathbf{A1})$$ and $$(\mathbf{A2})$$ hold, then

$$\max_{1\leq t\leq n} \bigl\Vert H_{t}(\theta ) \bigr\Vert =o_{p} \bigl(n^{\frac{1}{2}} \bigr) \quad \textit{as } n \rightarrow \infty .$$
(14)

### Proof

Based on assumption $$(\mathbf{A2})$$, we know that $$E \Vert H_{t}(\theta ) \Vert ^{2}<\infty$$, which implies that

$$\sum_{n=1}^{\infty }P \bigl( \bigl\Vert H_{t}(\theta ) \bigr\Vert ^{2}>n \bigr)< \infty .$$

Further, by assumption $$(\mathbf{A1})$$, the model is stationary, so

$$\sum_{n=1}^{\infty }P \bigl( \bigl\Vert H_{n}(\theta ) \bigr\Vert ^{2}>n \bigr)< \infty .$$

By the Borel–Cantelli lemma, $$\Vert H_{n}(\theta ) \Vert >n^{\frac{1}{2}}$$ occurs for at most finitely many n almost surely, which implies that $$\max_{1\leq t\leq n} \Vert H_{t}(\theta ) \Vert >n^{\frac{1}{2}}$$ occurs for at most finitely many n. Similarly, for any $$\varepsilon >0$$, $$\max_{1\leq t\leq n} \Vert H_{t}(\theta ) \Vert >\varepsilon n^{\frac{1}{2}}$$ occurs for at most finitely many n. Thus, Lemma 5.3 is established. □

### Lemma 5.4

If $$(\mathbf{A1})$$ and $$(\mathbf{A2})$$ hold, then

$$\Vert b \Vert =O_{p} \bigl(n^{-\frac{1}{2}} \bigr).$$
(15)

### Proof

Let $$b= \Vert b \Vert \varsigma$$. By (6), we have

\begin{aligned} 0 =&\frac{1}{n}\sum_{t=1}^{n} \frac{\varsigma ^{\tau }H_{t}(\theta )}{1+b^{\tau }(\theta )H_{t}(\theta )} \\ =&\varsigma ^{\tau }\frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta ) \biggl(1- \frac{b^{\tau }(\theta )H_{t}(\theta )}{1+b^{\tau }(\theta )H_{t}(\theta )} \biggr) \\ =&\varsigma ^{\tau }\frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta )- \varsigma ^{\tau }\frac{1}{n}\sum _{t=1}^{n} \frac{H_{t}(\theta )H^{\tau }_{t}(\theta )b(\theta )}{1+b^{\tau }(\theta )H_{t}(\theta )} \\ =&\varsigma ^{\tau }\frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta )- \varsigma ^{\tau }\frac{1}{n}\sum _{t=1}^{n} \frac{H_{t}(\theta )H^{\tau }_{t}(\theta )\varsigma \Vert b(\theta ) \Vert }{1+b^{\tau }(\theta )H_{t}(\theta )} \\ =&\varsigma ^{\tau }\frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta )- \bigl\Vert b(\theta ) \bigr\Vert \varsigma ^{\tau } \frac{1}{n}\sum_{t=1}^{n} \frac{H_{t}(\theta )H^{\tau }_{t}(\theta )}{1+b^{\tau }(\theta )H_{t}(\theta )} \varsigma . \end{aligned}

Hence we have

\begin{aligned} \varsigma ^{\tau }\frac{1}{n}\sum _{t=1}^{n}H_{t}(\theta ) =& \bigl\Vert b( \theta ) \bigr\Vert \varsigma ^{\tau }\tilde{D}_{n} \varsigma , \end{aligned}
(16)

where

$$\tilde{D}_{n}=\frac{1}{n}\sum_{t=1}^{n} \frac{H_{t}(\theta )H^{\tau }_{t}(\theta )}{1+b^{\tau }(\theta )H_{t}(\theta )}.$$

Let $${D}_{n}=\frac{1}{n}\sum_{t=1}^{n}H_{t}(\theta )H^{\tau }_{t}( \theta )$$. From (5), we can see that $$1+b^{\tau }(\theta )H_{t}(\theta )>0$$. Thus we have

\begin{aligned} \bigl\Vert b(\theta ) \bigr\Vert \varsigma ^{\tau }{D}_{n} \varsigma \leq & \bigl\Vert b(\theta ) \bigr\Vert \varsigma ^{\tau } \frac{1}{n}\sum_{t=1}^{n} \frac{H_{t}(\theta )H^{\tau }_{t}(\theta )}{1+b^{\tau }(\theta )H_{t}(\theta )} \varsigma \Bigl(1+\max_{1\leq t\leq n} b^{\tau }( \theta )H_{t}(\theta ) \Bigr) \\ \leq & \bigl\Vert b(\theta ) \bigr\Vert \varsigma ^{\tau } \tilde{D}_{n}\varsigma \Bigl(1+\max_{1 \leq t\leq n} b^{\tau }(\theta )H_{t}(\theta ) \Bigr) \\ \leq & \bigl\Vert b(\theta ) \bigr\Vert \varsigma ^{\tau } \tilde{D}_{n}\varsigma \Bigl(1+ \bigl\Vert b( \theta ) \bigr\Vert \max_{1\leq t\leq n} \bigl\Vert H_{t}(\theta ) \bigr\Vert \Bigr) \\ =&\varsigma ^{\tau }\frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta ) \Bigl(1+ \bigl\Vert b(\theta ) \bigr\Vert \max_{1\leq t\leq n} \bigl\Vert H_{t}(\theta ) \bigr\Vert \Bigr), \end{aligned}
(17)

which implies that

\begin{aligned} \bigl\Vert b(\theta ) \bigr\Vert \Biggl(\varsigma ^{\tau }{D}_{n}\varsigma -\max_{1\leq t\leq n} \bigl\Vert H_{t}(\theta ) \bigr\Vert \varsigma ^{\tau } \frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta ) \Biggr) \leq &\varsigma ^{\tau }\frac{1}{n}\sum _{t=1}^{n}H_{t}( \theta ). \end{aligned}
(18)

According to Lemma 5.1, we have

$$\frac{1}{\sqrt{n}}\sum_{t=1}^{n}H_{t}( \theta )=O_{p}(1),$$
(19)

which implies that

$$\varsigma ^{\tau }\frac{1}{n}\sum _{t=1}^{n}H_{t}(\theta )=O_{p} \bigl(n^{- \frac{1}{2}} \bigr).$$
(20)

Further, by Lemma 5.3, we obtain

\begin{aligned} \max_{1\leq t\leq n} \bigl\Vert H_{t}(\theta ) \bigr\Vert \varsigma ^{\tau }\frac{1}{n} \sum _{t=1}^{n}H_{t}( \theta ) =& \frac{1}{\sqrt{n}}\varsigma ^{\tau }\max_{1\leq t\leq n} \bigl\Vert H_{t}( \theta ) \bigr\Vert \frac{1}{\sqrt{n}} \sum _{t=1}^{n}H_{t}(\theta ) \\ =&\frac{1}{\sqrt{n}}o_{p} \bigl(n^{\frac{1}{2}} \bigr)O_{p}(1) \\ =&o_{p}(1). \end{aligned}
(21)

Note that D is a positive definite matrix. Thus we have

\begin{aligned} \varsigma ^{\tau }{D}_{n}\varsigma \stackrel{p}{\longrightarrow } \varsigma ^{\tau }{D}\varsigma >0 \end{aligned}
(22)

and

\begin{aligned} \sigma _{\min } + o_{p}(1)\leq \varsigma ^{\tau }{D}_{n}\varsigma \leq \sigma _{\max } + o_{p}(1), \end{aligned}
(23)

where $$\sigma _{\max }$$ and $$\sigma _{\min }$$ are the largest and the smallest eigenvalues of D, respectively. Combining (18)–(23), we can obtain that

\begin{aligned} \bigl\Vert b(\theta ) \bigr\Vert \bigl(\varsigma ^{\tau }{D}_{n}\varsigma +o_{p}(1) \bigr) =&O_{p} \bigl(n^{- \frac{1}{2}} \bigr). \end{aligned}
(24)

Combined with (22), we know that $$\Vert b(\theta ) \Vert =O_{p}(n^{-\frac{1}{2}})$$. Lemma 5.4 is established. □

### Lemma 5.5

If $$(\mathbf{A1})$$ and $$(\mathbf{A2})$$ hold, then

$$b(\theta )= \Biggl(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \Biggr)^{-1} \sum _{t=1}^{n}H_{t}(\theta )+B_{n},$$
(25)

where

$$B_{n}= \Biggl(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \Biggr)^{-1} \sum _{t=1}^{n} \frac{H_{t}(\theta )(b^{\tau }(\theta )H_{t}(\theta ))^{2}}{1+b^{\tau }(\theta )H_{t}(\theta )},$$
(26)

and

$$\Vert B_{n} \Vert =o_{p} \bigl(n^{-\frac{1}{2}} \bigr).$$
(27)

### Proof

By (6), we have

\begin{aligned} 0 =&\frac{1}{n}\sum_{t=1}^{n} \frac{ H_{t}(\theta )}{1+b^{\tau }(\theta )H_{t}(\theta )} \\ =&\frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta ) \biggl(1-b^{\tau }(\theta )H_{t}( \theta )+ \frac{(b^{\tau }(\theta )H_{t}(\theta ))^{2}}{1+b^{\tau }(\theta )H_{t}(\theta )} \biggr) \\ =&\frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta )-\frac{1}{n}\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta )b(\theta )+ \frac{1}{n}\sum _{t=1}^{n} \frac{H_{t}(\theta )(b^{\tau }(\theta )H_{t}(\theta ))^{2}}{1+b^{\tau }(\theta )H_{t}(\theta )}. \end{aligned}

Thus, (25) can be established.

In what follows, we consider (27). Note that

\begin{aligned} \Vert B_{n} \Vert =& \Biggl\Vert \Biggl( \sum_{t=1}^{n}H_{t}(\theta )H^{\tau }_{t}( \theta ) \Biggr)^{-1} \sum _{t=1}^{n} \frac{H_{t}(\theta )(b^{\tau }(\theta )H_{t}(\theta ))^{2}}{1+b^{\tau }(\theta )H_{t}(\theta )} \Biggr\Vert \\ =& \Biggl\Vert \Biggl(\frac{1}{n}\sum _{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}( \theta ) \Biggr)^{-1} \frac{1}{n}\sum_{t=1}^{n} \frac{H_{t}(\theta )(b^{\tau }(\theta )H_{t}(\theta ))^{2}}{1+b^{\tau }(\theta )H_{t}(\theta )} \Biggr\Vert \\ \leq & \Biggl\Vert \Biggl(\frac{1}{n}\sum _{t=1}^{n}H_{t}(\theta )H^{\tau }_{t}( \theta ) \Biggr)^{-1} \Biggr\Vert \Biggl\Vert \frac{1}{n} \sum_{t=1}^{n} \frac{H_{t}(\theta )(b^{\tau }(\theta )H_{t}(\theta ))^{2}}{1+b^{\tau }(\theta )H_{t}(\theta )} \Biggr\Vert \\ \leq & \Biggl\Vert \Biggl(\frac{1}{n}\sum _{t=1}^{n}H_{t}(\theta )H^{\tau }_{t}( \theta ) \Biggr)^{-1} \Biggr\Vert \bigl\Vert b^{\tau }(\theta ) \bigr\Vert ^{2}\frac{1}{n}\sum_{t=1}^{n} \frac{ \Vert H_{t}(\theta ) \Vert ^{3}}{ \Vert 1+b^{\tau }(\theta )H_{t}(\theta ) \Vert } \\ \leq &O_{p}(1)O_{p} \bigl(n^{-\frac{1}{2}} \bigr)O_{p} \bigl(n^{-\frac{1}{2}} \bigr)O_{p}(1) \\ =&o_{p} \bigl(n^{-\frac{1}{2}} \bigr). \end{aligned}
(28)

So (27) holds. □

### Lemma 5.6

If $$(\mathbf{A1})$$ and $$(\mathbf{A2})$$ hold, then

\begin{aligned} -2\log \bigl(L(\theta ) \bigr) =& \Biggl(\sum _{t=1}^{n}H_{t}(\theta ) \Biggr)^{\tau } \Biggl(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \Biggr)^{-1}\sum _{t=1}^{n}H_{t}( \theta ) \\ &{}-B^{\tau }_{n}\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}( \theta )B_{n}+2\sum _{t=1}^{n}\eta _{t}, \end{aligned}
(29)

where

\begin{aligned}& B^{\tau }_{n}\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta )B_{n}=o_{p}(1), \end{aligned}
(30)
\begin{aligned}& \sum_{t=1}^{n}\eta _{t}=o_{p}(1). \end{aligned}
(31)

### Proof

We expand

\begin{aligned} -2\log \bigl(L(\theta ) \bigr) =&2\sum _{t=1}^{n}\log \bigl(1+b^{\tau }(\theta )H_{t}(\theta ) \bigr) \\ =&2\sum_{t=1}^{n}b^{\tau }( \theta )H_{t}(\theta )-\sum_{t=1}^{n} \bigl(b^{\tau }(\theta )H_{t}(\theta ) \bigr)^{2}+2 \sum_{t=1}^{n} \eta _{t} \\ =&2 \Biggl( \Biggl(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \Biggr)^{-1} \sum _{t=1}^{n}H_{t}(\theta )+B_{n} \Biggr)^{\tau }\sum_{t=1}^{n}H_{t}( \theta ) \\ & {} - \Biggl( \Biggl(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \Biggr)^{-1} \sum _{t=1}^{n}H_{t}(\theta )+B_{n} \Biggr)^{\tau } \Biggl(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \Biggr) \\ & {} \times \Biggl( \Biggl(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \Biggr)^{-1} \sum _{t=1}^{n}H_{t}(\theta )+B_{n} \Biggr) \\ & {} +2\sum_{t=1}^{n}\eta _{t}. \end{aligned}
(32)

After a simple algebraic operation, we know that (29) holds.

Next, we consider (30). Note that

\begin{aligned} \Vert B^{\tau }_{n}\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta )B_{n} \Vert \leq & \bigl\Vert B^{\tau }_{n} \bigr\Vert \Biggl\Vert \sum_{t=1}^{n}H_{t}(\theta )H^{\tau }_{t}( \theta ) \Biggr\Vert \Vert B_{n} \Vert \\ =&o_{p} \bigl(n^{-\frac{1}{2}} \bigr)o_{p} \bigl(n^{-\frac{1}{2}} \bigr)O_{p}(n)=o_{p}(1), \end{aligned}
(33)

which implies that (30) holds.

Last, we consider (31). For this, we first prove that there exists a finite real number $$Q>0$$ such that

\begin{aligned} P \bigl( \vert \eta _{t} \vert \leq Q \bigl\vert b^{\tau }(\theta ) H_{t}(\theta ) \bigr\vert ^{3}, 1\leq t \leq n \bigr)\longrightarrow 1 \quad \text{as } n \longrightarrow \infty . \end{aligned}
(34)

Consider the third-order Taylor expansion of $$\log (1+x)$$ at $$x=0$$:

$$\log (1+x)=x-\frac{x^{2}}{2}+\frac{x^{3}}{3}+\varpi (x),$$

where $$\frac{\varpi (x)}{x^{3}}\rightarrow 0$$ as $$x\rightarrow 0$$. Therefore, there exists $$\iota >0$$ such that, for any $$\vert x \vert <\iota$$, $$\vert \frac{\varpi (x)}{x^{3}} \vert <\frac{1}{6}$$. In addition, note that

$$\max_{1\leq t\leq n} \bigl\Vert b^{\tau }(\theta )H_{t}( \theta ) \bigr\Vert =o_{p}(1).$$

Therefore, we have

$$\lim_{n\rightarrow \infty }P \Bigl\{ \max_{1\leq t\leq n} \bigl\vert b^{\tau }(\theta ) H_{t}(\theta ) \bigr\vert ^{3}< \iota ^{3} \Bigr\} =1.$$

Let $$A_{n}=\{\omega : \max_{1\leq t\leq n} \vert b^{\tau }(\theta ) H_{t}( \theta )\vert ^{3}<\iota ^{3}\}$$. It is easy to prove that, for any $$\omega \in A_{n}$$ and $$1\leq t\leq n$$,

$$\frac{ \vert \eta _{t} \vert }{ \vert b^{\tau }(\theta ) H_{t}(\theta ) \vert ^{3}}= \frac{ \vert \frac{(b^{\tau }(\theta ) H_{t}(\theta ))^{3}}{3}+\varpi (b^{\tau }(\theta ) H_{t}(\theta )) \vert }{ \vert b^{\tau }(\theta ) H_{t}(\theta ) \vert ^{3}}\leq \frac{1}{3}+ \frac{1}{6}=\frac{1}{2}.$$

Thus we have

$$P \bigl( \vert \eta _{t} \vert \leq Q \bigl\vert b^{\tau }(\theta ) H_{t}(\theta ) \bigr\vert ^{3}, 1\leq t \leq n \bigr) \longrightarrow 1 \quad \text{as } n\longrightarrow \infty ,$$

where $$Q=\frac{1}{2}$$. This implies that

\begin{aligned} \Biggl\Vert \sum_{t=1}^{n} \eta _{t} \Biggr\Vert \leq &Q \bigl\Vert b(\theta ) \bigr\Vert ^{3}\sum_{t=1}^{n} \bigl\Vert H_{t}(\theta ) \bigr\Vert ^{3} \\ \leq &O_{p} \bigl(n^{-\frac{3}{2}} \bigr)o_{p} \bigl(n^{\frac{3}{2}} \bigr) \\ =&o_{p}(1). \end{aligned}

□

### Proof of Theorem 2.1

By Lemma 5.6, we can conclude that $$-2\log (L(\theta ))$$ and $$(\sum_{t=1}^{n}H_{t}(\theta ))^{\tau }(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ))^{-1}\sum_{t=1}^{n}H_{t}( \theta )$$ have the same limit distribution. By Lemma 5.1 and Lemma 5.2, we can conclude that

$$\sum_{t=1}^{n}H_{t}^{\tau }( \theta ) \Biggl(\sum_{t=1}^{n}H_{t}( \theta )H^{\tau }_{t}(\theta ) \Biggr)^{-1}\sum _{t=1}^{n}H_{t}( \theta ) \stackrel{d}{\longrightarrow }\chi ^{2}(2).$$

Hence Theorem 2.1 holds. □

## Availability of data and materials

The data used to support the findings of this study are available from the corresponding author upon request.

## References

1. Tong, H.: Nonlinear Time Series. Oxford University Press, Oxford (1990)

2. Petruccelli, J.D., Woolford, S.W.: A threshold $$\operatorname{AR}(1)$$ model. J. Appl. Probab. 21, 270–286 (1984)

3. Brockwell, P.J., Liu, J., Tweedie, R.L.: On the existence of stationary threshold autoregressive moving-average processes. J. Time Ser. Anal. 13, 95–107 (1992)

4. Hwang, S.Y., Basawa, I.V.: Large sample inference for conditional exponential families with applications to nonlinear time series. J. Stat. Plan. Inference 38, 141–158 (1994)

5. Hwang, S.Y., Woo, M.J.: Threshold $$\operatorname{ARCH}(1)$$ processes: asymptotic inference. Stat. Probab. Lett. 55, 11–20 (2001)

6. Owen, A.B.: Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249 (1988)

7. Owen, A.B.: Empirical likelihood confidence regions. Ann. Stat. 18, 90–120 (1990)

8. Owen, A.B.: Empirical Likelihood. Chapman & Hall, New York (2001)

9. Owen, A.B.: Empirical likelihood for linear models. Ann. Stat. 19, 1725–1747 (1991)

10. Chen, S.X.: Empirical likelihood confidence intervals for linear regression coefficients. J. Multivar. Anal. 49, 24–40 (1994)

11. Xue, L.G.: Empirical likelihood for linear models with missing responses. J. Multivar. Anal. 100, 1353–1366 (2009)

12. Qin, Y.S., Li, Y.H.: Empirical likelihood for linear models under negatively associated errors. J. Syst. Sci. Complex. 102, 153–163 (2011)

13. Qin, J., Lawless, J.: Empirical likelihood and general estimating equations. Ann. Stat. 22, 300–325 (1994)

14. Kolaczyk, E.D.: Empirical likelihood for generalized linear models. Stat. Sin. 4, 199–218 (1994)

15. Chen, S.X., Cui, H.: An extended empirical likelihood for generalized linear models. Stat. Sin. 13, 69–81 (2003)

16. Xue, D., Xue, L.G., Cheng, W.H.: Empirical likelihood for generalized linear models with missing responses. J. Stat. Plan. Inference 141, 2007–2020 (2011)

17. Bai, Y., Fung, W.K., Zhu, Z.: Weighted empirical likelihood for generalized linear models with longitudinal data. J. Stat. Plan. Inference 140, 3446–3456 (2010)

18. Shi, J., Lau, T.S.: Empirical likelihood for partially linear models. J. Multivar. Anal. 72, 132–148 (2000)

19. Li, G.R., Lin, L., Zhu, L.X.: Empirical likelihood for a varying coefficient partially linear model with diverging number of parameters. J. Multivar. Anal. 105, 85–111 (2012)

20. Lu, X.W.: Empirical likelihood for heteroscedastic partially linear models. J. Multivar. Anal. 100, 387–396 (2009)

21. Yan, L., Chen, X.: Empirical likelihood for partly linear models with errors in all variables. J. Multivar. Anal. 130, 275–288 (2014)

22. Li, J.Y., Liang, W., He, S.Y., Wu, X.B.: Empirical likelihood for the smoothed LAD estimator in infinite variance autoregressive models. Stat. Probab. Lett. 80, 1420–1430 (2010)

23. Zhao, Z.W., Wang, D.H.: Empirical likelihood for an autoregressive model with explanatory variables. Commun. Stat., Theory Methods 40, 559–570 (2011)

24. Liu, X.H., Peng, L.: Asymptotic theory and unified confidence region for an autoregressive model. J. Time Ser. Anal. 40, 43–65 (2019)

25. Zhao, Z.W., Wang, D.H.: Statistical inference for generalized random coefficient autoregressive model. Math. Comput. Model. 56, 152–166 (2012)

26. Zhao, Z.W., Wang, D.H., Peng, C.X.: Coefficient constancy test in generalized random coefficient autoregressive model. Appl. Math. Comput. 219, 10283–10292 (2013)

27. Zhao, Z.W., Wang, D.H., Peng, C.X., Zhang, M.L.: Empirical likelihood-based inference for stationary-ergodicity of the generalized random coefficient autoregressive model. Commun. Stat., Theory Methods 44, 2586–2599 (2015)

28. Zhao, Z.W., Li, Y., Peng, C.X., Fu, Z.H.: Empirical likelihood-based inference in generalized random coefficient autoregressive model with conditional moment restrictions. J. Comput. Appl. Math. 348, 146–160 (2019)

29. Zhang, H.X., Wang, D.H., Zhu, F.K.: Empirical likelihood inference for random coefficient $$\operatorname{INAR}(p)$$ process. J. Time Ser. Anal. 32, 195–203 (2011)

30. Peng, C.X., Wang, D.H., Zhao, Z.W.: Empirical likelihood-based inference in Poisson autoregressive model with conditional moment restrictions. J. Inequal. Appl. 2015, 218 (2015)

31. Ding, X., Wang, D.H.: Empirical likelihood inference for $$\operatorname{INAR}(1)$$ model with explanatory variables. J. Korean Stat. Soc. 45, 623–632 (2016)

32. Zhao, Z.W., Wang, D.H., Peng, C.X.: Conditional heteroscedasticity test for Poisson autoregressive model. Commun. Stat., Theory Methods 46, 4437–4448 (2017)

## Acknowledgements

We thank the editor and the anonymous referee for their constructive comments and suggestions, which greatly improved this paper.

## Funding

This work is supported by the National Natural Science Foundation of China (Nos. 11571138, 11671054, 11301137, 11271155, 11371168, J1310022, 11501241) and the National Social Science Fund of China (No. 16BTJ020).

## Author information

Authors

### Contributions

All authors contributed equally and significantly in this manuscript, and they read and approved the final manuscript.

### Corresponding author

Correspondence to Cuixin Peng.

## Ethics declarations

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
