Empirical likelihood inference for threshold autoregressive conditional heteroscedasticity model

This paper considers the parameter estimation problem of a first-order threshold autoregressive conditional heteroscedasticity model by using the empirical likelihood method. We obtain the empirical likelihood ratio statistic based on the estimating equation of the least squares estimation and construct the confidence region for the model parameters. Simulation studies indicate that the empirical likelihood method outperforms the normal approximation-based method in terms of coverage probability.

When $\theta_1 = \theta_2$, model (1) reduces to the usual autoregressive model whose innovation is a conditionally heteroscedastic process. The threshold autoregressive model is a nonlinear time series model. Because it can explain nonlinear features such as asymmetry and limit cycles, it is widely used in time series modeling (see Tong [1]). Petruccelli and Woolford [2] first defined model (1) and investigated its properties and the associated parameter estimation problems, but they assumed that the errors are independent and identically distributed random variables. Brockwell et al. [3] and Hwang and Basawa [4] further generalized the model by allowing the coefficients to be random variables. Hwang and Woo [5] were the first to consider parameter estimation when the error sequence is a conditionally heteroscedastic process, and they proposed the conditional least squares method for estimating the model parameters. In this paper, we use the empirical likelihood method to make inference about the model parameters.
Owen [6][7][8] introduced the empirical likelihood method as a nonparametric analogue of parametric likelihood. It builds a likelihood function by placing positive probability on each of the observed data values, while typically making no assumptions about the data-generating mechanism. Empirical likelihood has several advantages over the normal approximation method. For example, the limiting distribution of the empirical likelihood ratio statistic is a chi-squared distribution, so the asymptotic variance need not be estimated when constructing a confidence region. Moreover, the confidence region is determined entirely by the data, because no assumptions are made about their probability distribution. These advantages have led statisticians to apply empirical likelihood to many kinds of statistical models, such as linear regression models [9][10][11][12][13], generalized linear models [14][15][16][17], and partially linear models [18][19][20][21]. In recent years, the empirical likelihood method has also been applied to time series models, such as autoregressive models [22][23][24], random coefficient autoregressive models [25][26][27][28], and integer-valued autoregressive models [29][30][31][32].
In this paper, we obtain the limiting distribution of the empirical log-likelihood ratio statistic and construct a confidence region for the parameters in model (1) by using the empirical likelihood method. Simulation studies indicate that the empirical likelihood method has a higher coverage probability than the normal approximation-based method.
This paper is organized as follows. In Sect. 2, we present the main methods and results. Some simulation results and real data analysis are given in Sect. 3. Section 4 is concerned with the proofs of the main results. Throughout, the symbols "$\xrightarrow{d}$" and "$\xrightarrow{p}$" denote convergence in distribution and convergence in probability, respectively. $O_p(1)$ denotes a term that is bounded in probability, and $o_p(1)$ denotes a term that converges to zero in probability. "Almost surely" and "independent and identically distributed" are abbreviated as "a.s." and "i.i.d.", respectively.

Methods and main results
For model (1), Hwang and Woo [5] obtained the least squares estimator of the model parameters and its limiting properties. Here we use the empirical likelihood method to make inference about the model parameters. Before giving the main results, we assume that the following conditions hold:
(A1) The probability density function $f(\cdot)$ of $e_t$ has its support on $(-\infty, +\infty)$; $\theta_{\max} + \sqrt{\alpha_{\max}} < 1$.
According to Theorem 1 in Hwang and Woo [5], if (A1) holds, then $\{X_t, t \ge 1\}$ is geometrically ergodic, and the sequence $\{X_t\}$ has a unique stationary distribution.
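Hwang and Woo [5] used the least squares method to estimate the model parameters. Let $\theta = (\theta_1, \theta_2)^{\tau}$, and let $\mathbf{X}_t$ denote the two-dimensional regressor vector of model (1). Based on the observations $\{X_0, X_1, \ldots, X_n\}$, the least squares estimator $\theta^*$ of $\theta$ is obtained by minimizing the sum of squared residuals
$$\sum_{t=1}^{n} \bigl(X_t - \mathbf{X}_t^{\tau}\theta\bigr)^2$$
with respect to $\theta$. The resulting estimating equation (2) can be written as
$$\sum_{t=1}^{n} \bigl(X_t - \mathbf{X}_t^{\tau}\theta\bigr)\mathbf{X}_t = 0. \qquad (3)$$
Further, let $H_t(\theta) = (X_t - \mathbf{X}_t^{\tau}\theta)\mathbf{X}_t$. By (3), we obtain the following empirical likelihood ratio statistic:
$$L(\theta) = \max\Bigl\{\prod_{t=1}^{n} n p_t : p_t \ge 0,\ \sum_{t=1}^{n} p_t = 1,\ \sum_{t=1}^{n} p_t H_t(\theta) = 0\Bigr\}.$$
By the Lagrange multiplier method, the maximum is attained at
$$p_t = \frac{1}{n}\,\frac{1}{1 + b^{\tau}(\theta) H_t(\theta)}, \quad t = 1, \ldots, n,$$
where the Lagrange multiplier $b(\theta)$ satisfies
$$\frac{1}{n}\sum_{t=1}^{n} \frac{H_t(\theta)}{1 + b^{\tau}(\theta) H_t(\theta)} = 0.$$
Therefore, we have
$$-2\log\bigl(L(\theta)\bigr) = 2\sum_{t=1}^{n} \log\bigl(1 + b^{\tau}(\theta) H_t(\theta)\bigr).$$
The following theorem shows that the limiting distribution of $-2\log(L(\theta))$ is a chi-squared distribution.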
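Theorem 2.1 Under the conditions stated above, as $n \to \infty$,
$$-2\log\bigl(L(\theta)\bigr) \xrightarrow{d} \chi^2(2),$$
where $\chi^2(2)$ is the chi-squared distribution with two degrees of freedom.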
Using the above theorem, we can construct the empirical likelihood ratio confidence region for the parameter $\theta$. For $0 < \delta < 1$, the $100(1-\delta)\%$ asymptotic confidence region for $\theta$ is
$$\bigl\{\theta : -2\log\bigl(L(\theta)\bigr) \le \chi^2_{\delta}(2)\bigr\},$$
where $\chi^2_{\delta}(2)$ is the upper $\delta$ quantile of the chi-squared distribution with two degrees of freedom.
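To make the construction concrete, the following is a minimal Python sketch of how the statistic $-2\log(L(\theta))$ could be evaluated numerically once the values $H_t(\theta)$ have been computed at a candidate $\theta$. It is an illustration under stated assumptions, not the computation used in this paper: the function names `el_statistic`, `chi2_2_upper_quantile`, and `in_confidence_region` are hypothetical, the multiplier $b(\theta)$ is found by a damped Newton iteration (one common choice), and the input `H` is assumed to be a list of two-dimensional values whose convex hull contains zero in its interior.

```python
import math

def el_statistic(H, max_iter=100, tol=1e-10):
    """Return (-2 log L, b) for a list of 2-d estimating-function values H_t.

    Assumes 0 lies strictly inside the convex hull of the H_t and that the
    H_t are not all collinear; otherwise the multiplier b does not exist.
    """
    def log_sum(b):
        # sum_t log(1 + b'H_t); -inf if some implied weight p_t is not positive
        total = 0.0
        for h1, h2 in H:
            a = 1.0 + b[0] * h1 + b[1] * h2
            if a <= 0.0:
                return -math.inf
            total += math.log(a)
        return total

    b = (0.0, 0.0)
    for _ in range(max_iter):
        # g = sum H_t / (1 + b'H_t), S = sum H_t H_t' / (1 + b'H_t)^2
        g1 = g2 = s11 = s12 = s22 = 0.0
        for h1, h2 in H:
            a = 1.0 + b[0] * h1 + b[1] * h2
            g1 += h1 / a
            g2 += h2 / a
            s11 += h1 * h1 / a ** 2
            s12 += h1 * h2 / a ** 2
            s22 += h2 * h2 / a ** 2
        if math.hypot(g1, g2) < tol:
            break  # the multiplier equation sum H_t / (1 + b'H_t) = 0 holds
        det = s11 * s22 - s12 * s12
        # Newton direction S^{-1} g for the 2 x 2 system
        d1 = (s22 * g1 - s12 * g2) / det
        d2 = (s11 * g2 - s12 * g1) / det
        # Halve the step until the concave objective does not decrease
        step, current = 1.0, log_sum(b)
        while step > 1e-12:
            trial = (b[0] + step * d1, b[1] + step * d2)
            if log_sum(trial) >= current:
                b = trial
                break
            step /= 2.0
        else:
            break  # no acceptable step found; stop iterating
    return 2.0 * log_sum(b), b

def chi2_2_upper_quantile(delta):
    # With 2 degrees of freedom, P(chi2 > x) = exp(-x / 2),
    # so the upper-delta quantile is -2 log(delta).
    return -2.0 * math.log(delta)

def in_confidence_region(H, delta=0.10):
    # True if the candidate theta that produced H is in the 100(1 - delta)% region
    stat, _ = el_statistic(H)
    return stat <= chi2_2_upper_quantile(delta)
```

In use, one would evaluate $H_t(\theta)$ on a grid of candidate values of $\theta$ and keep those for which `in_confidence_region` returns `True`; the closed-form quantile is specific to two degrees of freedom.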

Simulation studies
In this section, we carry out simulation studies to compare the performance of our empirical likelihood (EL) method with that of the least squares (LS) method proposed by Hwang and Woo [5]. We simulate model (1) under the following three error sequences.
Sequence I: $\{e_t\}$ is a sequence of independent and identically distributed (i.i.d.) standard normal $N(0, 1)$ random variables.
Sequence II: $\{e_t\}$ is an independent noise sequence with an $\varepsilon$-contamination distribution, whose distribution function is determined by $(\varepsilon, \sigma_1, \sigma_2)$, where $\sigma_i > 0$ $(i = 1, 2)$, $\varepsilon$ is a fixed constant satisfying $0 < \varepsilon < 1$, and $\Phi(x)$ is the distribution function of the standard normal random variable.
Sequence III: $\{e_t\}$ is a sequence of i.i.d. random variables with a mixture distribution, whose distribution function is determined by $(\varepsilon, \sigma, k)$, where $\sigma > 0$, $0 < \varepsilon < 1$, $T_k(x)$ is the distribution function of the t distribution with $k$ degrees of freedom, and $\Phi(x)$ is the distribution function of the standard normal random variable.
We calculate the coverage probabilities of the empirical likelihood and least squares methods for different model parameters. The nominal confidence level $1 - \delta$ is chosen to be 0.90. All simulation studies are based on 1000 repetitions, and the sample sizes considered are $n = 100$, 300, and 500. The simulation results for sequence I are presented in Table 1. For sequence II, we simulate $(\varepsilon, \sigma_1, \sigma_2) = (0.9, 1, 3)$ and $(\varepsilon, \sigma_1, \sigma_2) = (0.75, 1, \sqrt{7})$, and the results are presented in Table 2 and Table 3, respectively. For sequence III, we simulate $(\varepsilon, \sigma, k) = (0.2, 1, 6)$ and $(\varepsilon, \sigma, k) = (0.5, 1, 3)$, and the results are presented in Table 4 and Table 5, respectively. In each pair of parentheses, the first figure is the result obtained by the empirical likelihood method, and the second is the result obtained by the least squares method.
From the simulation results in Tables 1-5, it can be seen that, across the error distributions considered, the confidence region constructed by the empirical likelihood method has a higher coverage probability than the least squares region for the different parameters, sample sizes, contamination levels, and contamination distributions. Moreover, the coverage probability of the empirical likelihood confidence region is closer to the nominal level 0.90. This suggests that the empirical likelihood method is more robust than the least squares method.
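When comparing simulated coverage probabilities, it can help to keep the Monte Carlo error of the simulation itself in mind. The short Python sketch below, with hypothetical function names, computes an estimated coverage probability and its binomial standard error from per-replication indicators, and the standard error one would expect at a given nominal level. It assumes independent replications and is only a reading aid for the tables, not part of the procedure used in this paper.

```python
import math

def coverage_with_mc_error(hits):
    """hits: one boolean per replication, True if the region covered the true parameter."""
    hits = list(hits)
    r = len(hits)
    p_hat = sum(hits) / r                       # estimated coverage probability
    se = math.sqrt(p_hat * (1.0 - p_hat) / r)   # binomial standard error
    return p_hat, se

def nominal_mc_error(level, repetitions):
    # Standard error of a simulated coverage if the true coverage equals `level`
    return math.sqrt(level * (1.0 - level) / repetitions)

# With 1000 repetitions at nominal level 0.90 this is sqrt(0.09 / 1000)
print(nominal_mc_error(0.90, 1000))
```

For 1000 repetitions at level 0.90 the standard error is roughly 0.0095, so differences between two tabulated coverages that are much smaller than this may reflect simulation noise rather than a difference between methods.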

Real data analysis
In this section, we apply our method to student-teacher ratio data (number of teachers = 1) for Chinese universities, which are provided on the website of the China National Bureau of Statistics (http://data.stats.gov.cn/easyquery.htm?cn=C01&zb=A060E01&sj=2019). The student-teacher ratio is an important index for measuring the level of universities. The corresponding plots of the sample autocorrelation function (ACF) and partial autocorrelation function (PACF) indicate an AR(1)-like autocorrelation structure.

Table 2 The simulation results for sequence II, $(\varepsilon, \sigma_1, \sigma_2) = (0.9, 1, 3)$
In what follows, based on the observations $\{Y_t\}$, we plot the empirical likelihood ratio confidence region at confidence level 0.95 (see Fig. 4). After a simple calculation, we obtain the least squares estimate $\theta^* = (0.1616, 0.6191)$, which is marked by * in Fig. 4. From Fig. 4, we can see that the least squares estimate $\theta^*$ lies inside the empirical likelihood ratio confidence region. Moreover, the empirical likelihood ratio confidence region is relatively small even though the confidence level is 0.95.

Table 5 The simulation results for sequence III, $(\varepsilon, \sigma, k) = (0.5, 1, 3)$

Proofs
Figure 4 The empirical likelihood ratio confidence region

In order to establish Theorem 2.1, we first prove the following lemmas.
Proof of Lemma 5.1 The argument applies the ergodic theorem and then the result of Lemma 1 in Hwang and Woo [5]. Combining these with (11), we conclude that Lemma 5.1 holds.
Therefore, according to the ergodic theorem, Lemma 5.2 is established.
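Note that $D$ is a positive definite matrix. Thus we have
$$\varsigma^{\tau} D_n \varsigma \xrightarrow{p} \varsigma^{\tau} D \varsigma > 0 \qquad (22)$$
and relation (23), where $\sigma_{\min}$ and $\sigma_{\max}$ are the smallest and the largest eigenvalues of $D$, respectively. Combining (18)-(23), we obtain
$$\|b(\theta)\|\,\bigl(\varsigma^{\tau} D_n \varsigma + o_p(1)\bigr) = O_p\bigl(n^{-1/2}\bigr).$$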