Maximum likelihood estimators in linear regression models with Ornstein-Uhlenbeck process

Abstract

The paper studies the linear regression model

$$y_t = x_t^T\beta + \varepsilon_t,\qquad t=1,2,\ldots,n,$$

where

$$d\varepsilon_t = \lambda(\mu - \varepsilon_t)\,dt + \sigma\,dB_t,$$

with parameters $\lambda,\sigma\in\mathbb{R}^+$, $\mu\in\mathbb{R}$, and $\{B_t, t\ge 0\}$ the standard Brownian motion. First, the maximum likelihood (ML) estimators of $\beta$, $\lambda$ and $\sigma^2$ are derived. Second, under general conditions, the asymptotic properties of the ML estimators are investigated. Limiting distributions of the likelihood ratio test statistics for the hypothesis are also given. Finally, the validity of the method is illustrated by two real examples.

MSC: 62J05, 62M10, 60J60.

1 Introduction

Consider the following linear regression model

$$y_t = x_t^T\beta + \varepsilon_t,\qquad t=1,2,\ldots,n,$$
(1.1)

where the $y_t$'s are scalar response variables, the $x_t$'s are explanatory variables, $\beta$ is an $m$-dimensional unknown parameter, and $\{\varepsilon_t\}$ is an Ornstein-Uhlenbeck process, which satisfies the linear stochastic differential equation (SDE)

$$d\varepsilon_t = \lambda(\mu - \varepsilon_t)\,dt + \sigma\,dB_t$$
(1.2)

with parameters $\lambda,\sigma\in\mathbb{R}^+$, $\mu\in\mathbb{R}$, and $\{B_t, t\ge 0\}$ the standard Brownian motion.

It is well known that the linear regression model is among the most important and popular models in the statistical literature, and it has attracted many researchers. For the ordinary linear regression model (when the errors are independent and identically distributed (i.i.d.) random variables), Wang and Zhou [1], Anatolyev [2], Bai and Guo [3], Chen [4], Gil et al. [5], Hampel et al. [6], Cui [7], Durbin [8] and Li and Yang [9] used various estimation methods to obtain estimators of the unknown parameters in (1.1) and discussed large- or small-sample properties of these estimators. Recently, linear regression with serially correlated errors has attracted increasing attention from statisticians and economists. One case of considerable interest is that of autoregressive errors: Hu [10], Wu [11], and Fox and Taqqu [12] established asymptotic normality with the usual $\sqrt{n}$-normalization in the case of long-memory stationary Gaussian observation errors. Giraitis and Surgailis [13] extended this result to non-Gaussian linear sequences. Koul and Surgailis [14] established the asymptotic normality of the Whittle estimator in linear regression models with non-Gaussian long-memory moving average errors. Shiohama and Taniguchi [15] estimated the regression parameters in a linear regression model with autoregressive errors. Fan [16] investigated moderate deviations for M-estimators in linear models with ϕ-mixing errors.

The Ornstein-Uhlenbeck process was originally introduced by Ornstein and Uhlenbeck [17] as a model for particle motion in a fluid. In physical sciences, the Ornstein-Uhlenbeck process is a prototype of a noisy relaxation process, whose probability density function f(x,t) can be described by the Fokker-Planck equation (see Janczura et al. [18], Debbasch et al. [19], Gillespie [20], Ditlevsen and Lansky [21], Garbaczewski and Olkiewicz [22], Plastino and Plastino [23]):

$$\frac{\partial f(x,t)}{\partial t} = \frac{\partial}{\partial x}\bigl(\lambda(x-\mu)f(x,t)\bigr) + \frac{\sigma^2}{2}\,\frac{\partial^2 f(x,t)}{\partial x^2}.$$

This process is now widely used in many areas of application. The main characteristic of the Ornstein-Uhlenbeck process is the tendency to return towards the long-term equilibrium μ. This property, known as mean-reversion, is found in many real life processes, e.g., in commodity and energy price processes (see Fasen [24], Yu [25], Geman [26]). There are a number of papers concerned with the Ornstein-Uhlenbeck process, for example, Janczura et al. [18], Zhang et al. [27], Rieder [28], Iacus [29], Bishwal [30], Shimizu [31], Zhang and Zhang [32], Chronopoulou and Viens [33], Lin and Wang [34] and Xiao et al. [35]. It is well known that the solution of model (1.2) is an autoregressive process. For a constant or functional or random coefficient autoregressive model, many people (for example, Magdalinos [36], Andrews and Guggenberger [37], Fan and Yao [38], Berk [39], Goldenshluger and Zeevi [40], Liebscher [41], Baran et al. [42], Distaso [43] and Harvill and Ray [44]) used various estimation methods to obtain estimators and discussed some asymptotic properties of these estimators, or investigated hypotheses testing.

By (1.1) and (1.2), we obtain that the more general process $\{y_t\}$ satisfies the SDE

$$dy_t = \lambda\bigl(L(t,\lambda,\mu,\beta) - y_t\bigr)\,dt + \sigma\,dB_t,$$
(1.3)

where $L(t,\lambda,\mu,\beta)$ is a time-dependent mean reversion level with three parameters. Thus, model (1.3) is a general Ornstein-Uhlenbeck process. Its special cases have gained much attention and have been applied to many fields such as economics, physics, geography, geology, biology and agriculture. Dehling et al. [45] considered maximum likelihood estimation for this model and proved strong consistency and asymptotic normality of the estimator. Lin and Wang [34] established the existence of a successful coupling for a class of stochastic differential equations given by (1.3). Bishwal [30] investigated the uniform rate of weak convergence of the minimum contrast estimator in the Ornstein-Uhlenbeck process (1.3).

The solution of model (1.2) is given by

$$\varepsilon_t = e^{-\lambda t}\varepsilon_0 + \mu\bigl(1 - e^{-\lambda t}\bigr) + \sigma\int_0^t e^{\lambda(s-t)}\,dB_s,$$
(1.4)

where $\int_0^t e^{\lambda(s-t)}\,dB_s \sim N\bigl(0, \frac{1-e^{-2\lambda t}}{2\lambda}\bigr)$.

The process observed in discrete time is more relevant in statistics and economics. Therefore, by (1.4), the Ornstein-Uhlenbeck time series for $t=1,2,\ldots,n$ is given by

$$\varepsilon_t = e^{-\lambda d}\varepsilon_{t-1} + \mu\bigl(1 - e^{-\lambda d}\bigr) + \sigma\sqrt{\frac{1-e^{-2\lambda d}}{2\lambda}}\,\eta_t,$$
(1.5)

where the $\eta_t \sim N(0,1)$ are i.i.d. random errors and $d$ is an equidistant time lag fixed in advance. Models (1.1) and (1.5) include many special cases, such as the linear regression model with constant-coefficient autoregressive errors (when $\mu=0$; see Hu [10], Wu [11], Maller [46], Pere [47] and Fuller [48]), Ornstein-Uhlenbeck time series or processes (when $\beta=0$; see Rieder [28], Iacus [29], Bishwal [30], Shimizu [31] and Zhang and Zhang [32]), and constant-coefficient autoregressive processes (when $\mu=0$, $\beta=0$; see Chambers [49], Hamilton [50], Brockwell and Davis [51] and Abadir and Lucas [52], etc.).
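
To make the discretization (1.5) concrete, the following sketch simulates models (1.1) and (1.5). It is only an illustration; the function name, the parameter values and the use of NumPy are our own choices and are not prescribed by the paper.

```python
import numpy as np

def simulate_ou_regression(x, beta, lam, sigma, mu=0.0, d=1.0, eps0=0.0, seed=None):
    """Simulate y_t = x_t' beta + eps_t with eps_t following the discretized
    Ornstein-Uhlenbeck recursion (1.5)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    a = np.exp(-lam * d)                                               # AR coefficient exp(-lambda*d)
    s = sigma * np.sqrt((1.0 - np.exp(-2.0 * lam * d)) / (2.0 * lam))  # innovation std. dev.
    eps = np.empty(n)
    prev = eps0
    for t in range(n):
        prev = a * prev + mu * (1.0 - a) + s * rng.standard_normal()
        eps[t] = prev
    return x @ beta + eps

# Illustrative values only (not taken from the paper).
x = np.column_stack([np.ones(200), np.linspace(0.0, 1.0, 200)])
y = simulate_ou_regression(x, beta=np.array([1.0, 0.5]), lam=0.8, sigma=1.0, seed=0)
```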

This paper discusses models (1.1) and (1.5). The organization of the paper is as follows. In Section 2, estimators of $\beta$, $\lambda$ and $\sigma^2$ are obtained by the quasi-maximum likelihood (QML) method. Under general conditions, the existence, consistency and asymptotic normality of the quasi-maximum likelihood estimators are investigated in Section 3. Hypothesis testing is discussed in Section 4. Some preliminary lemmas are presented in Section 5. The main proofs of the theorems are given in Section 6, and two real examples are analyzed in Section 7.

2 Estimation method

Without loss of generality, we assume that $\mu=0$ and $\varepsilon_0=0$ in the sequel. Write the ‘true’ model as

$$y_t = x_t^T\beta_0 + e_t,\qquad t=1,2,\ldots,n$$
(2.1)

and

$$e_t = \exp(-\lambda_0 d)\,e_{t-1} + \sigma_0\sqrt{\frac{1-\exp(-2\lambda_0 d)}{2\lambda_0}}\,\eta_t,$$
(2.2)

where the $\eta_t \sim N(0,1)$ are i.i.d.

By (2.2), we have

$$e_t = \sigma_0\sqrt{\frac{1-\exp(-2\lambda_0 d)}{2\lambda_0}}\sum_{j=1}^{t}\exp\{-\lambda_0 d(t-j)\}\,\eta_j.$$
(2.3)

Thus $e_t$ is measurable with respect to the $\sigma$-field $\mathcal{H}_t$ generated by $\eta_1,\eta_2,\ldots,\eta_t$, and

$$E e_t = 0,\qquad \mathrm{Var}(e_t) = \sigma_0^2\,\frac{1-\exp(-2\lambda_0 d)}{2\lambda_0}\sum_{j=1}^{t}\exp\{-2\lambda_0 d(t-j)\} = \sigma_0^2\,\frac{1-\exp(-2\lambda_0 dt)}{2\lambda_0}.$$
(2.4)

Using arguments similar to those of Rieder [28] or Maller [46], we obtain the log-likelihood of $y_2, y_3, \ldots, y_n$ conditional on $y_1$:

$$\Psi_n(\beta,\lambda,\sigma^2) = \log L_n = -\tfrac{1}{2}(n-1)\log\Bigl(\frac{\pi\sigma^2}{\lambda}\Bigr) - \tfrac{1}{2}(n-1)\log\bigl(1-\exp(-2\lambda d)\bigr) - \frac{\lambda}{\sigma^2(1-\exp(-2\lambda d))}\sum_{t=2}^{n}\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)^2.$$
(2.5)

We maximize (2.5) to obtain QML estimators, denoted by $\hat\sigma_n^2$, $\hat\beta_n$, $\hat\lambda_n$ (when they exist). The first derivatives of $\Psi_n$ may be written as

$$\frac{\partial\Psi_n}{\partial\sigma^2} = -\frac{n-1}{2\sigma^2} + \frac{\lambda}{\sigma^4(1-\exp(-2\lambda d))}\sum_{t=2}^{n}\bigl(\varepsilon_t-\exp(-\lambda d)\varepsilon_{t-1}\bigr)^2,$$
(2.6)
$$\frac{\partial\Psi_n}{\partial\lambda} = \frac{n-1}{2\lambda} - \frac{(n-1)d\exp(-2\lambda d)}{1-\exp(-2\lambda d)} - \frac{2d\lambda\exp(-\lambda d)}{\sigma^2(1-\exp(-2\lambda d))}\sum_{t=2}^{n}\bigl(\varepsilon_t-\exp(-\lambda d)\varepsilon_{t-1}\bigr)\varepsilon_{t-1} - \frac{1-(1+2d\lambda)\exp(-2\lambda d)}{\sigma^2(1-\exp(-2\lambda d))^2}\sum_{t=2}^{n}\bigl(\varepsilon_t-\exp(-\lambda d)\varepsilon_{t-1}\bigr)^2$$
(2.7)

and

$$\frac{\partial\Psi_n}{\partial\beta} = \frac{2\lambda}{\sigma^2(1-\exp(-2\lambda d))}\sum_{t=2}^{n}\bigl(\varepsilon_t-\exp(-\lambda d)\varepsilon_{t-1}\bigr)\bigl(x_t-\exp(-\lambda d)x_{t-1}\bigr).$$
(2.8)

Thus $\hat\sigma_n^2$, $\hat\beta_n$, $\hat\lambda_n$ satisfy the following estimating equations:

$$\hat\sigma_n^2 = \frac{2\hat\lambda_n}{(n-1)(1-\exp(-2\hat\lambda_n d))}\sum_{t=2}^{n}\bigl(\hat\varepsilon_t-\exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)^2,$$
(2.9)
$$\frac{\hat\sigma_n^2\bigl(1-(1+2d\hat\lambda_n)\exp(-2\hat\lambda_n d)\bigr)}{2\hat\lambda_n} - \frac{2d\hat\lambda_n\exp(-\hat\lambda_n d)}{n-1}\sum_{t=2}^{n}\bigl(\hat\varepsilon_t-\exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)\hat\varepsilon_{t-1} - \frac{1-(1+2d\hat\lambda_n)\exp(-2\hat\lambda_n d)}{(1-\exp(-2\hat\lambda_n d))(n-1)}\sum_{t=2}^{n}\bigl(\hat\varepsilon_t-\exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)^2 = 0$$
(2.10)

and

$$\sum_{t=2}^{n}\bigl(\hat\varepsilon_t-\exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)\bigl(x_t-\exp(-\hat\lambda_n d)x_{t-1}\bigr) = 0,$$
(2.11)

where

$$\hat\varepsilon_t = y_t - x_t^T\hat\beta_n.$$
(2.12)
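
In practice the estimating equations (2.9)-(2.11) can be solved numerically. The sketch below instead maximizes the conditional log-likelihood (2.5) directly, with $\sigma^2$ profiled out through (2.9); the use of SciPy's optimizer, the log-parametrization of $\lambda$ and the function names are implementation choices of ours, not part of the paper.

```python
import numpy as np
from scipy.optimize import minimize

def neg_profile_loglik(params, y, x, d=1.0):
    """Minus the conditional log-likelihood (2.5) with sigma^2 profiled out via (2.9).
    params = (beta_1, ..., beta_m, log_lambda); the log keeps lambda > 0."""
    m = x.shape[1]
    beta, lam = params[:m], np.exp(params[m])
    eps = y - x @ beta
    a = np.exp(-lam * d)
    rss = np.sum((eps[1:] - a * eps[:-1]) ** 2)
    n = len(y)
    sigma2 = 2.0 * lam * rss / ((n - 1) * (1.0 - np.exp(-2.0 * lam * d)))  # (2.9)
    return 0.5 * (n - 1) * (np.log(np.pi * sigma2 / lam)
                            + np.log(1.0 - np.exp(-2.0 * lam * d)) + 1.0)

def fit_qml(y, x, d=1.0, lam0=1.0):
    """Numerical QML fit: returns (beta_hat, lambda_hat, sigma2_hat)."""
    m = x.shape[1]
    start = np.concatenate([np.linalg.lstsq(x, y, rcond=None)[0], [np.log(lam0)]])
    res = minimize(neg_profile_loglik, start, args=(y, x, d), method="Nelder-Mead")
    beta_hat, lam_hat = res.x[:m], np.exp(res.x[m])
    eps = y - x @ beta_hat
    rss = np.sum((eps[1:] - np.exp(-lam_hat * d) * eps[:-1]) ** 2)
    sigma2_hat = 2.0 * lam_hat * rss / ((len(y) - 1) * (1.0 - np.exp(-2.0 * lam_hat * d)))
    return beta_hat, lam_hat, sigma2_hat
```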

To obtain our results, the following conditions are sufficient (see Maller [46]).

(A1) $X_n = \sum_{t=2}^{n} x_t x_t^T$ is positive definite for sufficiently large $n$, and

$$\lim_{n\to\infty}\max_{1\le t\le n} x_t^T X_n^{-1} x_t = 0.$$
(2.13)

(A2)

$$\limsup_{n\to\infty} |\tilde\lambda|_{\max}\bigl(X_n^{-1/2} Z_n X_n^{-T/2}\bigr) < 1,$$
(2.14)

where $Z_n = \frac{1}{2}\sum_{t=2}^{n}(x_t x_{t-1}^T + x_{t-1}x_t^T)$ and $|\tilde\lambda|_{\max}(\cdot)$ denotes the maximum absolute value of the eigenvalues of a symmetric matrix.

For ease of exposition, we shall introduce the following notations which will be used later in the paper.

Let $\theta=(\beta,\lambda)$ denote the $(m+1)$-vector of parameters. Define

$$S_n(\theta) = \sigma^2\frac{\partial\Psi_n}{\partial\theta} = \sigma^2\Bigl(\frac{\partial\Psi_n}{\partial\beta},\frac{\partial\Psi_n}{\partial\lambda}\Bigr),\qquad F_n(\theta) = -\sigma^2\frac{\partial^2\Psi_n}{\partial\theta\,\partial\theta^T}.$$
(2.15)

By (2.7) and (2.8), we get the components of F n (θ)

$$-\sigma^2\frac{\partial^2\Psi_n}{\partial\beta\,\partial\beta^T} = \frac{2\lambda}{1-\exp(-2\lambda d)}\sum_{t=2}^{n}\bigl(x_t-\exp(-\lambda d)x_{t-1}\bigr)\bigl(x_t-\exp(-\lambda d)x_{t-1}\bigr)^T =: \frac{2\lambda}{1-\exp(-2\lambda d)}\,X_n(\lambda),$$
(2.16)
$$-\sigma^2\frac{\partial^2\Psi_n}{\partial\beta\,\partial\lambda} = \frac{2d\lambda\exp(-\lambda d)}{1-\exp(-2\lambda d)}\sum_{t=2}^{n}\bigl(\varepsilon_{t-1}x_t+\varepsilon_t x_{t-1}-2\exp(-\lambda d)x_{t-1}\varepsilon_{t-1}\bigr) - \frac{1-(1+2d\lambda)\exp(-2\lambda d)}{(1-\exp(-2\lambda d))^2}\sum_{t=2}^{n}\bigl(\varepsilon_t-\exp(-\lambda d)\varepsilon_{t-1}\bigr)\bigl(x_t-\exp(-\lambda d)x_{t-1}\bigr)$$
(2.17)

and

$$-\sigma^2\frac{\partial^2\Psi_n}{\partial\lambda^2} = \frac{\sigma^2(n-1)}{2\lambda^2} - \frac{2\sigma^2(n-1)d^2\exp(-2\lambda d)}{(1-\exp(-2\lambda d))^2} + \frac{2d^2\lambda\exp(-2\lambda d)}{1-\exp(-2\lambda d)}\sum_{t=2}^{n}\varepsilon_{t-1}^2 + \frac{2d\bigl(1-d\lambda-(1+d\lambda)\exp(-2\lambda d)\bigr)\exp(-\lambda d)}{(1-\exp(-2\lambda d))^2}\sum_{t=2}^{n}\bigl(\varepsilon_t-\exp(-\lambda d)\varepsilon_{t-1}\bigr)\varepsilon_{t-1} + \frac{2d\exp(-\lambda d)\bigl[1-(1+2d\lambda)\exp(-2\lambda d)\bigr]}{(1-\exp(-2\lambda d))^2}\sum_{t=2}^{n}\bigl(\varepsilon_t-\exp(-\lambda d)\varepsilon_{t-1}\bigr)\varepsilon_{t-1} + \frac{4d\exp(-2\lambda d)\bigl[d\lambda-1+(1+d\lambda)\exp(-2\lambda d)\bigr]}{(1-\exp(-2\lambda d))^3}\sum_{t=2}^{n}\bigl(\varepsilon_t-\exp(-\lambda d)\varepsilon_{t-1}\bigr)^2.$$
(2.18)

Hence we have

$$F_n(\theta) = \begin{pmatrix} \dfrac{2\lambda}{1-\exp(-2\lambda d)}X_n(\lambda) & -\sigma^2\dfrac{\partial^2\Psi_n}{\partial\beta\,\partial\lambda} \\[2mm] * & -\sigma^2\dfrac{\partial^2\Psi_n}{\partial\lambda^2} \end{pmatrix},$$
(2.19)

where the asterisk indicates that the corresponding element is filled in by symmetry. By (2.18), we have

$$E\Bigl\{-\sigma^2\frac{\partial^2\Psi_n}{\partial\lambda^2}\Big|_{\theta=\theta_0}\Bigr\} = (n-1)\sigma_0^2\Bigl\{\frac{1}{2\lambda_0^2} + \frac{2d\exp(-2\lambda_0 d)\bigl[-1+(1+d\lambda_0)\exp(-2\lambda_0 d)\bigr]}{\lambda_0(1-\exp(-2\lambda_0 d))^2}\Bigr\} + \frac{2d^2\lambda_0\exp(-2\lambda_0 d)}{1-\exp(-2\lambda_0 d)}\sum_{t=2}^{n}E e_{t-1}^2 = \frac{(n-1)\sigma_0^2\bigl[1-(1+2d\lambda_0)\exp(-2\lambda_0 d)\bigr]^2}{2\lambda_0^2(1-\exp(-2\lambda_0 d))^2} + \frac{2d^2\lambda_0\exp(-2\lambda_0 d)}{1-\exp(-2\lambda_0 d)}\sum_{t=2}^{n}E e_{t-1}^2 =: \Delta_n(\theta_0,\sigma_0) = O(n).$$
(2.20)

Thus,

$$D_n = E\bigl(F_n(\theta_0)\bigr) = \begin{pmatrix} \dfrac{2\lambda_0}{1-\exp(-2\lambda_0 d)}X_n(\lambda_0) & 0 \\[2mm] 0 & \Delta_n(\theta_0,\sigma_0) \end{pmatrix}.$$
(2.21)

3 Large sample properties of the estimators

Theorem 3.1 Suppose that conditions (A1)-(A2) hold. Then there is a sequence $A_n \to 0$ such that, for each $A>0$, as $n\to\infty$,

$$P\bigl\{\text{there are estimators } \hat\theta_n, \hat\sigma_n^2 \text{ with } S_n(\hat\theta_n)=0 \text{ and } (\hat\theta_n,\hat\sigma_n^2)\in N_n^*(A)\bigr\} \to 1.$$
(3.1)

Furthermore,

$$(\hat\theta_n,\hat\sigma_n^2)\xrightarrow{p}(\theta_0,\sigma_0^2),\qquad n\to\infty,$$
(3.2)

where, for each $n=1,2,\ldots$, $A>0$ and $A_n\in(0,\sigma_0^2)$, the neighborhoods are defined by

$$N_n(A) = \bigl\{\theta\in\mathbb{R}^{m+1} : (\theta-\theta_0)^T D_n(\theta-\theta_0) \le A^2\bigr\}$$
(3.3)

and

$$N_n^*(A) = N_n(A)\times\bigl\{\sigma^2\in[\sigma_0^2-A_n,\,\sigma_0^2+A_n]\bigr\}.$$
(3.4)

Theorem 3.2 Suppose that conditions (A1)-(A2) hold. Then

$$\frac{1}{\hat\sigma_n}\,F_n^{T/2}(\hat\theta_n)(\hat\theta_n-\theta_0) \xrightarrow{D} N(0, I_{m+1}),\qquad n\to\infty.$$
(3.5)

In the following, we will investigate some special cases in models (1.1) and (1.5). From Theorem 3.1 and Theorem 3.2, we obtain the following results. Here we omit their proofs.

Corollary 3.1 If $\beta=0$, then

$$\frac{\sqrt{\Delta_n(\theta_0,\sigma_0)}}{\hat\sigma_n}\,(\hat\lambda_n-\lambda_0) \xrightarrow{D} N(0,1),\qquad n\to\infty.$$
(3.6)

Corollary 3.2 If $\beta=0$, then

$$\sqrt{n}\,(\hat\lambda_n-\lambda_0) \xrightarrow{D} N\bigl(0,\sigma_0^2\bigr),\qquad n\to\infty.$$
(3.7)
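
A small Monte Carlo experiment can illustrate the corollaries in the pure Ornstein-Uhlenbeck case $\beta=0$: repeatedly simulate (1.5) with $\mu=0$, re-estimate $\lambda$ by quasi-maximum likelihood, and inspect the standardized estimation errors. The parameter values and tooling (NumPy/SciPy) below are illustrative assumptions, not results reported in the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ou_lambda_qml(eps, d=1.0):
    """QML estimate of lambda for a pure OU series (beta = 0), sigma^2 profiled out."""
    n = len(eps)
    def nll(log_lam):
        lam = np.exp(log_lam)
        a = np.exp(-lam * d)
        rss = np.sum((eps[1:] - a * eps[:-1]) ** 2)
        sigma2 = 2.0 * lam * rss / ((n - 1) * (1.0 - np.exp(-2.0 * lam * d)))
        return 0.5 * (n - 1) * (np.log(np.pi * sigma2 / lam)
                                + np.log(1.0 - np.exp(-2.0 * lam * d)) + 1.0)
    return np.exp(minimize_scalar(nll, bounds=(-5.0, 5.0), method="bounded").x)

rng = np.random.default_rng(1)
lam0, sigma0, d, n, reps = 0.8, 1.0, 1.0, 500, 200
a = np.exp(-lam0 * d)
s = sigma0 * np.sqrt((1.0 - np.exp(-2.0 * lam0 * d)) / (2.0 * lam0))
errors = []
for _ in range(reps):
    eps = np.zeros(n)
    for t in range(1, n):
        eps[t] = a * eps[t - 1] + s * rng.standard_normal()
    errors.append(np.sqrt(n) * (ou_lambda_qml(eps, d) - lam0))
print(np.mean(errors), np.std(errors))  # roughly centred at 0, spread stabilizing as n grows
```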

4 Hypothesis testing

In order to fit a data set $\{y_t, t=1,2,\ldots,n\}$, we may use model (1.3) or an Ornstein-Uhlenbeck process with a constant mean level,

$$dy_t = \lambda(\mu - y_t)\,dt + \sigma\,dB_t.$$
(4.1)

If $\beta\neq 0$, then we use model (1.3), namely models (1.1) and (1.2). If $\beta=0$, then we use model (4.1). How do we decide whether $\beta=0$ or $\beta\neq 0$? In this section, we consider this hypothesis testing problem and obtain the limiting distribution of the likelihood ratio (LR) test statistic (see Fan and Jiang [53]).

Under the null hypothesis

$$H_0:\ \beta_0 = 0,\quad \lambda_0>0,\quad \sigma_0>0,$$
(4.2)

let $\hat\beta_{0n}$, $\hat\lambda_{0n}$, $\hat\sigma_{0n}^2$ be the corresponding ML estimators of $\beta$, $\lambda$, $\sigma^2$. Also let

$$\hat L_n = -2\Psi_n\bigl(\hat\beta_n,\hat\lambda_n,\hat\sigma_n^2\bigr)$$
(4.3)

and

$$\hat L_{0n} = -2\Psi_n\bigl(\hat\beta_{0n},\hat\lambda_{0n},\hat\sigma_{0n}^2\bigr).$$
(4.4)

By (2.9) and (2.5), we have that

$$\hat L_n = (n-1)\log\Bigl(\frac{\pi\hat\sigma_n^2}{\hat\lambda_n}\Bigr) + (n-1)\log\bigl(1-\exp(-2\hat\lambda_n d)\bigr) + \frac{2\hat\lambda_n}{\hat\sigma_n^2(1-\exp(-2\hat\lambda_n d))}\sum_{t=2}^{n}\bigl(\hat\varepsilon_t-\exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)^2 = (n-1)\log\Bigl(\frac{\pi\hat\sigma_n^2}{\hat\lambda_n}\Bigr) + (n-1)\log\bigl(1-\exp(-2\hat\lambda_n d)\bigr) + (n-1) = (n-1)(\log\pi+1) + (n-1)\log\hat\sigma_n^2 + (n-1)\bigl(\log(1-\exp(-2\hat\lambda_n d)) - \log\hat\lambda_n\bigr).$$
(4.5)

And similarly,

$$\hat L_{0n} = (n-1)(\log\pi+1) + (n-1)\log\hat\sigma_{0n}^2 + (n-1)\bigl(\log(1-\exp(-2\hat\lambda_{0n}d)) - \log\hat\lambda_{0n}\bigr).$$
(4.6)

By (4.5) and (4.6), we have

$$\tilde d(n) = \hat L_{0n} - \hat L_n = (n-1)\log\Bigl(\frac{\hat\sigma_{0n}^2}{\hat\sigma_n^2}\Bigr) + (n-1)\Bigl(\log\frac{1-\exp(-2\hat\lambda_{0n}d)}{1-\exp(-2\hat\lambda_n d)} - \log\frac{\hat\lambda_{0n}}{\hat\lambda_n}\Bigr) = (n-1)\Bigl(\frac{\hat\sigma_{0n}^2}{\hat\sigma_n^2}-1\Bigr) + (n-1)\Bigl(\frac{1-\exp(-2\hat\lambda_{0n}d)}{1-\exp(-2\hat\lambda_n d)} - \frac{\hat\lambda_{0n}}{\hat\lambda_n}\Bigr) + o_p(1) = (n-1)\,\frac{\hat\sigma_{0n}^2-\hat\sigma_n^2}{\sigma_0^2} + (n-1)\Bigl(\frac{1-\exp(-2\hat\lambda_{0n}d)}{1-\exp(-2\hat\lambda_n d)} - \frac{\hat\lambda_{0n}}{\hat\lambda_n}\Bigr) + o_p(1) = (n-1)\,\frac{\hat\sigma_{0n}^2-\hat\sigma_n^2}{\sigma_0^2} + o_p(1).$$
(4.7)

Large values of d ˜ (n) suggest rejection of the null hypothesis.

Theorem 4.1 Suppose that conditions (A1)-(A2) hold. If H 0 holds, then

$$\tilde d(n) \xrightarrow{D} \chi^2(m),\qquad n\to\infty.$$
(4.8)
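
Operationally, the test only needs the fitted $(\hat\sigma^2, \hat\lambda)$ under the full and the null model, since (4.5) and (4.6) give $-2$ times the maximized log-likelihoods in closed form. The following sketch computes $\tilde d(n)$ and compares it with the $\chi^2(m)$ quantile; the numerical inputs are placeholders of ours, not results from the paper, and SciPy is assumed for the quantile.

```python
import numpy as np
from scipy.stats import chi2

def lr_statistic(n, sigma2_full, lam_full, sigma2_null, lam_null, d=1.0):
    """LR statistic (4.7), built from the closed forms (4.5)-(4.6) for -2 times
    the maximized log-likelihood under the full and the null model."""
    def L_hat(sigma2, lam):
        return ((n - 1) * (np.log(np.pi) + 1.0) + (n - 1) * np.log(sigma2)
                + (n - 1) * (np.log(1.0 - np.exp(-2.0 * lam * d)) - np.log(lam)))
    return L_hat(sigma2_null, lam_null) - L_hat(sigma2_full, lam_full)

# Placeholder inputs: reject H0: beta = 0 at level alpha when d_tilde exceeds the
# upper chi^2(m) quantile, m being the dimension of beta.
m, alpha = 2, 0.01
d_tilde = lr_statistic(n=100, sigma2_full=0.9, lam_full=0.8, sigma2_null=1.6, lam_null=0.7)
print(d_tilde, chi2.ppf(1.0 - alpha, df=m), d_tilde > chi2.ppf(1.0 - alpha, df=m))
```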

5 Some lemmas

Throughout this paper, let $C$ denote a generic positive constant which may take different values at each occurrence. To prove our main results, we first introduce the following lemmas.

Lemma 5.1 If condition (A1) holds, then for any $\lambda\in\mathbb{R}^+$ the matrix $X_n(\lambda)$ is positive definite for large enough $n$, and

$$\lim_{n\to\infty}\max_{1\le t\le n} x_t^T X_n^{-1}(\lambda)\,x_t = 0.$$

Proof Let $\tilde\lambda_1$ and $\tilde\lambda_m$ be the smallest and largest roots of $|Z_n - \tilde\lambda X_n| = 0$. Then, from Ex. 22.1 of Rao [54],

$$\tilde\lambda_1 \le \frac{u^T Z_n u}{u^T X_n u} \le \tilde\lambda_m$$

for unit vectors $u$. Thus, by (2.14), there are some $\delta\in(0,1)$ and $n_0(\delta)$ such that $n\ge n_0(\delta)$ implies

$$|u^T Z_n u| \le (1-\delta)\,u^T X_n u.$$
(5.1)

By (2.16) and (5.1), we have

u T X n ( λ ) u = t = 2 n ( u T ( x t exp ( λ d ) x t 1 ) ) 2 t = 2 n ( u T x t ) 2 + min λ exp ( 2 λ d ) t = 2 n ( u T x t 1 ) 2 max λ exp ( λ d ) u T Z n u u T X n u + min λ exp ( 2 λ d ) u T X n u u T Z n u ( 1 + min λ exp ( 2 λ d ) ( 1 δ ) ) u T X n u = ( min λ exp ( 2 λ d ) + δ ) u T X n u = C ( λ , δ ) u T X n u .
(5.2)

By Rao [[54], p.60] and (2.13), we have

$$\frac{(u^T x_t)^2}{u^T X_n u} \to 0.$$
(5.3)

From (5.3) and C(λ,δ)>0,

$$x_t^T X_n^{-1}(\lambda)\,x_t = \sup_u\frac{(u^T x_t)^2}{u^T X_n(\lambda)u} \le \sup_u\frac{(u^T x_t)^2}{C(\lambda,\delta)\,u^T X_n u} \to 0.$$
(5.4)

 □

Lemma 5.2 The matrix $D_n$ is positive definite for large enough $n$, $E(S_n(\theta_0))=0$ and $\mathrm{Var}(S_n(\theta_0))=\sigma_0^2 D_n$.

Proof Note that X n ( λ 0 ) is positive definite and Δ n ( θ 0 , σ 0 )>0. It is easy to show that the matrix D n is positive definite for large enough n. By (2.8), we have

σ 0 2 E ( Ψ n β | θ = θ 0 ) = 2 λ 0 1 exp ( 2 λ 0 d ) t = 2 n E ( e t exp ( λ 0 d ) e t 1 ) ( x t exp ( λ 0 d ) x t 1 ) = 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 1 exp ( 2 d λ 0 ) 2 λ 0 t = 2 n ( x t exp ( λ 0 d ) x t 1 ) E η t = 0 .
(5.5)

Note that e t 1 and η t are independent, so we have E( η t e t 1 )=0. Thus, by (2.7) and E η t =0, we have

E ( Ψ n λ | θ = θ 0 ) = n 1 2 λ 0 ( n 1 ) d exp ( 2 λ 0 d ) 1 exp ( 2 λ 0 d ) 0 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) σ 0 2 ( 1 exp ( 2 λ 0 d ) ) 2 σ 0 2 1 exp ( 2 d λ 0 ) 2 λ 0 t = 2 n E η t 2 = n 1 2 λ 0 ( n 1 ) d exp ( 2 λ 0 d ) 1 exp ( 2 λ 0 d ) 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) 2 λ 0 ( 1 exp ( 2 λ 0 d ) ) ( n 1 ) = 0 .
(5.6)

Hence, from (5.5) and (5.6),

E ( S n ( θ 0 ) ) = σ 0 2 E ( Ψ n β | θ = θ 0 , Ψ n λ | θ = θ 0 ) =0.
(5.7)

By (2.8) and (2.20), we have

Var ( σ 0 2 Ψ n β | θ = θ 0 ) = Var { 2 λ 0 1 exp ( 2 λ 0 d ) t = 2 n ( e t exp ( λ 0 d ) e t 1 ) ( x t exp ( λ 0 d ) x t 1 ) } = 2 σ 0 2 λ 0 1 exp ( 2 λ 0 d ) Var { t = 2 n ( x t exp ( λ 0 d ) x t 1 ) η t } = 2 σ 0 2 λ 0 1 exp ( 2 λ 0 d ) X n ( λ 0 ) .
(5.8)

Note that $\{\eta_t e_{t-1}, \mathcal{H}_t\}$ is a martingale difference sequence with

$$\mathrm{Var}(\eta_t e_{t-1}) = E\eta_t^2\,E e_{t-1}^2 = E e_{t-1}^2,$$

so

Var ( σ 0 2 Ψ n λ | θ = θ 0 ) = E { σ 0 d exp ( λ 0 d ) 2 λ 0 1 exp ( λ 0 d ) t = 2 n η t e t 1 } 2 + E { σ 0 2 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 d λ 0 ) ] 2 λ 0 ( 1 exp ( 2 λ 0 d ) ) t = 2 n ( η t 2 1 ) } 2 + 2 σ 0 3 d exp ( λ 0 d ) [ 1 ( 1 + 2 d λ 0 ) exp ( 2 d λ 0 ) ] λ 0 ( 1 exp ( 2 λ 0 d ) ) 3 2 E { t = 2 n η t e t 1 t = 2 n ( η t 2 1 ) } = 2 λ 0 σ 0 2 d 2 exp ( 2 λ 0 d ) 1 exp ( λ 0 d ) t = 2 n E e t 1 2 + { σ 0 2 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 d λ 0 ) ] 2 λ 0 ( 1 exp ( 2 λ 0 d ) ) } 2 ( n 1 ) ( E η t 4 1 ) + 2 σ 0 3 d exp ( λ 0 d ) [ 1 ( 1 + 2 d λ 0 ) exp ( 2 d λ 0 ) ] λ 0 ( 1 exp ( 2 λ 0 d ) ) 3 2 ( t = 2 n E ( ( η t 2 1 ) η t e t 1 ) + t k E ( η t e t 1 ( η k 2 1 ) ) ) = 2 λ 0 σ 0 2 d 2 exp ( 2 λ 0 d ) 1 exp ( λ 0 d ) t = 2 n E e t 1 2 + 2 ( n 1 ) { σ 0 2 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 d λ 0 ) ] 2 λ 0 ( 1 exp ( 2 λ 0 d ) ) } 2 = σ 0 2 Δ n ( θ 0 , σ 0 ) .
(5.9)

By (2.7), (2.8), and noting that e t 1 and η t are independent, we have

Cov ( σ 0 2 Ψ n β , σ 0 2 Ψ n λ ) | θ = θ 0 = σ 0 3 1 ( 1 + 2 d λ ) exp ( 2 λ d ) 2 λ 0 ( 1 exp ( 2 λ d ) ) 3 2 E ( t = 2 n η t 2 t = 2 n η t ( x t exp ( λ d ) x t 1 ) ) = σ 0 3 1 ( 1 + 2 d λ ) exp ( 2 λ d ) 2 λ 0 ( 1 exp ( 2 λ d ) ) 3 2 E η t 3 t = 2 n ( x t exp ( λ d ) x t 1 ) = 0 .
(5.10)

From (5.8)-(5.10), it follows that Var( S n ( θ 0 ))= σ 0 2 D n . The proof is completed. □

Lemma 5.3 (Maller [55])

Let $W_n$ be a symmetric random matrix with eigenvalues $\tilde\lambda_j(n)$, $1\le j\le d$. Then

$$W_n \xrightarrow{p} I \iff \tilde\lambda_j(n)\xrightarrow{p} 1\ (1\le j\le d),\qquad n\to\infty.$$

Lemma 5.4 For each A>0,

$$\sup_{\theta\in N_n(A)}\bigl\| D_n^{-1/2}F_n(\theta)D_n^{-T/2} - \Phi_n\bigr\| \xrightarrow{p} 0,\qquad n\to\infty$$
(5.11)

and also

$$\Phi_n \xrightarrow{D} \Phi,$$
(5.12)
$$\lim_{c\to 0}\limsup_{A\to\infty}\limsup_{n\to\infty} P\Bigl\{\inf_{\theta\in N_n(A)}\lambda_{\min}\bigl(D_n^{-1/2}F_n(\theta)D_n^{-T/2}\bigr)\le c\Bigr\} = 0,$$
(5.13)

where

$$\Phi_n = \begin{pmatrix}\dfrac{\lambda(1-\exp(-2d\lambda_0))}{\lambda_0(1-\exp(-2d\lambda))}\,I_m & 0\\[2mm] 0 & \dfrac{-\sigma^2\,\partial^2\Psi_n/\partial\lambda^2\big|_{\theta=\theta_0}}{\Delta_n(\theta_0,\sigma_0)}\end{pmatrix},\qquad \Phi = I_{m+1}.$$
(5.14)

Proof Let X n ( λ 0 )= X n 1 2 ( λ 0 ) X n T 2 ( λ 0 ) be a square root decomposition of X n ( λ 0 ). Then

D n = ( 2 λ 0 1 exp ( 2 d λ 0 ) X n 1 2 ( λ 0 ) 0 0 Δ n ( θ 0 , σ 0 ) ) ( 2 λ 0 1 exp ( 2 d λ 0 ) X n T 2 ( λ 0 ) 0 0 Δ n ( θ 0 , σ 0 ) ) = D n 1 2 D n T 2 .
(5.15)

Let θ N n (A). Then

( θ θ 0 ) T D n ( θ θ 0 ) = 2 λ 0 1 exp ( 2 d λ 0 ) ( β β 0 ) T X n ( λ 0 ) ( β β 0 ) + ( λ λ 0 ) 2 Δ n ( θ 0 , σ 0 ) A 2 .
(5.16)

From (2.20), (2.21) and (5.14),

D n 1 2 F n (θ) D n T 2 Φ n =( W 11 W 12 W 22 ),
(5.17)

where

W 11 = λ ( 1 exp ( 2 d λ 0 ) ) λ 0 ( 1 exp ( 2 d λ ) ) { X n 1 2 ( λ 0 ) X n ( λ ) X n T 2 ( λ 0 ) I m } ,
(5.18)
W 12 = 1 exp ( 2 d λ 0 ) 2 λ 0 X n 1 2 ( λ 0 ) ( σ 2 2 Ψ n β λ ) Δ n ( θ 0 , σ 0 )
(5.19)

and

W 22 = σ 2 2 Ψ n λ 2 σ 2 2 Ψ n λ 2 | θ = θ 0 Δ n ( θ 0 , σ 0 ) .
(5.20)

Let

N n β (A)= { β : 2 λ 0 1 exp ( 2 d λ 0 ) | ( β β 0 ) T X n 1 2 ( λ 0 ) | 2 A 2 }
(5.21)

and

N n λ (A)= { θ : | λ λ 0 | A Δ n ( θ 0 , σ 0 ) } .
(5.22)

As the first step, we will show that, for each A>0,

sup θ N n θ ( A ) W 11 0,n.
(5.23)

In fact, note that

W 11 = λ ( 1 exp ( 2 d λ 0 ) ) λ 0 ( 1 exp ( 2 d λ ) ) X n 1 2 ( λ 0 ) ( X n ( λ ) X n ( λ 0 ) ) X n T 2 ( λ 0 ) = λ ( 1 exp ( 2 d λ 0 ) ) λ 0 ( 1 exp ( 2 d λ ) ) X n 1 2 ( λ 0 ) ( T 1 + T 2 T 3 ) X n T 2 ( λ 0 ) ,
(5.24)

where

T 1 = t = 2 n ( exp ( d λ 0 ) exp ( d λ ) ) x t 1 ( x t exp ( d λ 0 ) x t 1 ) T , T 2 = t = 2 n ( exp ( d λ 0 ) exp ( d λ ) ) ( x t exp ( d λ 0 ) x t 1 ) x t T

and

T 3 = t = 2 n ( exp ( d λ ) exp ( d λ 0 ) ) 2 x t 1 x t 1 T .

Let u,v R d , |u|=|v|=1, and let u n T = u T X n 1 2 ( λ 0 ), v n T = X n T 2 ( λ 0 )v. By the Cauchy-Schwarz inequality, Lemma 5.1 and noting N n λ (A), we have

| u n T T 1 v n | = | ( exp ( d λ 0 ) exp ( d λ ) ) t = 2 n u n T x t 1 ( x t exp ( d λ 0 ) x t 1 ) T v n | max | exp ( d λ 0 ) exp ( d λ ) | ( t = 2 n u n T x t x t T u n ) 1 2 ( t = 2 n v n T ( x t exp ( d λ 0 ) x t 1 ) ( x t exp ( d λ 0 ) x t 1 ) T v n ) 1 2 d | λ 0 λ | n max 1 t n ( x t T X n 1 ( λ 0 ) x t ) 1 C n Δ n ( θ 0 , σ 0 ) o ( 1 ) 0 .
(5.25)

Similar to the proof of T 1 , we easily obtain

| u n T T 2 v n |0.
(5.26)

By the Cauchy-Schwarz inequality, Lemma 5.1 and noting N n λ (A), we have

| u n T T 3 v n | = | u n T t = 2 n ( exp ( d λ 0 ) exp ( d λ ) ) 2 x t 1 x t 1 T v n | max | exp ( d λ 0 ) exp ( d λ ) | 2 ( t = 2 n u n T x t x t T u n t = 2 n v n T x t x t T v n ) 1 2 n | λ 0 λ | 2 max 1 t n ( x t T X n 1 ( λ 0 ) x t ) n A 2 Δ n ( θ 0 , σ 0 ) o ( 1 ) 0 .
(5.27)

Hence, (5.23) follows from (5.24)-(5.27).

For the second step, we will show that

W 12 p 0.
(5.28)

Note that

ε t = y t x t T β= x t T ( β 0 β)+ e t
(5.29)

and

ε t exp(d λ 0 ) ε t 1 = ( x t exp ( d λ 0 ) x t 1 ) T ( β 0 β)+ σ 0 1 exp ( 2 d λ 0 ) 2 λ 0 η t .
(5.30)

Write

J = 1 exp ( 2 d λ 0 ) 2 λ 0 X n 1 2 ( λ 0 ) ( σ 2 2 Ψ n β λ ) = 1 exp ( 2 d λ 0 ) 2 λ 0 X n 1 2 ( λ 0 ) 2 d λ exp ( λ d ) 1 exp ( 2 λ d ) t = 2 n ( ε t 1 x t + ε t x t 1 2 exp ( λ d ) x t 1 ε t 1 ) 1 exp ( 2 d λ 0 ) 2 λ 0 X n 1 2 ( λ 0 ) 1 ( 1 + 2 d λ ) exp ( 2 λ d ) ( 1 exp ( 2 λ d ) ) 2 t = 2 n ( ε t exp ( λ d ) ε t 1 ) ( x t exp ( λ d ) x t 1 ) = 1 exp ( 2 d λ 0 ) 2 λ 0 2 d λ exp ( λ d ) 1 exp ( 2 λ d ) X n 1 2 ( λ 0 ) ( T 1 + T 2 + 2 T 3 + 2 T 4 + 2 T 5 ) 1 exp ( 2 d λ 0 ) 2 λ 0 1 ( 1 + 2 d λ ) exp ( 2 λ d ) ( 1 exp ( 2 λ d ) ) 2 X n 1 2 ( λ 0 ) T 6 ,
(5.31)

where

T 1 = t = 2 n x t 1 T ( β 0 β ) ( x t exp ( λ 0 d ) x t 1 ) , T 2 = t = 2 n ( x t exp ( λ 0 d ) x t 1 ) T ( β 0 β ) x t 1 , T 3 = t = 2 n ( exp ( λ 0 d ) exp ( λ d ) ) x t 1 T ( β 0 β ) x t 1 , T 4 = σ 1 exp ( 2 λ d ) 2 λ t = 2 n η t x t 1 , T 5 = t = 2 n e t 1 x t 1 , T 6 = σ 1 exp ( 2 λ d ) 2 λ t = 2 n η t ( x t exp ( λ d ) x t 1 ) .

For β N n β (A) and each A>0, we have

| ( β 0 β ) T x t | 2 = ( β 0 β ) T X n 1 2 ( λ 0 ) X n 1 2 ( λ 0 ) x t x t T X n T 2 ( λ 0 ) X n T 2 ( λ 0 ) ( β 0 β ) max 1 t n ( x t T X n 1 ( λ 0 ) x t ) ( β 0 β ) T X n ( λ 0 ) ( β 0 β ) A 2 max 1 t n ( x t T X n 1 ( λ 0 ) x t ) .
(5.32)

By (5.32) and Lemma 5.1, we have

sup β N n β ( A ) max 1 t n | ( β 0 β ) T x t |0,n,A>0.
(5.33)

Using the Cauchy-Schwarz inequality and (5.33), we obtain

u n T T 1 = t = 2 n u n T x t 1 T ( β 0 β ) ( x t exp ( λ 0 d ) x t 1 ) { t = 2 n ( x t 1 T ( β 0 β ) ) 2 } 1 2 { t = 2 n u n T ( x t exp ( λ 0 d ) x t 1 ) ( x t exp ( λ 0 d ) x t 1 ) T u n } 1 2 n max 1 t n | ( β 0 β ) T x t | = o ( n ) .
(5.34)

Using a similar argument as T 1 , we obtain that

u n T T 2 = o p ( n ).
(5.35)

By the Cauchy-Schwarz inequality and (5.33), (5.25), we get

u n T T 3 = t = 2 n ( exp ( λ 0 d ) exp ( λ d ) ) x t 1 T ( β 0 β ) u n T x t 1 { t = 2 n ( exp ( λ 0 d ) exp ( λ d ) ) 2 ( x t 1 T ( β 0 β ) ) 2 t = 2 n ( u n T x t 1 ) 2 } 1 2 C | λ 0 λ | { t = 2 n ( x t 1 T ( β 0 β ) ) 2 t = 2 n ( u n T x t 1 ) 2 } 1 2 C A Δ n ( θ 0 , σ 0 ) n o ( 1 ) o ( n ) = o ( n ) .
(5.36)

By (5.25), we have

Var ( u n T T 4 ) = σ 2 1 exp ( 2 λ d ) 2 λ t = 2 n ( u n T x t 1 ) 2 =o(n).
(5.37)

Thus, by the Chebychev inequality and (5.37),

u n T T 4 = o p ( n ).
(5.38)

By Lemma 5.1 and (2.3), we have

Var ( u n T T 5 ) = Var ( t = 2 n u n T x t e t 1 ) = σ 0 2 1 exp ( 2 λ 0 d ) 2 λ 0 Var { j = 1 n 1 ( t = j + 1 n u n T x t exp { λ 0 d ( t 1 j ) } ) η j } = σ 0 2 1 exp ( 2 λ 0 d ) 2 λ 0 j = 1 n 1 ( t = j + 1 n u n T x t exp { λ 0 d ( t 1 j ) } ) 2 σ 0 2 1 exp ( 2 λ 0 d ) 2 λ 0 max 2 t n | u n T x t | j = 1 n 1 ( t = j + 1 n exp { λ 0 d ( t 1 j ) } ) 2 C max 2 t n | u n T x t | n = o ( n ) .
(5.39)

Thus, by the Chebychev inequality and (5.39),

u n T T 5 = o p ( n ).
(5.40)

Using a similar argument as T 4 , we obtain

u n T T 6 = o p ( n ).
(5.41)

Thus (5.28) follows immediately from (5.31), (5.34)-(5.36), (5.38), (5.40) and (5.41).

For the third step, we will show that

W 22 p 0.
(5.42)

Write that

J = σ 2 2 Ψ n λ 2 σ 2 2 Ψ n λ 2 | θ = θ 0 = σ 2 ( n 1 ) 2 λ 2 2 σ 2 ( n 1 ) d 2 exp ( 2 λ d ) ( 1 exp ( 2 λ d ) ) 2 + 2 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) t = 2 n ε t 1 2 + 2 d exp ( λ d ) [ ( 2 d λ ) d λ exp ( 2 λ d ) ] ( 1 exp ( 2 λ d ) ) 2 t = 2 n ( ε t exp ( λ d ) ε t 1 ) ε t 1 + 4 d exp ( 2 λ d ) [ d λ 1 + ( 1 + d λ ) exp ( 2 λ d ) ] ( 1 exp ( 2 λ d ) ) 3 t = 2 n ( ε t exp ( λ d ) ε t 1 ) 2 σ 0 2 ( n 1 ) 2 λ 0 2 + 2 σ 0 2 ( n 1 ) d 2 exp ( 2 λ 0 d ) ( 1 exp ( 2 λ 0 d ) ) 2 2 d 2 λ 0 exp ( 2 λ 0 d ) 1 exp ( 2 λ 0 d ) t = 2 n e t 1 2 2 d exp ( λ 0 d ) [ ( 2 d λ 0 ) d λ 0 exp ( 2 λ 0 d ) ] ( 1 exp ( 2 λ 0 d ) ) 2 t = 2 n ( e t exp ( λ 0 d ) e t 1 ) e t 1 4 d exp ( 2 λ 0 d ) [ d λ 0 1 + ( 1 + d λ 0 ) exp ( 2 λ 0 d ) ] ( 1 exp ( 2 λ 0 d ) ) 3 t = 2 n ( e t exp ( λ 0 d ) e t 1 ) 2 .
(5.43)

By (3.3) and (3.4), we obtain that

T 1 = σ 2 ( n 1 ) 2 λ 2 σ 0 2 ( n 1 ) 2 λ 0 2 = n 1 2 λ 2 λ 0 2 ( σ 2 ( λ 0 2 λ 2 ) + λ 2 ( σ 2 σ 0 2 ) ) = o ( n )
(5.44)

and

T 2 = 2 σ 0 2 ( n 1 ) d 2 exp ( 2 λ 0 d ) ( 1 exp ( 2 λ 0 d ) ) 2 2 σ 2 ( n 1 ) d 2 exp ( 2 λ d ) ( 1 exp ( 2 λ d ) ) 2 = 2 d 2 ( n 1 ) ( 1 exp ( 2 λ 0 d ) ) 2 ( 1 exp ( 2 λ d ) ) 2 { σ 0 ( exp ( λ 0 d ) exp ( λ d ) ) + exp ( λ d ) ( σ 0 σ ) + exp ( λ d λ 0 d ) [ σ ( exp ( λ 0 d ) exp ( λ d ) ) + exp ( λ d ) ( σ σ 0 ) ] } ( σ 0 exp ( λ 0 d ) ( 1 exp ( 2 λ d ) ) + σ exp ( λ d ) ( 1 exp ( 2 λ 0 d ) ) ) = o ( n ) .
(5.45)

By (5.29), we have

T 3 = 2 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) t = 2 n ε t 1 2 2 d 2 λ 0 exp ( 2 λ 0 d ) 1 exp ( 2 λ 0 d ) t = 2 n e t 1 2 = 2 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) t = 2 n { ( x t T ( β 0 β ) ) 2 + 2 x t T ( β 0 β ) e t + e t 2 } 2 d 2 λ 0 exp ( 2 λ 0 d ) 1 exp ( 2 λ 0 d ) t = 2 n e t 1 2 = 2 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) t = 2 n ( x t T ( β 0 β ) ) 2 + 2 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) t = 2 n 2 x t T ( β 0 β ) e t + { 2 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) 2 d 2 λ 0 exp ( 2 λ 0 d ) 1 exp ( 2 λ 0 d ) } t = 2 n e t 1 2 = 2 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) T 31 + 4 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) T 32 + T 33 .
(5.46)

By (5.32), it is easy to show that

T 31 =o(n).
(5.47)

By Lemma 5.1, (2.3) and (5.32), we have

Var ( T 32 ) = Var ( t = 2 n x t T ( β 0 β ) e t ) = Var { j = 1 n 1 ( t = j + 1 n x t T ( β 0 β ) exp { λ 0 d ( t 1 j ) } ) η j } = j = 1 n 1 ( t = j + 1 n x t T ( β 0 β ) exp { λ 0 d ( t 1 j ) } ) 2 max 2 t n | x t T ( β 0 β ) | j = 1 n 1 ( t = j + 1 n exp { λ 0 d ( t 1 j ) } ) 2 C max 2 t n | x t T ( β 0 β ) | n = o ( n ) .
(5.48)

Thus by the Chebychev inequality and (5.48),

T 32 = o p ( n ).
(5.49)

Write

2 d 2 λ exp ( 2 λ d ) 1 exp ( 2 λ d ) 2 d 2 λ 0 exp ( 2 λ 0 d ) 1 exp ( 2 λ 0 d ) = 2 d 2 ( 1 exp ( 2 λ d ) ) ( 1 exp ( 2 λ 0 d ) ) U ,
(5.50)

where

U=λexp(2λd) ( 1 exp ( 2 λ 0 d ) ) λ 0 exp(2 λ 0 d) ( 1 exp ( 2 λ d ) ) .

Note that

U = λ exp ( 2 λ d ) ( exp ( 2 λ d ) exp ( 2 λ 0 d ) ) + ( λ ( exp ( 2 λ d ) exp ( 2 λ 0 d ) ) + ( λ λ 0 ) exp ( 2 λ 0 d ) ) ( 1 exp ( 2 λ d ) ) = o ( 1 ) ,
(5.51)

so we have

T 33 =o(n).
(5.52)

Thus, by (5.46), (5.47), (5.49) and (5.52), we have

T 3 =o(n).
(5.53)

By (5.29), we have

T 4 = 2 d exp ( λ d ) [ ( 2 d λ ) d λ exp ( 2 λ d ) ] ( 1 exp ( 2 λ d ) ) 2 t = 2 n ( ε t exp ( λ d ) ε t 1 ) ε t 1 2 d exp ( λ 0 d ) [ ( 2 d λ 0 ) d λ 0 exp ( 2 λ 0 d ) ] ( 1 exp ( 2 λ 0 d ) ) 2 t = 2 n ( e t exp ( λ 0 d ) e t 1 ) e t 1 = 2 d exp ( λ d ) [ ( 2 d λ ) d λ exp ( 2 λ d ) ] ( 1 exp ( 2 λ d ) ) 2 σ 1 exp ( 2 λ d ) 2 λ t = 2 n x t 1 T ( β 0 β ) η t + { 2 d exp ( λ d ) [ ( 2 d λ ) d λ exp ( 2 λ d ) ] ( 1 exp ( 2 λ d ) ) 2 σ 1 exp ( 2 λ d ) 2 λ 2 d exp ( λ 0 d ) [ ( 2 d λ 0 ) d λ 0 exp ( 2 λ 0 d ) ] ( 1 exp ( 2 λ 0 d ) ) 2 σ 1 exp ( 2 λ d ) 2 λ } t = 2 n η t e t 1 = T 41 + T 42 .
(5.54)

It is easy to show that

T 41 =o(n).
(5.55)

Note that { η t e t 1 , H t } is a martingale difference sequence, so we have

Var ( t = 2 n η t e t 1 ) = t = 2 n E e t 1 2 = Δ n ( θ 0 , σ 0 ).

Hence,

T 42 =o(n).
(5.56)

By (5.54)-(5.56), we have

T 4 =o(n).
(5.57)

It is easily proved that

T 5 = 4 d exp ( 2 λ d ) [ d λ 1 + ( 1 + d λ ) exp ( 2 λ d ) ] ( 1 exp ( 2 λ d ) ) 3 t = 2 n ( ε t exp ( λ d ) ε t 1 ) 2 4 d exp ( 2 λ 0 d ) [ d λ 0 1 + ( 1 + d λ 0 ) exp ( 2 λ 0 d ) ] ( 1 exp ( 2 λ 0 d ) ) 3 t = 2 n ( e t exp ( λ 0 d ) e t 1 ) 2 = { 4 d exp ( 2 λ d ) [ d λ 1 + ( 1 + d λ ) exp ( 2 λ d ) ] ( 1 exp ( 2 λ d ) ) 3 σ 1 exp ( 2 λ d ) 2 λ 4 d exp ( 2 λ 0 d ) [ d λ 0 1 + ( 1 + d λ 0 ) exp ( 2 λ 0 d ) ] ( 1 exp ( 2 λ 0 d ) ) 3 σ 0 1 exp ( 2 λ 0 d ) 2 λ 0 } t = 2 n η t 2 = o ( n ) .
(5.58)

Hence, (5.42) follows immediately from (5.43)-(5.45), (5.53), (5.57) and (5.58). This completes the proof of (5.11) from (5.17), (5.23), (5.28) and (5.42).

It is well known that $\frac{\lambda(1-\exp(-2d\lambda_0))}{\lambda_0(1-\exp(-2d\lambda))}\to 1$ as $n\to\infty$. To prove (5.12), we need to show that

$$\frac{-\sigma^2\,\partial^2\Psi_n/\partial\lambda^2\big|_{\theta=\theta_0}}{\Delta_n(\theta_0,\sigma_0)} \xrightarrow{p} 1,\qquad n\to\infty.$$

This follows immediately from (2.20) and the Markov inequality.

Finally, we will prove (5.13). By (5.11) and (5.12), we have

D n 1 2 F(θ) D n T 2 p I m ,n
(5.59)

uniformly in θ N n (A) for each A>0. Thus, by Lemma 5.3,

λ min ( D n 1 2 F ( θ ) D n T 2 ) p 1,n.
(5.60)

This implies (5.13). □

Lemma 5.5 (Hall and Heyde [56])

Let $\{S_{ni}, \mathcal{F}_{ni}, 1\le i\le k_n, n\ge 1\}$ be a zero-mean, square-integrable martingale array with differences $X_{ni}$, and let $\eta^2$ be an a.s. finite random variable. Suppose that $\sum_i E\{X_{ni}^2 I(|X_{ni}|>\varepsilon)\mid \mathcal{F}_{n,i-1}\}\xrightarrow{p}0$ for all $\varepsilon>0$, and $\sum_i E\{X_{ni}^2\mid\mathcal{F}_{n,i-1}\}\xrightarrow{p}\eta^2$. Then

$$S_{nk_n} = \sum_i X_{ni} \xrightarrow{D} Z,$$

where the r.v. $Z$ has the characteristic function $E\{\exp(-\frac{1}{2}\eta^2 t^2)\}$.

6 Proof of theorems

Proof of Theorem 3.1 Take A>0, let

M n (A)= { θ R m + 1 : ( θ θ 0 ) T D n ( θ θ 0 ) = A 2 }
(6.1)

be the boundary of N n (A), and let θ M n (A). Using (2.19) and the Taylor expansion, for each σ 2 >0, we have

Ψ n ( θ , σ 2 ) = Ψ n ( θ 0 , σ 2 ) + ( θ θ 0 ) T Ψ n ( θ 0 , σ 2 ) θ + 1 2 ( θ θ 0 ) T 2 Ψ n ( θ 0 , σ 2 ) θ θ T ( θ θ 0 ) = 1 σ 2 Ψ n ( θ 0 , σ 2 ) + ( θ θ 0 ) T S n ( θ 0 ) 1 2 σ 2 ( θ θ 0 ) T F n ( θ ˜ ) ( θ θ 0 ) ,
(6.2)

where θ ˜ =aθ+(1a) θ 0 for some 0a1.

Let Q n (θ)= 1 2 ( θ θ 0 ) T F n ( θ ˜ )(θ θ 0 ) and v n (θ)= 1 A D n T 2 (θ θ 0 ). Take c>0 and θ M n (A), and by (6.2), we obtain that

P { Ψ n ( θ , σ 2 ) Ψ n ( θ 0 , σ 2 )  for some  θ M n ( A ) } P { ( θ θ 0 ) T S n ( θ 0 ) Q n ( θ ) , Q n ( θ ) > c A 2  for some  θ M n ( A ) } + P { Q n ( θ ) c A 2  for some  θ M n ( A ) } P { v n T ( θ ) D n 1 2 S n ( θ 0 ) > c A  for some  θ M n ( A ) } + P { v n T ( θ ) D n 1 2 F n ( θ ˜ ) D n T 2 v n ( θ ) c  for some  θ M n ( A ) } P { | D n 1 2 S n ( θ 0 ) | > c A } + P { inf θ N n ( A ) λ min ( D n 1 2 F n ( θ ˜ ) D n T 2 ) c } .
(6.3)

By Lemma 5.2 and the Chebychev inequality, we obtain

P { | D n 1 2 S n ( θ 0 ) | > c A } Var ( D n 1 2 S n ( θ 0 ) ) c 2 A 2 = σ 0 2 c 2 A 2 .
(6.4)

Let A, then c0, and using (5.13), we have

P { inf φ N n ( A ) λ min ( D n 1 2 F n ( θ ˜ ) D n T 2 ) c } 0.
(6.5)

By (6.3)-(6.5), we have

lim A lim inf n P { Ψ n ( θ , σ 2 ) < Ψ n ( θ 0 , σ 2 )  for all  θ M n ( A ) } =1.
(6.6)

By Lemma 5.3, λ min ( X n ( θ 0 )) as n. Hence λ min ( D n ). Moreover, from (5.13), we have

inf θ N n ( A ) λ min ( F n ( θ ) ) p .

This implies that Ψ n (θ, σ 2 ) is concave on N n (A). Noting this fact and (6.6), we get

lim A lim inf n P { sup θ M n ( A ) Ψ n ( θ , σ 2 ) < Ψ n ( θ 0 , σ 2 ) , Ψ n ( θ , σ 2 )  is concave on  N n ( A ) } = 1 .
(6.7)

On the event in the brackets, the continuous function Ψ n (θ, σ 2 ) has a unique maximum in θ over the compact neighborhood N n (A). Hence

lim A lim inf n P { S n ( θ ˆ n ( A ) ) = 0  for a unique  θ ˆ n ( A ) N n ( A ) } =1.

Moreover, there is a sequence A n such that θ ˆ n = θ ˆ ( A n ) satisfies

lim inf n P { S n ( θ ˆ n ) = 0  and  θ ˆ n  maximizes  Ψ n ( θ , σ 2 )  uniquely in  N n ( A ) } =1.

This θ ˆ n =( β ˆ n , λ ˆ n ) is a QML estimator for θ 0 . It is clearly consistent, and

lim A lim inf n P { θ ˆ n N n ( A ) } =1.

Since θ ˆ n =( β ˆ n , λ ˆ n ) are ML estimators for θ 0 , σ ˆ n 2 is an ML estimator for σ 0 2 from (2.9).

To complete the proof, we will show that σ ˆ n 2 σ 0 2 as n. If θ ˆ n N n (A), then β ˆ n N n β (A) and λ ˆ n N n λ (A).

By (2.12) and (2.1), we have

ε ˆ t exp( λ ˆ n d) ε ˆ t 1 = ( x t exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n )+ ( e t exp ( λ ˆ n d ) e t 1 ) .
(6.8)

By (2.9), (2.11) and (6.8), we have

( n 1 ) σ ˆ n 2 = 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ε ˆ t exp ( λ ˆ n d ) ε ˆ t 1 ) 2 = 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ε ˆ t exp ( λ ˆ n d ) ε ˆ t 1 ) { ( x t exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n ) + ( e t exp ( λ ˆ n d ) e t 1 ) } = 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ε ˆ t exp ( λ ˆ n d ) ε ˆ t 1 ) ( x t exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n ) + 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ε ˆ t exp ( λ ˆ n d ) ε ˆ t 1 ) ( e t exp ( λ ˆ n d ) e t 1 ) = 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ε ˆ t exp ( λ ˆ n d ) ε ˆ t 1 ) ( e t exp ( λ ˆ n d ) e t 1 ) .
(6.9)

From (6.8), it follows that

t = 2 n { ( x t exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n ) } 2 = t = 2 n ( ε ˆ t exp ( λ ˆ n d ) ε ˆ t 1 ) 2 2 t = 2 n ( ε ˆ t exp ( λ ˆ n d ) ε ˆ t 1 ) ( e t exp ( λ ˆ n d ) e t 1 ) + t = 2 n ( e t exp ( λ ˆ n d ) e t 1 ) 2 .
(6.10)

From (2.2), we get

t = 2 n ( e t exp ( λ ˆ n d ) e t 1 ) 2 = t = 2 n ( exp ( λ 0 d ) e t 1 + σ 0 1 exp ( 2 λ 0 d ) 2 λ 0 η t exp ( λ ˆ n d ) e t 1 ) 2 = σ 0 2 1 exp ( 2 λ 0 d ) 2 λ 0 t = 2 n η t 2 + t = 2 n ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) 2 e t 1 2 + 2 σ 0 1 exp ( 2 λ 0 d ) 2 λ 0 t = 2 n ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) η t e t 1 .
(6.11)

By (6.9)-(6.11), we have

( n 1 ) σ ˆ n 2 = 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( e t exp ( λ ˆ n d ) e t 1 ) 2 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ( x t exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n ) ) 2 = 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) σ 0 2 1 exp ( 2 λ 0 d ) 2 λ 0 t = 2 n η t 2 + 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) 2 e t 1 2 + 2 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) σ 0 1 exp ( 2 λ 0 d ) 2 λ 0 t = 2 n ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) η t e t 1 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ( x t exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n ) ) 2 = T 1 + T 2 + 2 T 3 T 4 .
(6.12)

By the law of large numbers and λ ˆ n p λ, we have

1 n 1 T 1 = 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) 1 exp ( 2 λ 0 d ) 2 λ 0 σ 0 2 1 n 1 t = 2 n η t 2 p σ 0 2 2 λ n 1 exp ( 2 λ n d ) 1 exp ( 2 λ 0 d ) 2 λ 0 = σ 0 2 ( n ) .
(6.13)

By the Markov inequality, and noting that E T 2 C A 2 , we obtain

1 n 1 T 2 p 0(n).
(6.14)

Since {(exp( λ 0 d)exp( λ ˆ n d)) η t e t 1 , H t 1 } is a martingale difference sequence with

Var { ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) η t e t 1 } = ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) 2 E e t 1 2 ,

so we have

Var ( T 3 ) = t = 2 n E ( ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) η t e t 1 ) 2 = t = 2 n ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) 2 E e t 1 2 C ( λ 0 λ ˆ n ) 2 t = 2 n E e t 1 2 C A 2 .
(6.15)

By the Chebychev inequality, we have

1 n 1 T 3 p 0(n).
(6.16)

By (5.33), we have

T 4 = t = 2 n ( ( x t T ( β 0 β ˆ n ) exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n ) ) 2 2 t = 2 n ( x t T ( β 0 β ˆ n ) ) 2 + t = 2 n ( exp ( λ ˆ n d ) x t 1 T ( β 0 β ˆ n ) ) 2 = o ( n ) .
(6.17)

From (6.12)-(6.14), (6.16) and (6.17), we have σ ˆ n 2 σ 0 2 .

We therefore complete the proof of Theorem 3.1. □

Proof of Theorem 3.2 From Theorem 3.1, $S_n(\hat\theta_n)=0$ and $F_n(\hat\theta_n)$ is nonsingular. By the Taylor expansion, we have

$$0 = S_n(\hat\theta_n) = S_n(\theta_0) - F_n(\tilde\theta_n)(\hat\theta_n - \theta_0).$$
(6.18)

Since θ ˆ n N n (A), also θ ˜ n N n (A). By (5.11), we have

$$F_n(\tilde\theta_n) = D_n^{1/2}(\Phi_n + \tilde A_n)D_n^{T/2},$$
(6.19)

where $\tilde A_n$ is a symmetric matrix with $\tilde A_n\xrightarrow{p}0$. By (6.18) and (6.19), we have

$$D_n^{T/2}(\hat\theta_n-\theta_0) = D_n^{T/2}F_n^{-1}(\tilde\theta_n)S_n(\theta_0) = (\Phi_n+\tilde A_n)^{-1}D_n^{-1/2}S_n(\theta_0).$$
(6.20)

Similar to (6.20), we have

F n ( θ ˆ n ) = D n 1 2 ( Φ n + A ˆ n ) D n T 2 = ( D n 1 2 ( Φ n + A ˆ n ) 1 2 ) ( ( Φ n + A ˆ n ) T 2 D n T 2 ) = F n 1 2 ( θ ˆ n ) F n T 2 ( θ ˆ n ) .
(6.21)

Here A ˆ n p 0. By (6.20), (6.21), and noting that σ ˆ n 2 p σ 0 2 and D n 1 2 S n ( θ 0 )= O p (1), we obtain that

F n T 2 ( θ ˆ n ) ( θ ˆ n θ 0 ) / σ ˆ n = ( Φ n + A ˆ n ) 1 2 ( Φ n + A ˜ n ) 1 D n 1 2 S n ( θ 0 ) / σ ˆ n = Φ n 1 2 D n 1 2 S n ( θ 0 ) / σ 0 + o p ( 1 ) .
(6.22)

From (2.7) and (2.8), we have

S n ( θ 0 ) σ 0 = 2 λ 0 1 exp ( 2 λ 0 d ) { t = 2 n η t ( x t exp ( λ 0 d ) x t 1 ) , d exp ( λ 0 d ) t = 2 n η t e t 1 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) ] 4 λ 0 2 t = 2 n ( η t 2 1 ) } .
(6.23)

From (5.14) and (5.15), we have

Φ n 1 2 D n 1 2 = ( ( λ ( 1 exp ( 2 d λ 0 ) ) λ 0 ( 1 exp ( 2 d λ ) ) ) 1 2 I d 0 0 Δ n ( θ 0 , σ 0 ) σ 2 2 Ψ n λ 2 | θ = θ 0 ) ( ( 2 λ 0 1 exp ( 2 d λ 0 ) ) 1 2 X n 1 2 ( λ 0 ) 0 0 1 Δ n ( θ 0 , σ 0 ) ) = ( ( 2 λ 1 exp ( 2 d λ ) ) 1 2 X n 1 2 ( λ 0 ) 0 0 1 σ 2 2 Ψ n λ 2 | θ = θ 0 ) .
(6.24)

By (6.23) and (6.24), we have

Φ n 1 2 D n 1 2 S n ( θ 0 ) / σ 0 = 2 λ 0 1 exp ( 2 λ 0 d ) { ( 2 λ 1 exp ( 2 d λ ) ) 1 2 t = 2 n η t X n 1 2 ( θ 0 ) ( x t exp ( λ 0 d ) x t 1 ) , 1 σ 2 2 Ψ n λ 2 | θ = θ 0 [ d exp ( λ 0 d ) t = 2 n η t e t 1 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) ] 4 λ 0 2 t = 2 n ( η t 2 1 ) ] } .
(6.25)

Let u R d with |u|=1, and

a t n =u ( 2 λ 1 exp ( 2 d λ ) ) 1 2 X n 1 2 ( λ 0 ) ( x t exp ( λ 0 d ) x t 1 ) .

Then max 2 t n a t n =o(1), and we will consider the limiting distribution of the following 2-vector

2 λ 0 1 exp ( 2 λ 0 d ) { t = 2 n a t n η t , 1 σ 2 2 Ψ n λ 2 | θ = θ 0 [ d exp ( λ 0 d ) t = 2 n η t e t 1 + 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) ] 4 λ 0 2 t = 2 n ( η t 2 1 ) ] } .
(6.26)

Note that

σ 2 2 Ψ n λ 2 | θ = θ 0 = O p ( Δ n ( θ 0 , σ 0 ) ) = O p (n).

Hence, by the Cramer-Wold device, it will suffice to find the asymptotic distribution of the following random

2 λ 0 1 exp ( 2 λ 0 d ) t = 2 n { u 1 a t n η t u 2 Δ n ( θ 0 , σ 0 ) [ d exp ( λ 0 d ) η t e t 1 + 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) ] 4 λ 0 2 ( η t 2 1 ) ] } = t = 2 n ζ t ,
(6.27)

where ( u 1 , u 2 ) R 2 with u 1 2 + u 2 2 =1. Note that

E { ζ t | H t 1 } = 2 λ 0 1 exp ( 2 λ 0 d ) { u 1 a t n E ( η t ) u 2 Δ n ( θ 0 , σ 0 ) [ d exp ( λ 0 d ) E ( η t ) e t 1 + 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) ] 4 λ 0 2 E ( η t 2 1 ) ] } = 0 , a.s. ,
(6.28)

so the sums in (6.27) are partial sums of a martingale triangular array to H t , and we will verify the Lindeberg conditions for their convergence to normality.

By (6.27), and noting that E η t 3 =0, E η t 4 =3 and λ N n λ (A), we have

t = 2 n E ( ζ t 2 | H t 1 ) = 2 λ 0 1 exp ( 2 λ 0 d ) { u 1 2 t = 2 n a t n 2 + u 2 2 1 Δ n ( θ 0 , σ 0 ) [ d 2 exp ( 2 λ 0 d ) t = 2 n e t 1 2 + 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 2 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) ] 2 16 λ 0 2 t = 2 n E ( η t 2 1 ) 2 ] 2 t = 2 n u 1 u 2 n d exp ( λ 0 d ) a t n e t 1 + 2 u 2 2 n d exp ( λ 0 d ) 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) ] 4 λ 0 2 E ( η t ( η t 2 1 ) ) e t 1 2 u 1 a t n u 2 n 2 λ 0 1 exp ( 2 λ 0 d ) σ 0 [ 1 ( 1 + 2 d λ 0 ) exp ( 2 λ 0 d ) ] 4 λ 0 2 E ( η t ( η t 2 1 ) ) } = u 1 2 u 2 2 λ 0 1 exp ( 2 λ 0 d ) ( 2 λ 1 exp ( 2 d λ ) ) 1 + u 2 2 + o p ( 1 ) + 0 + 0 = u 1 2 + u 2 2 + o p ( 1 ) = 1 + o p ( 1 ) .
(6.29)

Let a ˜ t n =min{ a t n , 1 Δ n ( θ 0 , σ 0 ) } and ζ t = a ˜ t n ζ ˜ t . Then a ˜ t n =o(1).

For any c>0,

t = 2 n E { ζ t 2 I ( | ζ t | > c ) | H t 1 } = t = 2 c y 2 d P { | a ˜ t n ζ ˜ t | y | H t 1 } = t = 2 n a ˜ t n 2 c a ˜ t n y 2 d p { | ζ ˜ t | y | H t 1 } = o ( 1 ) t = 2 n a ˜ t n 2 = o ( 1 ) O p ( 1 ) 0 , n .
(6.30)

This verifies the Lindeberg conditions, and by Lemma 5.5, we have

t = 2 n ζ t D N(0,1).

Thus we complete the proof of Theorem 3.2. □

Proof of Theorem 4.1 Note that λ ˆ 0 n λ 0 , λ ˆ n λ 0 . Similarly to the proof of Theorem 4.1(3) in Maller [55], by (6.12) and Theorem 3.2, we have

d ˜ ( n ) = 1 σ 0 2 { ( 2 λ ˆ 0 n 1 exp ( 2 λ ˆ 0 n d ) 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) ) σ 0 2 1 exp ( 2 λ 0 d ) 2 λ 0 t = 2 n η t 2 + t = 2 n { 2 λ ˆ 0 n 1 exp ( 2 λ ˆ 0 n d ) ( exp ( λ 0 d ) exp ( λ ˆ 0 n d ) ) 2 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) 2 } e t 1 2 + 2 σ 0 1 exp ( 2 λ 0 d ) 2 λ 0 t = 2 n { 2 λ ˆ 0 n 1 exp ( 2 λ ˆ 0 n d ) ( exp ( λ 0 d ) exp ( λ ˆ 0 n d ) ) 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) ( exp ( λ 0 d ) exp ( λ ˆ n d ) ) } η t e t 1 + 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ( x t exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n ) ) 2 } = 1 σ 0 2 2 λ ˆ n 1 exp ( 2 λ ˆ n d ) t = 2 n ( ( x t exp ( λ ˆ n d ) x t 1 ) T ( β 0 β ˆ n ) ) 2 + o ( 1 ) D χ 2 ( m ) .
(6.31)

 □

7 Empirical examples

In this section, we consider two empirical examples. The first (where $\beta$ is a one-dimensional unknown parameter, namely $m=1$) concerns the water flow of the Kootenay River in January, taken from Hampel et al. [[6], p.310]. The second (where $\beta$ is a four-dimensional unknown parameter, namely $m=4$) concerns the consumption of spirits in the United Kingdom, taken from Fuller [48].

7.1 Water flow of the Kootenay River

By the ordinary least squares method, we obtain that

$$\hat y_t = 9.51371 + 0.47476\,x_t + \hat\varepsilon_t$$
(7.1)

and

$$\varepsilon_t = 0.2077\,\varepsilon_{t-1} + \eta_t,\qquad t=1,2,\ldots,13,$$
(7.2)

where $\{\eta_t\}$ is a sequence of uncorrelated $(0, 1.5013^2)$ random variables.

By the Huber-Dutter (HD) method, we obtain the following model (see Hu [10]):

$$\hat y_t = 9.51371 + 0.4745\,x_t + \hat\varepsilon_t$$
(7.3)

and

$$\varepsilon_t = 0.3024\,\varepsilon_{t-1} + \eta_t,$$
(7.4)

where $\{\eta_t\}$ is a sequence of uncorrelated $(0, 1.0988^2)$ random variables.

By the ML method (taking $d=1$ and starting values $\lambda^{(0)}=1$, $(\sigma^2)^{(0)}=1.5$, $\beta^{(0)}=0.5$; here we use pattern search algorithms), we obtain the following model:

$$\hat y_t = 9.51371 + 0.48039\,x_t + \hat\varepsilon_t$$
(7.5)

and

$$\varepsilon_t = \exp(-1.80089)\,\varepsilon_{t-1} + 0.5184\,\eta_t,$$
(7.6)

where $\{\eta_t\}$ is a sequence of uncorrelated $(0,1)$ random variables.

By model (1.3), we obtain a general process { y t } satisfying the following SDE:

$$d(y_t - 9.51371) = \bigl(1.3455\,x_t + 9.51371 - y_t\bigr)\,dt + 0.9976\,dB_t.$$
(7.7)
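
The diffusion coefficient in (7.7) can be recovered from the fitted discrete recursion (7.6) by inverting the innovation-variance relation in (1.5), i.e. innovation s.d. $= \sigma\sqrt{(1-e^{-2\lambda d})/(2\lambda)}$. A minimal sketch of this conversion (NumPy assumed; the function name is ours):

```python
import numpy as np

def diffusion_coefficient(lam, innov_sd, d=1.0):
    """Recover sigma in the SDE from the fitted discrete recursion
    eps_t = exp(-lam*d)*eps_{t-1} + innov_sd*eta_t, using
    innov_sd = sigma * sqrt((1 - exp(-2*lam*d)) / (2*lam)) from (1.5)."""
    return innov_sd / np.sqrt((1.0 - np.exp(-2.0 * lam * d)) / (2.0 * lam))

print(diffusion_coefficient(1.80089, 0.5184))  # approximately 0.998, as in (7.7)
print(diffusion_coefficient(0.25319, 0.0196))  # approximately 0.022, as in (7.12) of Section 7.2
```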

Since $1.5013^2 > 1.0988^2 > 0.5184^2$, our results outperform those of the HD and least squares methods in terms of mean squared error (MSE).

By (4.7), we obtain $\tilde d(13) = 362.4137 > 6.63 = \chi^2_{1-0.01}(1)$. This shows that $\beta\neq 0$ at the significance level $\alpha=0.01$. Thus we should apply the linear regression model (1.1) with an Ornstein-Uhlenbeck error process, rather than the Ornstein-Uhlenbeck process alone, to these data.

This shows that our estimation method and testing approach work in the case $m=1$. The following example illustrates the case of a multidimensional parameter $\beta$.

7.2 Consumption of spirits in the UK

We will use the data studied by Fuller [48]. The data pertain to the consumption of spirits in the United Kingdom from 1870 to 1938. The dependent variable $y_t$ is the annual per capita consumption of spirits in the United Kingdom. The explanatory variables $x_{t1}$ and $x_{t2}$ are per capita income and the price of spirits, respectively, both deflated by a general price index. All data are in logarithms. The model suggested by Prest can be written as follows:

$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \beta_3 x_{t3} + \beta_4 x_{t4} + \varepsilon_t,$$
(7.8)

where 1869 is the origin for $t$, $x_{t3} = t/100$, $x_{t4} = (t-35)^2/10^4$, and $\varepsilon_t$ is assumed to be a stationary time series.

Fuller [48] obtained the estimated generalized least squares equation

$$\hat y_t = 2.36 + 0.72\,x_{t1} - 0.80\,x_{t2} - 0.81\,x_{t3} - 0.92\,x_{t4}$$
(7.9)

and

$$\varepsilon_t = 0.7633\,\varepsilon_{t-1} + \eta_t,$$

where $\{\eta_t\}$ is a sequence of uncorrelated $(0, 0.000417)$ random variables.

Take $d=1$ and starting values

$$\lambda^{(0)} = 0.3,\qquad (\sigma^2)^{(0)} = 0.0004,\qquad \beta^{(0)} = (0.72,\,-0.80,\,-0.81,\,-0.92)^T.$$

Using our method, we obtain the following models:

$$\hat y_t = 2.36 + 0.73251\,x_{t1} - 0.80024\,x_{t2} - 0.86286\,x_{t3} - 0.60774\,x_{t4}$$
(7.10)

and

$$\varepsilon_t = \exp(-0.25319)\,\varepsilon_{t-1} + 0.0196\,\eta_t,$$
(7.11)

where $\{\eta_t\}$ is a sequence of uncorrelated $(0,1)$ random variables; or

$$d\varepsilon_t = -0.25319\,\varepsilon_t\,dt + 0.0221\,dB_t.$$
(7.12)

Since $0.000417 > 0.00038461$, our results outperform those of Fuller [48] in terms of MSE.

By (4.7), we obtain $\tilde d(69) = 100.2777 > 13.3 = \chi^2_{1-0.01}(4)$. This shows that $\beta\neq 0$ at the significance level $\alpha=0.01$.

References

1. Wang XM, Zhou W: Bootstrap approximation to the distribution of M-estimates in a linear model. Acta Math. Sin. Engl. Ser. 2004, 20(1): 93-104.
2. Anatolyev S: Inference in regression models with many regressors. J. Econom. 2012, 170: 368-382.
3. Bai ZD, Guo M: A paradox in least-squares estimation of linear regression models. Stat. Probab. Lett. 1999, 42: 167-174.
4. Chen X: Consistency of LS estimates of multiple regression under a lower order moment condition. Sci. China Ser. A 1995, 38(12): 1420-1431.
5. Gil GR, Engela B, Norberto C, Ana C: Least squares estimation of linear regression models for convex compact random sets. Adv. Data Anal. Classif. 2007, 1: 67-81.
6. Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA: Robust Statistics. Wiley, New York; 1986.
7. Cui H: On asymptotics of t-type regression estimation in multiple linear model. Sci. China Ser. A 2004, 47(4): 628-639.
8. Durbin L: A note on regression when there is extraneous information about one of the coefficients. J. Am. Stat. Assoc. 1953, 48: 799-808.
9. Li Y, Yang H: A new stochastic mixed ridge estimator in linear regression model. Stat. Pap. 2010, 51(2): 315-323.
10. Hu HC: Asymptotic normality of Huber-Dutter estimators in a linear model with AR(1) processes. J. Stat. Plan. Inference 2013, 143(3): 548-562.
11. Wu WB: M-estimation of linear models with dependent errors. Ann. Stat. 2007, 35(2): 495-521.
12. Fox R, Taqqu MS: Large sample properties of parameter estimates for strongly dependent stationary Gaussian time series. Ann. Stat. 1986, 14: 517-532.
13. Giraitis L, Surgailis D: A central limit theorem for quadratic forms in strongly dependent linear variables and its application to asymptotic normality of Whittle’s estimate. Probab. Theory Relat. Fields 1990, 86: 87-104.
14. Koul HL, Surgailis D: Asymptotic normality of the Whittle estimator in linear regression models with long memory errors. Stat. Inference Stoch. Process. 2000, 3: 129-147.
15. Shiohama T, Taniguchi M: Sequential estimation for time series regression models. J. Stat. Plan. Inference 2004, 123: 295-312.
16. Fan J: Moderate deviations for M-estimators in linear models with ϕ-mixing errors. Acta Math. Sin. Engl. Ser. 2012, 28(6): 1275-1294.
17. Ornstein LS, Uhlenbeck GE: On the theory of Brownian motion. Phys. Rev. 1930, 36: 823-841.
18. Janczura J, Orzel S, Wylomanska A: Subordinated α-stable Ornstein-Uhlenbeck process as a tool for financial data description. Physica A 2011, 390: 4379-4387.
19. Debbasch F, Mallick K, Rivet JP: Relativistic Ornstein-Uhlenbeck process. J. Stat. Phys. 1997, 88: 945-966.
20. Gillespie D: Exact numerical simulation of the Ornstein-Uhlenbeck process and its integral. Phys. Rev. E 1996, 54(2): 2084-2091.
21. Ditlevsen S, Lansky P: Estimation of the input parameters in the Ornstein-Uhlenbeck neuronal model. Phys. Rev. E 2005, 71(1): Article ID 011907.
22. Garbaczewski P, Olkiewicz R: Ornstein-Uhlenbeck-Cauchy process. J. Math. Phys. 2000, 41(10): 6843-6860.
23. Plastino AR, Plastino A: Non-extensive statistical mechanics and generalized Fokker-Planck equation. Physica A 1995, 222: 347-354.
24. Fasen V: Statistical estimation of multivariate Ornstein-Uhlenbeck processes and applications to co-integration. J. Econom. 2012. doi:10.1016/j.jeconom.2012.08.019
25. Yu J: Bias in the estimation of the mean reversion parameter in continuous time models. J. Econom. 2012, 169: 114-122.
26. Geman H: Commodities and Commodity Derivatives. Wiley, Chichester; 2005.
27. Zhang B, Grzelak LA, Oosterlee CM: Efficient pricing of commodity options with early-exercise under the Ornstein-Uhlenbeck process. Appl. Numer. Math. 2012, 62: 91-111.
28. Rieder S: Robust parameter estimation for the Ornstein-Uhlenbeck process. Stat. Methods Appl. 2012. doi:10.1007/s10260-012-0195-2
29. Iacus S: Simulation and Inference for Stochastic Differential Equations. Springer, New York; 2008.
30. Bishwal JPN: Uniform rate of weak convergence of the minimum contrast estimator in the Ornstein-Uhlenbeck process. Methodol. Comput. Appl. Probab. 2010, 12: 323-334.
31. Shimizu Y: Local asymptotic mixed normality for discretely observed non-recurrent Ornstein-Uhlenbeck processes. Ann. Inst. Stat. Math. 2012, 64: 193-211.
32. Zhang S, Zhang X: A least squares estimator for discretely observed Ornstein-Uhlenbeck processes driven by symmetric α-stable motions. Ann. Inst. Stat. Math. 2012. doi:10.1007/s10463-012-0362-0
33. Chronopoulou A, Viens FG: Estimation and pricing under long-memory stochastic volatility. Ann. Finance 2012, 8: 379-403.
34. Lin H, Wang J: Successful couplings for a class of stochastic differential equations driven by Levy processes. Sci. China Math. 2012, 55(8): 1735-1748.
35. Xiao W, Zhang W, Zhang X: Minimum contrast estimator for fractional Ornstein-Uhlenbeck processes. Sci. China Math. 2012, 55(7): 1497-1511.
36. Magdalinos T: Mildly explosive autoregression under weak and strong dependence. J. Econom. 2012, 169: 179-187.
37. Andrews DWK, Guggenberger P: Asymptotics for LS, GLS, and feasible GLS statistics in an AR(1) model with conditional heteroskedasticity. J. Econom. 2012, 169: 196-210.
38. Fan J, Yao Q: Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York; 2005.
39. Berk KN: Consistent autoregressive spectral estimates. Ann. Stat. 1974, 2: 489-502.
40. Goldenshluger A, Zeevi A: Non-asymptotic bounds for autoregressive time-series modeling. Ann. Stat. 2001, 29: 417-444.
41. Liebscher E: Strong convergence of estimators in nonlinear autoregressive models. J. Multivar. Anal. 2003, 84: 247-261.
42. Baran S, Pap G, Zuijlen MV: Asymptotic inference for unit roots in spatial triangular autoregression. Acta Appl. Math. 2007, 96: 17-42.
43. Distaso W: Testing for unit root processes in random coefficient autoregressive models. J. Econom. 2008, 142: 581-609.
44. Harvill JL, Ray BK: Functional coefficient autoregressive models for vector time series. Comput. Stat. Data Anal. 2008, 50: 3547-3566.
45. Dehling H, Franke B, Kott T: Drift estimation for a periodic mean reversion process. Stat. Inference Stoch. Process. 2010, 13: 175-192.
46. Maller RA: Asymptotics of regressions with stationary and nonstationary residuals. Stoch. Process. Appl. 2003, 105: 33-67.
47. Pere P: Adjusted estimates and Wald statistics for the AR(1) model with constant. J. Econom. 2000, 98: 335-363.
48. Fuller WA: Introduction to Statistical Time Series. 2nd edition. Wiley, New York; 1996.
49. Chambers MJ: Jackknife estimation of stationary autoregressive models. J. Econom. 2012. doi:10.1016/j.jeconom.2012.09.003
50. Hamilton JD: Time Series Analysis. Princeton University Press, Princeton; 1994.
51. Brockwell PJ, Davis RA: Time Series: Theory and Methods. Springer, New York; 1987.
52. Abadir KM, Lucas A: A comparison of minimum MSE and maximum power for the nearly integrated non-Gaussian model. J. Econom. 2004, 119: 45-71.
53. Fan JQ, Jiang JC: Nonparametric inference with generalized likelihood ratio tests. Test 2007, 16: 409-444.
54. Rao CR: Linear Statistical Inference and Its Applications. Wiley, New York; 1973.
55. Maller RA: Quadratic negligibility and the asymptotic normality of operator normed sums. J. Multivar. Anal. 1993, 44: 191-219.
56. Hall P, Heyde CC: Martingale Limit Theory and Its Application. Academic Press, New York; 1980.


Acknowledgements

This work was supported by the Natural Science Foundation of China (No. 41374017), and Science and Technology Research Projects of the Educational Department of Hubei Province (No. Q20142501).

Author information

Correspondence to Xiong Pan.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.

Rights and permissions

Open Access  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.

Cite this article

Hu, H., Pan, X. & Xu, L. Maximum likelihood estimators in linear regression models with Ornstein-Uhlenbeck process. J Inequal Appl 2014, 301 (2014). https://doi.org/10.1186/1029-242X-2014-301