# Wasserstein bounds in CLT of approximative MCE and MLE of the drift parameter for Ornstein-Uhlenbeck processes observed at high frequency

## Abstract

This paper deals with the rate of convergence for the central limit theorem of estimators of the drift coefficient, denoted θ, for the Ornstein-Uhlenbeck process $$X := \{X_{t},t\geq 0\}$$ observed at high frequency. We provide an approximate minimum contrast estimator and an approximate maximum likelihood estimator of θ, namely $$\widetilde{\theta}_{n}:= {1}/{ (\frac{2}{n} \sum_{i=1}^{n}X_{t_{i}}^{2} )}$$, and $$\widehat{\theta}_{n}:= -{\sum_{i=1}^{n} X_{t_{i-1}} (X_{t_{i}}-X_{t_{i-1}} )}/{ (\Delta _{n} \sum_{i=1}^{n} X_{t_{i-1}}^{2} )}$$, respectively, where $$t_{i} = i \Delta _{n}$$, $$i=0,1,\ldots , n$$, $$\Delta _{n}\rightarrow 0$$. We provide Wasserstein bounds in the central limit theorem for $$\widetilde{\theta}_{n}$$ and $$\widehat{\theta}_{n}$$.

## 1 Introduction

Let $$X:= \{X_{t}, t \geq 0 \}$$ be the Ornstein-Uhlenbeck (OU) process driven by Brownian motion $$\{W_{t},t\geq 0 \}$$. More precisely, X is the solution of the following linear stochastic differential equation

$$X_{0}=0;\qquad dX_{t}=-\theta X_{t}\,dt+dW_{t}, \quad t\geq 0,$$
(1.1)

where $$\theta >0$$ is an unknown parameter.

The drift parametric estimation for the OU process (1.1) has been widely studied in the literature. Several methods are available for estimating the parameter θ in (1.1), such as maximum likelihood estimation, least squares estimation, and minimum contrast estimation; we refer to the monographs [14, 15]. While there is an extensive literature on the asymptotic distribution of estimators of θ based on discrete observations of X, only a few works have been dedicated to the rates of weak convergence of the distributions of the estimators to the standard normal distribution.

From a practical point of view, in parametric inference, it is more realistic and interesting to consider asymptotic estimation for (1.1) based on discrete observations. Thus, let us assume that the process X given in (1.1) is observed equidistantly in time with the step size $$\Delta _{n}$$: $$t_{i}=i \Delta _{n}$$, $$i=0, \ldots , n$$, and $$T=n \Delta _{n}$$ denotes the length of the “observation window”. Here we are concerned with the approximate minimum contrast estimator (AMCE)

$$\widetilde{\theta}_{n}:=\frac{1}{\frac{2}{n} \sum_{i=1}^{n}X_{t_{i}}^{2}},$$

and the approximate maximum likelihood estimator (AMLE)

$$\widehat{\theta}_{n}:=- \frac{\sum_{i=1}^{n} X_{t_{i-1}} (X_{t_{i}}-X_{t_{i-1}} )}{\Delta _{n} \sum_{i=1}^{n} X_{t_{i-1}}^{2}},$$

which are discrete versions of the minimum contrast estimator (MCE) and the maximum likelihood estimator (MLE) defined as follows:

$$\bar{\theta}_{T}:= \frac{1}{\frac{2}{T}\int _{0}^{T} X_{s}^{2} \,\mathrm{d} s},\qquad \check{ \theta}_{T}=- \frac{\int _{0}^{T} X_{s} \,\mathrm{d} X_{s}}{\int _{0}^{T} X_{s}^{2} \,\mathrm{d} s}, \quad T\geq 0.$$
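For concreteness, both discrete estimators are immediate to compute from a simulated path. The following sketch is an illustration of ours, not part of the original analysis: the parameter values and the exact AR(1) simulation scheme are our own choices. It simulates X on the grid $$t_{i}=i\Delta _{n}$$ and evaluates $$\widetilde{\theta}_{n}$$ and $$\widehat{\theta}_{n}$$.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 1.0                    # true drift (chosen for illustration)
n, dt = 100_000, 0.01          # high-frequency grid: T = n*dt = 1000

# Exact OU transition: X_{t_{i+1}} = e^{-theta*dt} X_{t_i} + Gaussian noise
a = np.exp(-theta * dt)
sd = np.sqrt((1 - a**2) / (2 * theta))
X = np.empty(n + 1)
X[0] = 0.0                     # X_0 = 0 as in (1.1)
for i in range(n):
    X[i + 1] = a * X[i] + sd * rng.standard_normal()

# AMCE: reciprocal of twice the empirical second moment
theta_amce = 1.0 / (2.0 / n * np.sum(X[1:] ** 2))

# AMLE: discretized score equation
theta_amle = -np.sum(X[:-1] * (X[1:] - X[:-1])) / (dt * np.sum(X[:-1] ** 2))
print(theta_amce, theta_amle)  # both close to theta when dt is small and T is large
```

Both estimates concentrate around θ at the rate $$\sqrt{2\theta /T}$$, as the central limit theorems discussed below make precise.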

Recall that, for two random variables X and Y, the Wasserstein metric is given by

\begin{aligned} d_{W} ( X,Y ) :=\sup_{f\in \operatorname{Lip}(1)} \bigl\vert E \bigl[f(X)\bigr]-E \bigl[f(Y)\bigr] \bigr\vert , \end{aligned}

where $$\operatorname{Lip}(1)$$ is the set of all Lipschitz functions with the Lipschitz constant 1.
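For real-valued random variables, $$d_{W}$$ coincides with the $$L^{1}$$ distance between quantile functions, so it can be estimated from two equally sized samples by pairing sorted values. The following sketch is our own illustration (the exponential-sum example is not from the paper):

```python
import numpy as np

def wasserstein1(x, y):
    # Empirical d_W between two samples of equal size:
    # on the real line the optimal coupling pairs sorted values.
    return np.abs(np.sort(x) - np.sort(y)).mean()

rng = np.random.default_rng(1)
m = 200_000
gauss = rng.standard_normal(m)
# Standardized sum of 30 i.i.d. exponentials: close to N(0,1) by the CLT
sums = (rng.exponential(size=(m, 30)).sum(axis=1) - 30.0) / np.sqrt(30.0)
d = wasserstein1(sums, gauss)
print(d)  # small but nonzero: the CLT error at n = 30
```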

Rates of convergence in the central limit theorem of the MCE $$\bar{\theta}_{T}$$ and MLE $$\check{\theta}_{T}$$ under the Kolmogorov and Wasserstein distances have been studied as follows: There exist $$c, C>0$$ depending only on θ such that

\begin{aligned}& \sup_{x \in \mathbb{R}} \biggl\vert P \biggl(\sqrt{ \frac{T}{2 \theta}} (\bar{\theta}_{T}-\theta ) \leqslant x \biggr)-P (\mathcal{N}\leqslant x ) \biggr\vert \leq \frac{C}{\sqrt{T}}, \quad \text{see [3, Theorem 2.5],}\\& d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\bar{ \theta}_{T}-\theta ), \mathcal{N} \biggr) \leq \frac{C}{\sqrt{T}}, \quad \text{see [7, Theorem 5.4]},\\& \frac{c}{\sqrt{T}}\leq \sup_{x \in \mathbb{R}} \biggl\vert P \biggl( \sqrt{ \frac{T}{2 \theta}} (\check{\theta}_{T}-\theta ) \leqslant x \biggr)-P (\mathcal{N}\leqslant x ) \biggr\vert \leq \frac{C}{\sqrt{T}}, \quad \text{see [12, Theorems 1 and 2],}\\& d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\check{ \theta}_{T}-\theta ), \mathcal{N} \biggr) \leq \frac{C}{\sqrt{T}}, \quad \text{see [8, Theorem 1] for fixed N=1}, \end{aligned}

where $$\mathcal{N}\sim \mathcal{N} (0,1 )$$ denotes a standard normal random variable.

The purpose of this manuscript is to derive upper bounds of the Wasserstein distance for the rates of convergence of the distribution of the AMCE $$\widetilde{\theta}_{n}$$ and the AMLE $$\widehat{\theta}_{n}$$. These estimators are unbiased, and we show that they are consistent and admit a central limit theorem as $$\Delta _{n}\rightarrow 0$$ and $$T\rightarrow \infty$$. Moreover, we bound the rate of convergence to the normal distribution in terms of the Wasserstein distance.

Note that the papers [2] and [4] provided explicit upper bounds for the Kolmogorov distance for the rates of convergence of the distributions of $$\widetilde{\theta}_{n}$$ and $$\widehat{\theta}_{n}$$, respectively. On the other hand, [7] provided Wasserstein bounds in the central limit theorem for $$\widetilde{\theta}_{n}$$. Let us describe what is proved in this direction:

• Theorem 2.1 in [2] shows that there exists $$C>0$$ depending on θ such that

\begin{aligned} \sup_{x \in \mathbb{R}} \biggl\vert P \biggl(\sqrt{ \frac{T}{2 \theta}} (\widetilde{\theta}_{n}-\theta ) \leqslant x \biggr)-P (\mathcal{N}\leqslant x ) \biggr\vert \leq C \max \biggl(\sqrt{ \frac{\log T}{T}},\frac{T^{4}}{n^{2}\log T} \biggr). \end{aligned}
(1.2)
• Theorem 2.3 in [4] proves that there exists $$C>0$$ depending on θ such that

\begin{aligned} \sup_{x \in \mathbb{R}} \biggl\vert P \biggl(\sqrt{ \frac{T}{2 \theta}} (\widehat{\theta}_{n}-\theta ) \leqslant x \biggr)-P (\mathcal{N}\leqslant x ) \biggr\vert \leq C \max \biggl(\sqrt{ \frac{\log T}{T}},\frac{T^{2}}{n\log T} \biggr). \end{aligned}
(1.3)
• Theorem 5.4 in [7] establishes that there exists $$C>0$$ depending on θ such that

\begin{aligned} d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\widetilde{ \theta}_{n}- \theta ), \mathcal{N} \biggr) \leq C\max \biggl( \frac{1}{\sqrt{T}},\sqrt{\frac{T^{2}}{n}} \biggr). \end{aligned}
(1.4)

### Remark 1.1

Note that in [2, Theorem 2.1], [4, Theorem 2.3], and [7, Theorem 5.4], the asymptotic normality of the distributions of $$\widetilde{\theta}_{n}$$ and $$\widehat{\theta}_{n}$$ requires $$n\Delta _{n}^{2}=\frac{T^{2}}{n} \rightarrow 0$$ and $$T \rightarrow \infty$$. However, Theorem 3.6 and Theorem 4.1, which are stated and proved below, show that the asymptotic normality of the distributions of $$\widetilde{\theta}_{n}$$ and $$\widehat{\theta}_{n}$$, respectively, only requires $$\Delta _{n}=\frac{T}{n} \rightarrow 0$$ and $$T \rightarrow \infty$$.

The aim of the present paper is to provide new explicit bounds for the rate of convergence in the CLT of the estimators $$\widetilde{\theta}_{n}$$ and $$\widehat{\theta}_{n}$$ under the Wasserstein metric as follows: There exists a constant $$C>0$$ such that, for all $$n\geq 1$$, $$T>0$$,

\begin{aligned} d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\widetilde{ \theta}_{n}- \theta ), \mathcal{N} \biggr) \leq C \max \biggl( \frac{1}{\sqrt{T}},\frac{T^{2}}{n^{2}} \biggr), \end{aligned}
(1.5)

see Theorem 3.6, and

\begin{aligned} d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\widehat{ \theta}_{n}- \theta ),\mathcal{N} \biggr) \leq &C\max \biggl( \frac{1}{\sqrt{T}} , \sqrt{\frac{T^{3}}{n^{2}}} \biggr), \end{aligned}
(1.6)

see Theorem 4.1.

### Remark 1.2

The estimates (1.5) and (1.6) show that we have improved the bounds on the error of normal approximation for $$\widetilde{\theta}_{n}$$ and $$\widehat{\theta}_{n}$$. In other words, it is clear that the obtained bounds in (1.5) and (1.6) are sharper than the bounds in (1.2), (1.3), and (1.4).

To finish this introduction, we note the general structure of this paper. Section 2 contains some preliminaries presenting the tools needed for the analysis of the Wiener space, including Wiener chaos calculus and Malliavin calculus. Upper bounds for the rates of convergence of the distribution of the AMCE $$\widetilde{\theta}_{n}$$ and the AMLE $$\widehat{\theta}_{n}$$ are provided in Sect. 3 and Sect. 4, respectively.

## 2 Preliminaries

This section gives a brief overview of some useful facts from the Malliavin calculus on Wiener space. Some of the results presented here are essential for the proofs in the present paper. For our purposes, we focus on special cases that are relevant to our setting and omit the general high-level theory. We direct the interested reader to [18, Chap. 1] and [16, Chap. 2].

The first step is to identify the general centered Gaussian process $$(Z_{t})_{t\geq 0}$$ with an isonormal Gaussian process $$X = \{ X(h), h \in \mathcal{H}\}$$ for some Hilbert space $$\mathcal{H}$$, that is, X is a centered Gaussian family defined on a common probability space $$(\Omega , \mathcal{F}, P)$$ satisfying, for every $$h_{1}, h_{2} \in \mathcal{H}$$, $$E [ X(h_{1}) X(h_{2}) ] = \langle h_{1}, h_{2} \rangle _{\mathcal{H}}$$.

One can define $$\mathcal{H}$$ as the closure of real-valued step functions on $$[0, \infty )$$ with respect to the inner product $$\langle \mathbf{1}_{[0, t]}, \mathbf{1}_{[0, s]} \rangle _{ \mathcal{H}}= E[ Z_{t} Z_{s}]$$. Note that $$X(\mathbf{1}_{[0, t]}) \overset{d}{=} Z_{t}$$.

The next step involves the multiple Wiener-Itô integrals. The formal definition involves the concepts of the Malliavin derivative and divergence. We refer the reader to [18, Chap. 1] and [16, Chap. 2]. For our purposes, we define the multiple Wiener-Itô integral $$I_{p}$$ via the Hermite polynomials $$H_{p}$$. In particular, for $$h \in \mathcal{H}$$ with $$\lVert h \rVert _{\mathcal{H}}= 1$$, and any $$p \geq 1$$,

\begin{aligned} H_{p}\bigl(X(h)\bigr) = I_{p} \bigl(h^{\otimes p}\bigr). \end{aligned}

For $$p = 1$$ and $$p = 2$$, we have the following:

\begin{aligned} H_{1}\bigl(X(\mathbf{1}_{[0, t]})\bigr) = & X(\mathbf{1}_{[0, t]}) = I_{1}( \mathbf{1}_{[0, t]}) \overset{d}{=} Z_{t} , \end{aligned}
(2.1)
\begin{aligned} H_{2}\bigl(X (\mathbf{1}_{[0, t]})\bigr) = & X(\mathbf{1}_{[0, t]})^{2} - E\bigl[X( \mathbf{1}_{[0, t]})^{2}\bigr] = I_{2}\bigl( \mathbf{1}_{[0, t]}^{\otimes 2}\bigr) \overset{d}{=} Z_{t}^{2} - E\bigl[Z_{t}^{2}\bigr]. \end{aligned}
(2.2)

Note also that $$I_{0}$$ can be taken to be the identity operator.
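The identities (2.1)-(2.2) can be checked numerically for a standard normal variable, since $$H_{1}(x)=x$$ and $$H_{2}(x)=x^{2}-1$$ are the probabilists' Hermite polynomials. The following sketch is our own illustration using NumPy's `hermite_e` module; it also previews the orthogonality relation $$E[H_{p}(\mathcal{N})H_{q}(\mathcal{N})]=p!\,\delta _{pq}$$ that underlies the isometry property below.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

rng = np.random.default_rng(4)
g = rng.standard_normal(2_000_000)   # plays the role of X(h), ||h|| = 1

# Probabilists' Hermite polynomials H_1, H_2, H_3 evaluated at g:
# coefficient list [0]*p + [1] selects the degree-p basis polynomial
H = [hermeval(g, [0.0] * p + [1.0]) for p in range(1, 4)]

# Empirical Gram matrix: E[H_p H_q] is p! on the diagonal, 0 off it
gram = np.array([[np.mean(H[p] * H[q]) for q in range(3)] for p in range(3)])
print(np.round(gram, 2))  # approximately diag(1, 2, 6)
```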

Some notation for Hilbert spaces. Let $$\mathcal{H}$$ be a Hilbert space. Given an integer $$q \geq 2$$, the Hilbert spaces $$\mathcal{H}^{\otimes q}$$ and $$\mathcal{H}^{\odot q}$$ correspond to the qth tensor product and qth symmetric tensor product of $$\mathcal{H}$$. If $$f \in \mathcal{H}^{\otimes q}$$ is given by $$f = \sum_{j_{1}, \ldots , j_{q}} a(j_{1}, \ldots , j_{q}) e_{j_{1}} \otimes \cdots \otimes e_{j_{q}}$$, where $$(e_{j})_{j \geq 1}$$ is an orthonormal basis of $$\mathcal{H}$$, then the symmetrization is given by

\begin{aligned} \tilde{f} = \frac{1}{q!} \sum_{\sigma} \sum_{j_{1}, \ldots , j_{q}} a(j_{1}, \ldots , j_{q}) e_{j_{\sigma (1)}} \otimes \cdots \otimes e_{j_{\sigma (q)}}, \end{aligned}

where the first sum runs over all permutations σ of $$\{1, \ldots , q\}$$. Then $$\tilde{f}$$ is an element of $$\mathcal{H}^{\odot q}$$. We also make use of the concept of contraction. The rth contraction of two tensor products $$e_{j_{1}} \otimes \cdots \otimes e_{j_{p}}$$ and $$e_{k_{1}} \otimes \cdots \otimes e_{k_{q}}$$ is an element of $$\mathcal{H}^{\otimes (p + q - 2r)}$$ given by

\begin{aligned} &(e_{j_{1}} \otimes \cdots \otimes e_{j_{p}}) \otimes _{r} (e_{k_{1}} \otimes \cdots \otimes e_{k_{q}}) \\ & \quad = \Biggl[ \prod_{\ell =1}^{r} \langle e_{j_{\ell }}, e_{k_{ \ell }} \rangle _{\mathcal{H}} \Biggr] e_{j_{r+1}} \otimes \cdots \otimes e_{j_{p}} \otimes e_{k_{r+1}} \otimes \cdots \otimes e_{k_{q}}. \end{aligned}
(2.3)

Isometry property of integrals [16, Proposition 2.7.5] Fix integers $$p, q \geq 1$$ as well as $$f \in \mathcal{H}^{\odot p}$$ and $$g \in \mathcal{H}^{\odot q}$$. Then

\begin{aligned} E \bigl[ I_{p}(f) I_{q}(g) \bigr] = \textstyle\begin{cases} p! \langle f, g \rangle _{\mathcal{H}^{\otimes p}} & \text{if } p = q, \\ 0 & \text{otherwise.} \end{cases}\displaystyle \end{aligned}
(2.4)

Product formula [16, Proposition 2.7.10] Let $$p,q \geq 1$$. If $$f \in \mathcal{H}^{\odot p}$$ and $$g \in \mathcal{H}^{\odot q}$$, then

\begin{aligned} I_{p}(f) I_{q}(g) = \sum _{r = 0}^{p \wedge q} r! {\binom{p}{r}} { \binom{q}{r}} I_{p + q -2r}(f \widetilde{\otimes}_{r} g). \end{aligned}
(2.5)

Hypercontractivity in Wiener Chaos. For every $$q\geq 1$$, $${\mathcal{H}}_{q}$$ denotes the qth Wiener chaos of W, defined as the closed linear subspace of $$L^{2}(\Omega )$$ generated by the random variables $$\{H_{q}(W(h)),h\in {{\mathcal{H}}},\Vert h\Vert _{{\mathcal{H}}}=1\}$$, where $$H_{q}$$ is the qth Hermite polynomial. For any $$F \in \oplus _{l=1}^{q}{\mathcal{H}}_{l}$$ (i.e., in a fixed sum of Wiener chaoses), we have

$$\bigl( E \bigl[ \vert F \vert ^{p} \bigr] \bigr) ^{1/p}\leqslant c_{p,q} \bigl( E \bigl[ \vert F \vert ^{2} \bigr] \bigr) ^{1/2}\quad \text{for any }p\geq 2.$$
(2.6)

It should be noted that the constants $$c_{p,q}$$ above are known with some precision when F is a single chaos term: indeed, by [16, Corollary 2.8.14], $$c_{p,q}= ( p-1 ) ^{q/2}$$.

Optimal fourth moment theorem. Let $$\mathcal{N}$$ denote the standard normal law. Let $$(X_{n})_{n\geq 1}$$ be a sequence of random variables with $$X_{n}\in {\mathcal{H}}_{q}$$, $$E [ X_{n} ] =0$$, and $$\operatorname{Var} [ X_{n} ] =1$$, and assume that $$X_{n}$$ converges in distribution to a normal law, which is equivalent to $$\lim_{n}E [ X_{n}^{4} ] =3$$. Then we have the optimal estimate for the total variation distance $$d_{\mathrm{TV}} ( X_{n},\mathcal{N} )$$, known as the optimal fourth moment theorem. This optimal estimate also holds for the Wasserstein distance $$d_{W} ( X_{n},\mathcal{N} )$$, see [7, Remark 2.2], as follows: there exist two constants $$c,C>0$$ depending only on the sequence $$(X_{n})$$ but not on n, such that

$$c\max \bigl\{ E \bigl[ X_{n}^{4} \bigr] -3, \bigl\vert E \bigl[ X_{n}^{3} \bigr] \bigr\vert \bigr\} \leqslant d_{W} ( X_{n},\mathcal{N} ) \leqslant C\max \bigl\{ E \bigl[ X_{n}^{4} \bigr] -3, \bigl\vert E \bigl[ X_{n}^{3} \bigr] \bigr\vert \bigr\} .$$
(2.7)

Moreover, we recall that the third and fourth cumulants are, respectively,

\begin{aligned} &\kappa _{3}(X)=E \bigl[X^{3} \bigr]-3 E \bigl[X^{2} \bigr] E[X]+2 E[X]^{3}, \\ &\kappa _{4}(X)=E \bigl[X^{4} \bigr]-4 E[X] E \bigl[X^{3} \bigr]-3 E \bigl[X^{2} \bigr]^{2}+12 E[X]^{2} E \bigl[X^{2} \bigr]-6 E[X]^{4}. \end{aligned}

In particular, when $$E[X]=0$$, we have that

$$\kappa _{3}(X)=E \bigl[X^{3} \bigr]\quad \text{and}\quad \kappa _{4}(X)= E \bigl[X^{4} \bigr]-3 E \bigl[X^{2} \bigr]^{2}.$$
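These cumulant formulas are easy to check empirically: after centering a sample, $$\kappa _{3}$$ and $$\kappa _{4}$$ reduce to the moment combinations above, they vanish for Gaussian data, and they are nonzero on the second chaos. The following sketch is our own illustration (the helper names are ours):

```python
import numpy as np

def kappa3(x):
    x = x - x.mean()          # center, so kappa_3 = E[X^3]
    return np.mean(x**3)

def kappa4(x):
    x = x - x.mean()          # center, so kappa_4 = E[X^4] - 3 E[X^2]^2
    return np.mean(x**4) - 3.0 * np.mean(x**2) ** 2

rng = np.random.default_rng(2)
g = rng.standard_normal(1_000_000)
h2 = (g**2 - 1.0) / np.sqrt(2.0)   # standardized second-chaos variable (cf. (2.2))
print(kappa3(g), kappa4(g))        # both near 0: Gaussian cumulants vanish
print(kappa3(h2), kappa4(h2))      # near 2*sqrt(2) and 12: h2 is not Gaussian
```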

If $${g\in \mathcal{H}^{\otimes 2}}$$, then the third and fourth cumulants for $$I_{2}(g)$$ satisfy the following (see (6.2) and (6.6) in , respectively),

\begin{aligned} k_{3}\bigl(I_{2}(g)\bigr) =E\bigl[\bigl(I_{2}(g) \bigr)^{3}\bigr] =8\langle g,g\otimes _{1} g\rangle _{ \mathcal{H}^{\otimes 2}}, \end{aligned}
(2.8)

and

\begin{aligned} \bigl\vert k_{4}\bigl(I_{2}(g)\bigr) \bigr\vert =&16 \bigl( \Vert g\otimes _{1} g \Vert _{ \mathcal{H}^{\otimes 2}}^{2}+2 \Vert g\widetilde{\otimes _{1}} g \Vert _{ \mathcal{H}^{\otimes 2}}^{2} \bigr) \\ \leq &48 \Vert g\otimes _{1} g \Vert _{\mathcal{H}^{\otimes 2}}^{2}. \end{aligned}
(2.9)

### Lemma 2.1


Fix an integer $$M \geq 2$$. We have

$$\sum_{ \vert k_{j} \vert \leq n \atop 1 \leq j \leq M} \bigl\vert \rho ( \mathbf{k} \cdot \mathbf{v}) \bigr\vert \prod_{j=1}^{M} \bigl\vert \rho (k_{j} ) \bigr\vert \leq C \biggl(\sum _{ \vert k \vert \leq n} \bigl\vert \rho (k) \bigr\vert ^{1+ \frac{1}{M}} \biggr)^{M},$$

where $$\mathbf{k}= (k_{1}, \ldots , k_{M} )$$, and $$\mathbf{v} \in \mathbb{R}^{M}$$ is a fixed vector whose components are 1 or −1.

Throughout the paper $$\mathcal{N}$$ denotes a standard normal random variable. Also, C denotes a generic positive constant (perhaps depending on θ but not on anything else), which may change from line to line.

## 3 Approximate minimum contrast estimator

In this section, we prove the consistency and provide upper bounds in the Wasserstein distance for the rate of normal convergence of an approximate minimum contrast estimator of the drift parameter θ of the Ornstein-Uhlenbeck process $$X :=\{X_{t},t\geq 0 \}$$ driven by Brownian motion $$\{W_{t},t\geq 0 \}$$, defined as the solution of the linear stochastic differential equation

$$X_{0}=0;\qquad dX_{t}=-\theta X_{t}\,dt+dW_{t}, \quad t\geq 0,$$
(3.1)

where $$\theta >0$$ is an unknown parameter. Since (3.1) is linear, its solution can be expressed explicitly as

\begin{aligned} X_{t}= \int _{0}^{t}e^{-\theta (t-s)}\,dW_{s}. \end{aligned}
(3.2)

Moreover,

$$Z_{t}= \int _{-\infty }^{t}e^{-\theta (t-s)}\,dW_{s}$$
(3.3)

is a stationary Gaussian process, see [5, 9].

Furthermore,

$$X_{t}=Z_{t}-e^{-\theta t}Z_{0}.$$
(3.4)

Since $$Z :=\{Z_{t}, t\geq 0 \}$$ is a continuous centered stationary Gaussian process, it can be represented as a Wiener-Itô (multiple) integral $$Z_{t} \overset{d}{=} I_{1}(\mathbf{1}_{[0, t]})$$ for every $$t\geq 0$$, according to (2.1). Let $$\rho (r)=E(Z_{r}Z_{0})$$ denote the covariance function of Z. It is easy to show that

$$\rho (t)=E(Z_{t}Z_{0})=\frac{e^{-\theta |t|}}{2\theta},\quad t \in \mathbb{R}.$$

In particular, $$\rho (0)=\frac{1}{2\theta}$$. Moreover, notice that $$\rho (r)=\rho (-r)$$ for all $$r\in \mathbb{R}$$.
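The exponential covariance can be confirmed by simulating the stationary process Z with $$Z_{0}\sim \mathcal{N}(0,1/(2\theta ))$$ and comparing empirical lagged products with $$e^{-\theta t}/(2\theta )$$. The following sketch is our own illustration; the grid size and lag are arbitrary choices:

```python
import numpy as np

theta, dt, n = 1.0, 0.01, 200_000   # observation window T = 2000
rng = np.random.default_rng(3)
a = np.exp(-theta * dt)
sd = np.sqrt((1 - a**2) / (2 * theta))

Z = np.empty(n)
Z[0] = rng.standard_normal() / np.sqrt(2 * theta)  # stationary start, Var = 1/(2*theta)
for i in range(1, n):
    Z[i] = a * Z[i - 1] + sd * rng.standard_normal()

lag = 100                                   # time lag t = lag*dt = 1.0
emp = np.mean(Z[:-lag] * Z[lag:])           # empirical E[Z_t Z_0]
thy = np.exp(-theta * lag * dt) / (2 * theta)
print(emp, thy)                             # both near e^{-1}/2
```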

Our goal is to estimate θ based on discrete observations of X, using the approximate minimum contrast estimator:

\begin{aligned} \begin{gathered} \widetilde{\theta}_{n}:= \frac{1}{2 (\frac{1}{n} \sum_{i=1}^{n}X_{t_{i}}^{2} )}= \frac{1}{2f_{n} (X )}=g\bigl(f_{n} (X )\bigr), \quad n \geq 1, \end{gathered} \end{aligned}
(3.5)

where $$g(x):=\frac{1}{2x}$$, $$t_{i}=i \Delta _{n}$$, $$i=0, \ldots , n$$, $$\Delta _{n} \rightarrow 0$$ and $$T=n \Delta _{n}$$, whereas $$f_{n} (X ),\ n\geq 1$$, are given by

\begin{aligned} f_{n}(X) :=\frac{1}{n} \sum _{i =0}^{n-1} X_{t_{i}}^{2}. \end{aligned}
(3.6)

To analyze the estimator $$\widetilde{\theta}_{n}$$ of θ based on high-frequency discrete observations of X, we first estimate the limiting variance $$\rho (0)=\frac{1}{2\theta}$$ by the estimator $$f_{n} (X )$$, given by (3.6).

Let us introduce

$$F_{n}(Z):=\sqrt{T} \biggl(f_{n}(Z)-\frac{1}{2\theta} \biggr), \quad \text{where } f_{n}(Z) :=\frac{1}{n} \sum_{i =0}^{n-1} Z_{t_{i}}^{2}.$$

According to (2.2), $$F_{n}(Z)$$ can be written as

\begin{aligned} F_{n}(Z) = \sqrt{\frac{\Delta _{n}}{n}} \sum _{i =0}^{n-1} I_{2}\bigl( \mathbf{1}_{[0, t_{i}]}^{\otimes 2}\bigr)=I_{2} \Biggl(\sqrt{ \frac{\Delta _{n}}{n}} \sum_{i =0}^{n-1} \mathbf{1}_{[0, t_{i}]}^{ \otimes 2} \Biggr)=: I_{2}( \varepsilon _{n}). \end{aligned}
(3.7)

We will make use of the following technical lemmas.

### Lemma 3.1

Let X and Z be the processes given in (3.2) and (3.3), respectively. Then there exists $$C>0$$ depending only on θ such that for every $$p \geqslant 1$$ and for all $$n \in \mathbb{N}$$,

$$\bigl\Vert F_{n}(X)-F_{n}(Z) \bigr\Vert _{L^{p}(\Omega )}\leq \frac{C}{n\Delta _{n}}.$$
(3.8)

### Proof

By (3.4), we can write

\begin{aligned} \bigl\Vert F_{n}(X)-F_{n}(Z) \bigr\Vert _{L^{p}(\Omega )}\leq \frac{1}{n} \sum_{i =0}^{n-1} \bigl\Vert e^{-2\theta t_{i}}Z_{0}^{2}-2e^{-\theta t_{i}}Z_{0}Z_{t_{i}} \bigr\Vert _{L^{p}(\Omega )}. \end{aligned}

Combining this and the fact that Z is a stationary Gaussian process, we deduce

\begin{aligned} \bigl\Vert F_{n}(X)-F_{n}(Z) \bigr\Vert _{L^{p}(\Omega )} \leq & \frac{C}{n} \sum _{i =0}^{n-1} e^{-\theta t_{i}} \\ =& \frac{C}{n} \frac{1-e^{-n\theta \Delta _{n}}}{1-e^{-\theta \Delta _{n}}} \\ \leq & \frac{C}{n\Delta _{n}}, \end{aligned}

where we used $$\frac{\Delta _{n}}{1-e^{-\theta \Delta _{n}}}\rightarrow \frac{1}{\theta}$$ as $$n\rightarrow \infty$$. Thus, the desired result is obtained. □

### Lemma 3.2

There exists $$C>0$$ depending only on θ such that for large n

\begin{aligned} \biggl\vert E \bigl(F_{n}^{2}(Z) \bigr)- \frac{1}{2\theta ^{3}} \biggr\vert & \leq C \biggl(\Delta _{n}^{2}+ \frac{1}{n\Delta _{n}} \biggr). \end{aligned}
(3.9)

Consequently, using (3.8), for large n

\begin{aligned} \biggl\vert E \bigl(F_{n}^{2}(X) \bigr)- \frac{1}{2\theta ^{3}} \biggr\vert & \leq C \biggl(\Delta _{n}^{2}+ \frac{1}{n\Delta _{n}} \biggr). \end{aligned}
(3.10)

### Proof

Using the well-known Wick formula, we have

\begin{aligned} E \bigl(Z_{t}^{2}Z_{s}^{2} \bigr)=E \bigl(Z_{t}^{2} \bigr)E \bigl(Z_{s}^{2} \bigr)+2 \bigl(E (Z_{t}Z_{s} ) \bigr)^{2} = \rho ^{2}(0)+2 \rho ^{2}(t-s). \end{aligned}
(3.11)

This implies

\begin{aligned} E \bigl(F_{n}^{2}(Z) \bigr)&=T \biggl[Ef_{n}^{2}(Z)-2 \frac{1}{2\theta}Ef_{n}(Z)+\rho ^{2}(0) \biggr] \\ &=T \bigl[Ef_{n}^{2}(Z)-\rho ^{2}(0) \bigr] \\ &= T \Biggl[\frac{1}{n^{2}} \sum_{i,j =0}^{n-1} E \bigl(Z_{t_{i}}^{2}Z_{t_{j}}^{2} \bigr)-\rho ^{2}(0) \Biggr] \\ &= T \Biggl[\frac{2}{n^{2}} \sum_{i,j =0}^{n-1} \rho ^{2} (t_{j}-t_{i} ) \Biggr] \\ &= \frac{2\Delta _{n}}{n} \sum_{i,j =0}^{n-1} \rho ^{2} \bigl((j-i) \Delta _{n} \bigr)= \frac{2\Delta _{n}}{n} \sum_{i,j =0}^{n-1} \frac{e^{-2\theta |j-i|\Delta _{n}}}{(2\theta )^{2}} \\ &=\frac{\Delta _{n}}{2\theta ^{2}}+ \frac{\Delta _{n}}{\theta ^{2}n} \sum _{0\leq i< j\leq n-1} e^{-2\theta (j-i)\Delta _{n}} \\ &=\frac{\Delta _{n}}{2\theta ^{2}}+ \frac{\Delta _{n}}{\theta ^{2}n} \sum _{k=1}^{n-1} (n-k)e^{-2k \Delta _{n}\theta} \\ &=\frac{-\Delta _{n}}{2\theta ^{2}}+ \frac{\Delta _{n}}{\theta ^{2}} \sum _{k=0}^{n-1} e^{-2k \Delta _{n}\theta }- \frac{\Delta _{n}}{\theta ^{2}n} \sum_{k=1}^{n-1} ke^{-2k \Delta _{n} \theta }. \end{aligned}
(3.12)

Further,

\begin{aligned} &\frac{-\Delta _{n}}{2\theta ^{2}}+ \frac{\Delta _{n}}{\theta ^{2}} \sum_{k=0}^{n-1} e^{-2k \Delta _{n}\theta } \\ &\quad= \frac{-\Delta _{n}}{2\theta ^{2}}+\frac{\Delta _{n}}{\theta ^{2}} \frac{1-e^{-2n\theta \Delta _{n}}}{1-e^{-2\theta \Delta _{n}}} \\ &\quad=\frac{-\Delta _{n}}{2\theta ^{2}}+\frac{1}{\theta ^{2}} \frac{\Delta _{n}}{1-e^{-2\theta \Delta _{n}}} - \frac{1}{\theta ^{2}} \frac{\Delta _{n}}{1-e^{-2\theta \Delta _{n}}}e^{-2n\theta \Delta _{n}} \\ &\quad=\frac{-\Delta _{n}}{2\theta ^{2}}+\frac{1}{\theta ^{2}} \frac{1}{2\theta (1-\theta \Delta _{n} +o(\Delta _{n}))} - \frac{1}{\theta ^{2}} \frac{\Delta _{n}}{1-e^{-2\theta \Delta _{n}}}e^{-2\theta n\Delta _{n}} \\ &\quad=\frac{-\Delta _{n}}{2\theta ^{2}}+\frac{1}{2\theta ^{3}} \bigl(1+ \theta \Delta _{n} +\theta ^{2}\Delta _{n}^{2}+o \bigl(\Delta _{n}^{2}\bigr) \bigr) -\frac{1}{\theta ^{2}} \frac{\Delta _{n}}{1-e^{-2\theta \Delta _{n}}}e^{-2\theta n\Delta _{n}} \\ &\quad=\frac{1}{2\theta ^{3}} \bigl(1+\theta ^{2}\Delta _{n}^{2}+o\bigl(\Delta _{n}^{2} \bigr) \bigr) -\frac{1}{\theta ^{2}} \frac{\Delta _{n}}{1-e^{-2\theta \Delta _{n}}}e^{-2\theta n\Delta _{n}}. \end{aligned}
(3.13)

Moreover,

\begin{aligned} \frac{\Delta _{n}}{\theta ^{2}n}\sum_{k=1}^{n-1} ke^{-2k \Delta _{n} \theta} = \frac{1}{\theta ^{2}n\Delta _{n}}\sum_{k=1}^{n-1} (k \Delta _{n})e^{-2k \Delta _{n}\theta}\Delta _{n}, \end{aligned}
(3.14)

and as $$n\rightarrow \infty$$

\begin{aligned} \sum_{k=1}^{n-1} (k\Delta _{n})e^{-2k \Delta _{n}\theta}\Delta _{n} \longrightarrow \int _{0}^{\infty}xe^{-2\theta x}\,dx= \frac{1}{4\theta ^{2}}< \infty . \end{aligned}

Combining (3.12), (3.13), and (3.14) and $$\frac{\Delta _{n}}{1-e^{-2\theta \Delta _{n}}}\rightarrow \frac{1}{2\theta}$$, there exists $$C>0$$ depending only on θ such that for large n

\begin{aligned} \biggl\vert E \bigl(F_{n}^{2}(Z) \bigr)- \frac{1}{2\theta ^{3}} \biggr\vert & \leq C \biggl(\Delta _{n}^{2}+e^{-2\theta n\Delta _{n}}+ \frac{1}{n\Delta _{n}} \biggr) \\ &\leq C \biggl(\Delta _{n}^{2}+\frac{1}{n\Delta _{n}} \biggr). \end{aligned}

Therefore, the desired result is obtained. □
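Lemma 3.2 can also be checked numerically: the exact value of $$E (F_{n}^{2}(Z) )$$ is the double sum in (3.12), which collapses to a single sum over lags. The sketch below, our own illustration, evaluates it and compares the error against the limit $$\frac{1}{2\theta ^{3}}$$ with the bound $$\Delta _{n}^{2}+\frac{1}{n\Delta _{n}}$$ of (3.9):

```python
import numpy as np

def EF2(theta, n, dt):
    # E[F_n(Z)^2] = (2*dt/n) * sum_{i,j=0}^{n-1} rho((j-i)*dt)^2,
    # with rho(t) = e^{-theta|t|}/(2*theta); group terms by the lag k = j - i.
    k = np.arange(-(n - 1), n)
    counts = n - np.abs(k)                  # number of pairs (i, j) with j - i = k
    rho2 = np.exp(-2.0 * theta * np.abs(k) * dt) / (2.0 * theta) ** 2
    return (2.0 * dt / n) * np.sum(counts * rho2)

theta = 1.0
limit = 1.0 / (2.0 * theta**3)
grids = [(10**3, 0.1), (10**4, 0.01), (10**5, 0.005)]
errs = [abs(EF2(theta, n, dt) - limit) for n, dt in grids]
print(errs)  # each error stays below dt**2 + 1/(n*dt) on these grids
```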

### Lemma 3.3

There exists $$C>0$$ depending only on θ such that for large n,

\begin{aligned}& \bigl\vert k_{3}\bigl(F_{n}(Z)\bigr) \bigr\vert \leq \frac{C}{ (n\Delta _{n} )^{3/2}}, \end{aligned}
(3.15)
\begin{aligned}& \bigl\vert k_{4}\bigl(F_{n}(Z)\bigr) \bigr\vert \leq C\frac{1}{n\Delta _{n}}. \end{aligned}
(3.16)

Consequently,

\begin{aligned} \max \bigl( \bigl\vert k_{3}\bigl(F_{n}(Z)\bigr) \bigr\vert , \bigl\vert k_{4}\bigl(F_{n}(Z)\bigr) \bigr\vert \bigr) \leq C\frac{1}{n\Delta _{n}}. \end{aligned}
(3.17)

### Proof

Using $$\mathbf{1}_{[0, s]}^{\otimes 2} \otimes _{1} \mathbf{1}_{[0, t]}^{ \otimes 2}= \langle \mathbf{1}_{[0, s]}, \mathbf{1}_{[0, t]} \rangle _{\mathcal{H}} \mathbf{1}_{[0, s]} \otimes \mathbf{1}_{[0, t]}=\rho (t-s) \mathbf{1}_{[0, s]} \otimes \mathbf{1}_{[0, t]}$$, we can write

\begin{aligned} \varepsilon _{n}\otimes _{1} \varepsilon _{n}=\frac{\Delta _{n}}{n} \sum_{i,j=0}^{n-1} \rho (t_{j}-t_{i})\mathbf{1}_{[0, t_{i}]} \otimes \mathbf{1}_{[0, t_{j}]}. \end{aligned}

Combining this with (2.8) and (3.7), we get

\begin{aligned} k_{3}\bigl(F_{n}(Z)\bigr)=k_{3} \bigl(I_{2}(\varepsilon _{n})\bigr) =&8\langle \varepsilon _{n},\varepsilon _{n}\otimes _{1} \varepsilon _{n}\rangle _{ \mathcal{H}^{\otimes 2}} \\ =&8\frac{\Delta _{n}^{3/2}}{n^{3/2}} \sum_{i,j,k=0}^{n-1} \rho (t_{j}-t_{i}) \rho (t_{i}-t_{k}) \rho (t_{k}-t_{j}) \\ =&8\frac{\Delta _{n}^{3/2}}{n^{3/2}} \sum_{i,j,k=0}^{n-1} \rho \bigl((j-i) \Delta _{n}\bigr)\rho \bigl((i-k)\Delta _{n}\bigr)\rho \bigl((k-j)\Delta _{n}\bigr) \\ \leq &8\frac{\Delta _{n}^{3/2}}{n^{3/2}} \sum_{|k_{i}|< n,i=1,2,3} \rho (k_{1}\Delta _{n})\rho (k_{2}\Delta _{n})\rho (k_{3}\Delta _{n}) \\ \leq &8\frac{\Delta _{n}^{3/2}}{n^{3/2}} \biggl(\sum_{|k|< n} \rho (k \Delta _{n}) \biggr)^{3}. \end{aligned}
(3.18)

On the other hand,

\begin{aligned} \sum_{|k|< n}\rho (k\Delta _{n}) =& \frac{1}{2\theta}\sum_{|k|< n}e^{- \theta |k|\Delta _{n}} \\ \leq &\frac{1}{\theta}\sum_{k=0}^{n-1}e^{-\theta k\Delta _{n}} \\ \leq & \frac{1-e^{-\theta n\Delta _{n}}}{\theta (1-e^{-\theta \Delta _{n}})} \\ \leq &\frac{C}{\Delta _{n}}. \end{aligned}
(3.19)

Combining (3.18) and (3.19) yields

\begin{aligned} k_{3}\bigl(F_{n}(Z)\bigr) \leq &\frac{C}{ (n\Delta _{n} )^{3/2}}, \end{aligned}

which implies (3.15).

Using (2.9) and (3.7), we get

\begin{aligned} \bigl\vert k_{4}\bigl(F_{n}(Z)\bigr) \bigr\vert \leq &48 \Vert \varepsilon _{n}\otimes _{1} \varepsilon _{n} \Vert _{\mathcal{H}^{\otimes 2}}^{2} \\ =&48\frac{\Delta _{n}^{2}}{n^{2}}\sum_{k_{1},k_{2},k_{3},k_{4}=0}^{n-1} \bigl\langle \mathbf{1}_{[0, t_{k_{1}}]}^{\otimes 2} \otimes _{1} \mathbf{1}_{[0, t_{k_{2}}]}^{\otimes 2}, \mathbf{1}_{[0, t_{k_{3}}]}^{ \otimes 2} \otimes _{1} \mathbf{1}_{[0, t_{k_{4}}]}^{\otimes 2} \bigr\rangle _{\mathcal{H}^{\otimes 2}} \\ =&48\frac{\Delta _{n}^{2}}{n^{2}}\sum_{k_{1},k_{2},k_{3},k_{4}=0}^{n-1} E[ Z_{t_{k_{1}}} Z_{t_{k_{2}}}]E[ Z_{t_{k_{3}}} Z_{t_{k_{4}}}]E[ Z_{t_{k_{1}}} Z_{t_{k_{3}}}]E[ Z_{t_{k_{2}}} Z_{t_{k_{4}}}] \\ =&48\frac{\Delta _{n}^{2}}{n^{2}}\sum_{k_{1},k_{2},k_{3},k_{4}=0}^{n-1} \rho (t_{k_{1}}-t_{k_{2}}) \rho (t_{k_{3}}-t_{k_{4}}) \rho (t_{k_{1}}-t_{k_{3}}) \rho (t_{k_{2}}-t_{k_{4}}), \end{aligned}

where we used

\begin{aligned} \mathbf{1}_{[0, s]}^{\otimes 2} \otimes _{1} \mathbf{1}_{[0, t]}^{ \otimes 2} &= \langle \mathbf{1}_{[0, s]}, \mathbf{1}_{[0, t]} \rangle _{\mathcal{H}}\mathbf{1}_{[0, s]} \otimes \mathbf{1}_{[0, t]} \\ &= E[ Z_{s} Z_{t}]\mathbf{1}_{[0, s]} \otimes \mathbf{1}_{[0, t]}. \end{aligned}
(3.20)

Furthermore,

\begin{aligned}& 48\frac{\Delta _{n}^{2}}{n^{2}}\sum_{k_{1},k_{2},k_{3},k_{4}=0}^{n-1} \rho (t_{k_{1}}-t_{k_{2}}) \rho (t_{k_{3}}-t_{k_{4}}) \rho (t_{k_{1}}-t_{k_{3}}) \rho (t_{k_{2}}-t_{k_{4}}) \\& \quad= 48\frac{\Delta _{n}^{2}}{n^{2}} \sum_{k_{1},k_{2},k_{3},k_{4}=0}^{n-1} \rho \bigl((k_{1}-k_{2})\Delta _{n}\bigr) \rho \bigl((k_{3}-k_{4})\Delta _{n}\bigr) \rho \bigl((k_{1}-k_{3})\Delta _{n}\bigr)\rho \bigl((k_{2}-k_{4})\Delta _{n}\bigr) \\& \quad= 48\frac{\Delta _{n}^{2}}{n} \sum_{ \underset{i=1,2,3}{ \vert j_{i} \vert < n}} \bigl\vert \rho (j_{1}\Delta _{n} ) \rho (j_{2} \Delta _{n} ) \rho (j_{3}\Delta _{n} ) \rho \bigl((j_{1}+j_{2}-j_{3}) \Delta _{n} \bigr) \bigr\vert \\& \quad\leq C\frac{\Delta _{n}^{2}}{n} \biggl(\sum_{ \vert k \vert < n} \bigl\vert \rho (k \Delta _{n}) \bigr\vert ^{\frac{4}{3}} \biggr)^{3} \\& \quad\leq C\frac{1}{n\Delta _{n}} \biggl(\Delta _{n}\sum _{ \vert k \vert < n} \bigl\vert \rho (k \Delta _{n}) \bigr\vert ^{\frac{4}{3}} \biggr)^{3} \\& \quad\leq C\frac{1}{n\Delta _{n}}, \end{aligned}
(3.21)

where we used the change of variables $$k_{1}-k_{2}=j_{1}$$, $$k_{2}-k_{4}=j_{2}$$, and $$k_{3}-k_{4}=j_{3}$$, and then applied the Brascamp-Lieb inequality given by Lemma 2.1. Therefore, the proof of (3.16) is complete. □

### Theorem 3.4

There exists $$C>0$$ depending only on θ such that for all $$n\geq 1$$,

\begin{aligned} d_{W} \bigl(\sqrt{2}\theta ^{3/2}F_{n}(X), \mathcal{N} \bigr) \leq & C \biggl(\Delta _{n}^{2}+ \frac{1}{n\Delta _{n}} \biggr). \end{aligned}

### Proof

Using (3.8) and (3.9), we obtain

\begin{aligned}& d_{W} \bigl(\sqrt{2}\theta ^{3/2}F_{n}(X), \mathcal{N} \bigr) \\& \quad\leq d_{W} \bigl(\sqrt{2}\theta ^{3/2}F_{n}(Z), \mathcal{N} \bigr)+ \bigl\Vert F_{n}(X)-F_{n}(Z) \bigr\Vert _{L^{2}(\Omega )} \\& \quad\leq d_{W} \biggl(\frac{F_{n}(Z)}{\sqrt{E(F_{n}^{2}(Z))}}, \mathcal{N} \biggr) +{E} \biggl\vert \frac{\sqrt{2}\theta ^{3/2}F_{n}(Z)}{\sqrt{E(F_{n}^{2}(Z))}} \biggl( \frac{1}{\sqrt{2}\theta ^{3/2}}-\sqrt{E \bigl(F_{n}^{2}(Z)\bigr)} \biggr) \biggr\vert + \frac{C}{n\Delta _{n}} \\& \quad\leq d_{W} \biggl(\frac{F_{n}(Z)}{\sqrt{E(F_{n}^{2}(Z))}}, \mathcal{N} \biggr) + \sqrt{2}\theta ^{3/2} \frac{ \vert E (F_{n}^{2}(Z) )-\frac{1}{2\theta ^{3}} \vert }{ \vert \frac{1}{\sqrt{2}\theta ^{3/2}}+\sqrt{E(F_{n}^{2}(Z))} \vert } +\frac{C}{n\Delta _{n}} \\& \quad\leq d_{W} \biggl(\frac{F_{n}(Z)}{\sqrt{E(F_{n}^{2}(Z))}}, \mathcal{N} \biggr) +C \biggl\vert E \bigl(F_{n}^{2}(Z) \bigr)- \frac{1}{2\theta ^{3}} \biggr\vert +\frac{C}{n\Delta _{n}} \\& \quad\leq d_{W} \biggl(\frac{F_{n}(Z)}{\sqrt{E(F_{n}^{2}(Z))}}, \mathcal{N} \biggr) +C \biggl(\Delta _{n}^{2}+\frac{1}{n\Delta _{n}} \biggr) \\& \quad\leq C \biggl(\Delta _{n}^{2}+\frac{1}{n\Delta _{n}} \biggr), \end{aligned}

where the latter inequality comes from (2.7) and (3.17). □
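As a sanity check on Theorem 3.4, one can simulate many independent paths of X and verify that $$\sqrt{2}\theta ^{3/2}F_{n}(X)$$ is approximately standard normal. The following Monte Carlo sketch is our own illustration; the sample sizes are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(5)
theta, dt, n, m = 1.0, 0.05, 2000, 2000   # m paths, T = n*dt = 100
a = np.exp(-theta * dt)
sd = np.sqrt((1 - a**2) / (2 * theta))

X = np.zeros(m)                  # X_0 = 0 on every path
sumsq = np.zeros(m)              # accumulates sum of X_{t_i}^2 for i = 0..n-1
for _ in range(n):
    sumsq += X**2
    X = a * X + sd * rng.standard_normal(m)

T = n * dt
F = np.sqrt(T) * (sumsq / n - 1.0 / (2.0 * theta))  # F_n(X)
S = np.sqrt(2.0) * theta**1.5 * F
print(S.mean(), S.var())         # approximately 0 and 1
```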

### Theorem 3.5

Suppose $$\Delta _{n}\rightarrow 0$$ and $$T\rightarrow \infty$$. Then, the estimator $$\widetilde{\theta}_{n}$$ of θ is weakly consistent, that is, $$\widetilde{\theta}_{n}\rightarrow \theta$$ in probability, as $$\Delta _{n}\rightarrow 0$$ and $$T\rightarrow \infty$$.

If, moreover, $$n \Delta _{n}^{\eta }\rightarrow 0$$ for some $$1<\eta <2$$ or $$n \Delta _{n}^{\eta }\rightarrow \infty$$ for some $$\eta >1$$, then $$\widetilde{\theta}_{n}$$ is strongly consistent, that is, $$\widetilde{\theta}_{n}\rightarrow \theta$$ almost surely.

### Proof

Using (3.5), it is sufficient to prove that the results of the theorem are satisfied for the estimator $$f_{n}(X)$$ of $$\frac{1}{2\theta}$$.

The weak consistency of $$f_{n}(X)$$ is an immediate consequence of (3.10).

If $$n \Delta _{n}^{\eta }\rightarrow 0$$ for some $$1<\eta <2$$, the strong consistency of $$f_{n}(X)$$ has been proved by [10, Theorem 11].

Now, suppose that $$n \Delta _{n}^{\eta }\rightarrow \infty$$ for some $$\eta >1$$. It follows from (3.10) that

$$E \biggl[ \biggl(f_{n}(X)-\frac{1}{2\theta} \biggr)^{2} \biggr] \leq \frac{C}{n\Delta _{n}} \leq \frac{C}{n^{1- 1 / \eta } (n\Delta _{n}^{\eta } )^{1/ \eta }} \leq \frac{C}{n^{1- 1 / \eta }}.$$

Combining this with the hypercontractivity property (2.6) and [13, Lemma 2.1], which is a well-known direct consequence of the Borel-Cantelli Lemma, we obtain $$f_{n}(X)\rightarrow \frac{1}{2\theta}$$ almost surely. □

### Theorem 3.6

There exists $$C>0$$ depending only on θ such that for all $$n\geq 1$$,

\begin{aligned} d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\widetilde{ \theta}_{n}- \theta ), \mathcal{N} \biggr) \leq C \biggl(\Delta _{n}^{2}+ \frac{1}{\sqrt{n\Delta _{n}}} \biggr). \end{aligned}
(3.22)

### Proof

Recall that by definition $$\theta =g (\frac{1}{2\theta} )$$. We have

$$(\widetilde{\theta}_{n}-\theta )= \biggl(g\bigl(f_{n} (X )\bigr)-g \biggl(\frac{1}{2\theta} \biggr) \biggr)=g^{\prime} \biggl( \frac{1}{2\theta} \biggr) \biggl(f_{n} (X )- \frac{1}{2\theta} \biggr)+\frac{1}{2} g^{\prime \prime} (\zeta _{n} ) \biggl(f_{n} (X )-\frac{1}{2\theta} \biggr)^{2}$$

for some random point $$\zeta _{n}$$ between $$f_{n} (X )$$ and $$\frac{1}{2\theta}$$.

Thus, we can write

$$\sqrt{\frac{T}{2\theta}} (\widetilde{\theta}_{n}-\theta )=- \sqrt{2}\theta ^{3/2}F_{n} (X )+ \frac{1}{2^{3/2}\sqrt{\theta T}\zeta _{n}^{3}} \bigl(F_{n} (X ) \bigr)^{2}.$$

Therefore,

\begin{aligned} d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\widetilde{ \theta}_{n}- \theta ), \mathcal{N} \biggr) \leq \frac{1}{2^{3/2}\sqrt{\theta T}} {E} \biggl\vert \frac{1}{\zeta _{n}^{3}} \bigl(F_{n} (X ) \bigr)^{2} \biggr\vert +d_{W} \bigl(\sqrt{2} \theta ^{3/2}F_{n} (X ), \mathcal{N} \bigr), \end{aligned}
(3.23)

where we have used that $$d_{W} (x_{1}+x_{2}, y ) \leq {E} [ \vert x_{2} \vert ]+d_{W} (x_{1}, y )$$ for any random variables $$x_{1}$$, $$x_{2}$$, y.

The second term in the inequality above is bounded in Theorem 3.4. By Hölder’s inequality and the hypercontractivity property (2.6), for $$p, q>1$$ with $$1/p+1/q=1$$ we have

\begin{aligned} {E} \biggl\vert \frac{1}{\zeta _{n}^{3}} \bigl(F_{n} (X ) \bigr)^{2} \biggr\vert \leq & \biggl({E} \biggl\vert \frac{1}{\zeta _{n}^{3}} \biggr\vert ^{p} \biggr)^{1 / p} \bigl({E} \bigl\vert F_{n} (X ) \bigr\vert ^{2 q} \bigr)^{1 / q} \\ \leq & c_{p,q} \biggl({E} \biggl\vert \frac{1}{\zeta _{n}^{3}} \biggr\vert ^{p} \biggr)^{1/p} {E} \bigl\vert F_{n} (X ) \bigr\vert ^{2}, \\ \leq & C \biggl({E} \biggl\vert \frac{1}{\zeta _{n}^{3}} \biggr\vert ^{p} \biggr)^{1/p}, \end{aligned}
(3.24)

for some constant $$C>0$$ depending on p.

Consequently, using (3.23), (3.24), and Theorem 3.4, we deduce that for every $$p>1$$

$$d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\widetilde{ \theta}_{n}- \theta ), \mathcal{N} \biggr) \leq \frac{C}{\sqrt{n\Delta _{n}}} \biggl({E} \biggl\vert \frac{1}{\zeta _{n}^{3}} \biggr\vert ^{p} \biggr)^{1/p} + C \biggl(\Delta _{n}^{2}+ \frac{1}{n\Delta _{n}} \biggr).$$

To establish (3.22), it remains to show that $${E} \vert \zeta _{n} \vert ^{-3p} < \infty$$ for some $$p > 1$$. By the monotonicity of $$x\mapsto x^{-3}$$ on $$(0,\infty )$$ and the fact that $$\zeta _{n}$$ lies between $$f_{n} (X )$$ and $$\frac{1}{2\theta}$$, it is enough to show that $$E |f_{n} (X )|^{-3p} < \infty$$ for some $$p > 1$$. This follows as an application of the technical result [7, Proposition 6.3]. □

## 4 Approximate maximum likelihood estimator

In this section, we study an approximate maximum likelihood estimator of θ based on discrete observations of X.

The maximum likelihood estimator for θ based on continuous observations of the process X given by (3.1) is defined by

\begin{aligned} \check{\theta}_{T}=- \frac{\int _{0}^{T} X_{s} \,\mathrm{d} X_{s}}{\int _{0}^{T} X_{s}^{2} \,\mathrm{d} s}, \quad T\geq 0. \end{aligned}
(4.1)

Here we want to study the asymptotic distribution of a discrete version of (4.1). To this end, we assume that the process X given in (3.1) is observed equidistantly in time with the step size $$\Delta _{n}$$: $$t_{i}=i \Delta _{n}$$, $$i=0, \ldots , n$$, and $$T=n \Delta _{n}$$ denotes the length of the “observation window”. Let us consider the following discrete version of $$\check{\theta}_{T}$$:

$$\widehat{\theta}_{n}=- \frac{\sum_{i=1}^{n} X_{t_{i-1}} (X_{t_{i}}-X_{t_{i-1}} )}{\Delta _{n} \sum_{i=1}^{n} X_{t_{i-1}}^{2}},\quad n\geq 1.$$

Note that Dorogovcev [6] and Kasonga [11], respectively, proved the weak and strong consistency of the estimator $$\widehat{\theta}_{n}$$ as $$T \rightarrow \infty$$ and $$\Delta _{n} \rightarrow 0$$.
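As a numerical sanity check (ours, not from the paper), the sketch below evaluates $$\widehat{\theta}_{n}$$ on an exactly simulated OU path; all parameter values are illustrative.

```python
import numpy as np

# Sketch (ours): the discretized MLE
#   theta_hat = - sum X_{t_{i-1}} (X_{t_i} - X_{t_{i-1}}) / (dt * sum X_{t_{i-1}}^2)
# evaluated on an exactly simulated OU path.
rng = np.random.default_rng(1)
theta, n, dt = 1.0, 20000, 0.01   # T = n * dt = 200 (our choices)

a = np.exp(-theta * dt)                      # exact AR(1) coefficient
sd = np.sqrt((1.0 - a**2) / (2.0 * theta))   # exact transition std
x = np.zeros(n + 1)                          # X_0 = 0
for i in range(n):
    x[i + 1] = a * x[i] + sd * rng.standard_normal()

theta_hat = -np.sum(x[:-1] * np.diff(x)) / (dt * np.sum(x[:-1] ** 2))
print(theta_hat)
```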

Let X be the process given by (3.1), and let us introduce the following sequences

$$S_{n}:=\Delta _{n} \sum_{i=1}^{n} X_{t_{i-1}}^{2},$$

and

$$\Lambda _{n}:=\sum_{i=1}^{n} e^{-\theta t_{i}}X_{t_{i-1}} ( \zeta _{t_{i}}-\zeta _{t_{i-1}} )=\sum_{i=1}^{n} e^{-\theta (t_{i}+t_{i-1})} \zeta _{t_{i-1}} (\zeta _{t_{i}}-\zeta _{t_{i-1}} ),$$

where

$$\zeta _{t}= \int _{0}^{t}e^{\theta s}\,dW_{s}.$$

Thus,

$$-\widehat{\theta}_{n}=\frac{e^{-\theta \Delta _{n}}-1}{\Delta _{n}}+ \frac{\Lambda _{n}}{S_{n}}.$$

Therefore,

\begin{aligned} \sqrt{T} (\theta -\widehat{\theta}_{n} ) =&\sqrt{T} \biggl( \frac{e^{-\theta \Delta _{n}}-1}{\Delta _{n}}+\theta \biggr)+ \frac{\frac{1}{\sqrt{T}} \Lambda _{n}}{\frac{1}{T} S_{n}} \\ =&\sqrt{T} \biggl(\frac{e^{-\theta \Delta _{n}}-1}{\Delta _{n}}+ \theta \biggr)+\frac{\frac{1}{\sqrt{T}} \Lambda _{n}}{f_{n}(X)} \\ =&\sqrt{T} \biggl(\frac{\theta ^{2}}{2}\Delta _{n}+o(\Delta _{n}) \biggr)+\frac{\frac{1}{\sqrt{T}} \Lambda _{n}}{f_{n}(X)} \\ =&\sqrt{n\Delta _{n}^{3}} \biggl(\frac{\theta ^{2}}{2}+o(1) \biggr)+ \frac{\frac{1}{\sqrt{T}} \Lambda _{n}}{f_{n}(X)}, \end{aligned}
(4.2)

where $$f_{n}(X)$$ is given by (3.6).
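The decomposition $$-\widehat{\theta}_{n}=\frac{e^{-\theta \Delta _{n}}-1}{\Delta _{n}}+\frac{\Lambda _{n}}{S_{n}}$$ is a pathwise algebraic identity once $$\zeta _{t_{i}}=e^{\theta t_{i}}X_{t_{i}}$$ is substituted, so it can be checked exactly on an arbitrary vector; the following sketch (ours, with arbitrary inputs) does so.

```python
import numpy as np

# Sanity check (ours): with zeta_{t_i} := e^{theta t_i} X_{t_i}, the identity
#   -theta_hat = (e^{-theta dt} - 1)/dt + Lambda_n / S_n
# holds path by path, for ANY sequence x with x[0] = 0.
rng = np.random.default_rng(2)
theta, n, dt = 0.7, 50, 0.1
t = dt * np.arange(n + 1)
x = np.concatenate(([0.0], rng.standard_normal(n)))   # arbitrary path, X_0 = 0

S_n = dt * np.sum(x[:-1] ** 2)
theta_hat = -np.sum(x[:-1] * np.diff(x)) / S_n

zeta = np.exp(theta * t) * x                          # zeta_t = e^{theta t} X_t
Lambda_n = np.sum(np.exp(-theta * t[1:]) * x[:-1] * np.diff(zeta))

lhs = -theta_hat
rhs = (np.exp(-theta * dt) - 1.0) / dt + Lambda_n / S_n
print(abs(lhs - rhs))   # zero up to floating-point error
```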

Next, since $$\zeta _{t_{i-1}}$$ and $$\zeta _{t_{i}}-\zeta _{t_{i-1}}$$ are independent, we have

\begin{aligned} E \biggl[ \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr)^{2} \biggr] =& \frac{1}{T}\sum _{i,j=1}^{n} e^{-\theta (t_{i}+t_{i-1}+t_{j}+t_{j-1})} E \bigl[\zeta _{t_{i-1}} (\zeta _{t_{i}}-\zeta _{t_{i-1}} ) \zeta _{t_{j-1}} (\zeta _{t_{j}}-\zeta _{t_{j-1}} ) \bigr] \\ =&\frac{1}{T}\sum_{i=1}^{n} e^{-2\theta (t_{i}+t_{i-1})} E \bigl[ \zeta _{t_{i-1}}^{2} (\zeta _{t_{i}}-\zeta _{t_{i-1}} )^{2} \bigr] \\ =&\frac{1}{T}\sum_{i=1}^{n} e^{-2\theta (t_{i}+t_{i-1})} E \bigl[ \zeta _{t_{i-1}}^{2} \bigr] E \bigl[ (\zeta _{t_{i}}-\zeta _{t_{i-1}} )^{2} \bigr] \\ =&\frac{1}{T}\sum_{i=1}^{n} e^{-2\theta (t_{i}+t_{i-1})} \biggl( \frac{e^{2\theta t_{i-1}}-1}{2\theta} \biggr) \biggl( \frac{e^{2\theta t_{i}}-e^{2\theta t_{i-1}}}{2\theta} \biggr) \\ =& \frac{ (1-e^{-2\theta \Delta _{n}} )}{(2\theta )^{2}\Delta _{n}} \frac{1}{n}\sum _{i=1}^{n} \bigl(1-e^{-2\theta t_{i-1}} \bigr) \\ =& \frac{ (1-e^{-2\theta \Delta _{n}} )}{(2\theta )^{2}\Delta _{n}} - \frac{ (1-e^{-2\theta \Delta _{n}} )}{(2\theta )^{2}\Delta _{n}} \biggl(\frac{1-e^{-2\theta T}}{n(1-e^{-2\theta \Delta _{n}})} \biggr). \end{aligned}
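The telescoping steps in the computation above can be verified numerically: the sketch below (ours, with arbitrary illustrative parameters) compares the term-by-term sum defining $$E [ (\frac{1}{\sqrt{T}}\Lambda _{n} )^{2} ]$$ with the closed form in the last line.

```python
import numpy as np

# Deterministic check (ours) of the closed form for E[(Lambda_n / sqrt(T))^2]:
# the term-by-term sum over i equals the final expression derived above.
theta, n, dt = 0.5, 200, 0.05
T = n * dt
t = dt * np.arange(n + 1)

# (1/T) * sum_i e^{-2 theta (t_i + t_{i-1})} E[zeta_{t_{i-1}}^2] E[(zeta_{t_i}-zeta_{t_{i-1}})^2]
terms = (np.exp(-2 * theta * (t[1:] + t[:-1]))
         * (np.exp(2 * theta * t[:-1]) - 1.0) / (2 * theta)
         * (np.exp(2 * theta * t[1:]) - np.exp(2 * theta * t[:-1])) / (2 * theta))
lhs = terms.sum() / T

# closed form: c - c * (1 - e^{-2 theta T}) / (n (1 - e^{-2 theta dt}))
c = (1.0 - np.exp(-2 * theta * dt)) / ((2 * theta) ** 2 * dt)
rhs = c - c * (1.0 - np.exp(-2 * theta * T)) / (n * (1.0 - np.exp(-2 * theta * dt)))
print(abs(lhs - rhs))   # zero up to floating-point error
```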

Moreover, since

$$\frac{ (1-e^{-2\theta \Delta _{n}} )}{(2\theta )^{2}\Delta _{n}}= \frac{1}{2\theta}-\frac{\Delta _{n}}{2}+o(\Delta _{n}),$$

there exists $$C>0$$ depending only on θ such that for large n

\begin{aligned} \biggl\vert E \biggl[ \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr)^{2} \biggr]-\frac{1}{2\theta} \biggr\vert &\leq C \biggl(\Delta _{n}+ \frac{1}{n\Delta _{n}} \biggr). \end{aligned}
(4.3)
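The Taylor expansion behind (4.3) can also be checked numerically: the remainder $$\frac{1-e^{-2\theta \Delta _{n}}}{(2\theta )^{2}\Delta _{n}}- (\frac{1}{2\theta}-\frac{\Delta _{n}}{2} )$$ is of order $$\Delta _{n}^{2}$$, so it should shrink by a factor of about 4 when the step is halved. A small sketch (ours, with an arbitrary θ):

```python
import numpy as np

# Numerical check (ours) of the expansion
#   (1 - e^{-2 theta d}) / ((2 theta)^2 d) = 1/(2 theta) - d/2 + O(d^2).
theta = 1.3

def remainder(d):
    lhs = (1.0 - np.exp(-2 * theta * d)) / ((2 * theta) ** 2 * d)
    return lhs - (1.0 / (2 * theta) - d / 2.0)

r1, r2 = remainder(1e-2), remainder(5e-3)
ratio = r1 / r2
print(ratio)   # close to 4, as expected for an O(d^2) remainder
```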

Using $$E[\Lambda _{n}]=0$$ and the fact that $$\zeta _{t_{i-1}}$$ and $$\zeta _{t_{i}}-\zeta _{t_{i-1}}$$ are independent, we get

\begin{aligned} \kappa _{3} \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr)=E \biggl[ \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr)^{3} \biggr]=0. \end{aligned}
(4.4)

On the other hand,

\begin{aligned}& E \biggl[ \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr)^{4} \biggr]\\& \quad= \frac{1}{T^{2}}\sum _{i,j,k,l=1}^{n} e^{-\theta (t_{i}+t_{i-1}+t_{j}+t_{j-1}+t_{k}+t_{k-1}+t_{l}+t_{l-1})} \\& \qquad{}\times E \bigl[\zeta _{t_{i-1}} (\zeta _{t_{i}}-\zeta _{t_{i-1}} )\zeta _{t_{j-1}} (\zeta _{t_{j}}-\zeta _{t_{j-1}} ) \zeta _{t_{k-1}} (\zeta _{t_{k}}-\zeta _{t_{k-1}} )\zeta _{t_{l-1}} (\zeta _{t_{l}}-\zeta _{t_{l-1}} ) \bigr] \\& \quad= \frac{1}{T^{2}}\sum_{i=1}^{n} e^{-4\theta (t_{i}+t_{i-1})} E \bigl[\zeta _{t_{i-1}}^{4} (\zeta _{t_{i}}-\zeta _{t_{i-1}} )^{4} \bigr] \\& \qquad{}+\frac{3}{T^{2}}\sum_{i=j\neq k=l}^{n} e^{-2\theta (t_{i}+t_{i-1}+t_{k}+t_{k-1})} E \bigl[\zeta _{t_{i-1}}^{2} (\zeta _{t_{i}}-\zeta _{t_{i-1}} )^{2}\zeta _{t_{k-1}}^{2} (\zeta _{t_{k}}-\zeta _{t_{k-1}} )^{2} \bigr] \\& \quad= \frac{6}{T^{2}}\sum_{i=1}^{n} e^{-4\theta (t_{i}+t_{i-1})} \bigl(E \bigl[\zeta _{t_{i-1}}^{2} \bigr] \bigr)^{2} \bigl(E \bigl[ ( \zeta _{t_{i}}-\zeta _{t_{i-1}} )^{2} \bigr] \bigr)^{2} \\& \qquad{}+3 \Biggl[\frac{1}{T}\sum_{i=1}^{n} e^{-2\theta (t_{i}+t_{i-1})} E \bigl[\zeta _{t_{i-1}}^{2} (\zeta _{t_{i}}-\zeta _{t_{i-1}} )^{2} \bigr] \Biggr]^{2} \\& \quad= \frac{6}{T^{2}}\sum_{i=1}^{n} e^{-4\theta (t_{i}+t_{i-1})} \bigl(E \bigl[\zeta _{t_{i-1}}^{2} \bigr] \bigr)^{2} \bigl(E \bigl[ ( \zeta _{t_{i}}-\zeta _{t_{i-1}} )^{2} \bigr] \bigr)^{2} +3 \biggl[E \biggl[ \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr)^{2} \biggr] \biggr]^{2}. \end{aligned}

This implies

\begin{aligned} \kappa _{4} \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr) =&E \biggl[ \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr)^{4} \biggr]-3 \biggl[E \biggl[ \biggl(\frac{1}{\sqrt{T}} \Lambda _{n} \biggr)^{2} \biggr] \biggr]^{2} \\ =&\frac{6}{T^{2}}\sum_{i=1}^{n} e^{-4\theta (t_{i}+t_{i-1})} \bigl(E \bigl[\zeta _{t_{i-1}}^{2} \bigr] \bigr)^{2} \bigl(E \bigl[ ( \zeta _{t_{i}}-\zeta _{t_{i-1}} )^{2} \bigr] \bigr)^{2} \\ =&\frac{6}{T^{2}}\sum_{i=1}^{n} e^{-4\theta (t_{i}+t_{i-1})} \biggl( \frac{e^{2\theta t_{i-1}}-1}{2\theta} \biggr)^{2} \biggl( \frac{e^{2\theta t_{i}}-e^{2\theta t_{i-1}}}{2\theta} \biggr)^{2} \\ =& \frac{6 (1-e^{-2\theta \Delta _{n}} )^{2}}{(2\theta )^{4}\Delta _{n}^{2}} \frac{1}{n^{2}}\sum _{i=1}^{n} \bigl(1-e^{-2\theta t_{i-1}} \bigr)^{2} \\ \leq & \frac{6 (1-e^{-2\theta \Delta _{n}} )^{2}}{(2\theta )^{4}\Delta _{n}^{2}} \frac{1}{n} \\ \leq &\frac{C}{n}, \end{aligned}
(4.5)

where the last inequality uses the fact that $${\frac{1-e^{-2\theta \Delta _{n}}}{\Delta _{n}}} \rightarrow 2\theta$$ as $$\Delta _{n}\rightarrow 0$$.

### Theorem 4.1

There exists a constant $$C>0$$ such that, for all $$n\geq 1$$,

\begin{aligned} d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\widehat{ \theta}_{n}- \theta ),\mathcal{N} \biggr) \leq &C \biggl( \frac{1}{\sqrt{n\Delta _{n}}}+\sqrt{n\Delta _{n}^{3}} \biggr). \end{aligned}
(4.6)

### Proof

Define $$G_{n}:=\frac{1}{\sqrt{T}} \Lambda _{n}$$. Using (2.7), (4.4), and (4.5), we have

\begin{aligned} d_{W} \biggl(\frac{G_{n}}{\sqrt{E(G_{n}^{2})}},\mathcal{N} \biggr) \leq \frac{C}{n}. \end{aligned}
(4.7)

Combining (4.7) with (4.2), (4.3), and (3.10), we obtain

\begin{aligned}& d_{W} \biggl(\sqrt{\frac{T}{2\theta}} (\widehat{ \theta}_{n}- \theta ),\mathcal{N} \biggr) \\& \quad\leq d_{W} \biggl( \frac{1}{\sqrt{2\theta}} \frac{G_{n}}{f_{n}(X)}, \mathcal{N} \biggr)+C\sqrt{n\Delta _{n}^{3}} \\& \quad\leq d_{W} \biggl(\frac{G_{n}}{\sqrt{E(G_{n}^{2})}},\mathcal{N} \biggr)+{E} \biggl\vert \frac{G_{n}}{\sqrt{E(G_{n}^{2})}f_{n}(X)} \biggl( \frac{\sqrt{E(G_{n}^{2})}}{\sqrt{2\theta}}-f_{n}(X) \biggr) \biggr\vert +C \sqrt{n\Delta _{n}^{3}} \\& \quad\leq d_{W} \biggl(\frac{G_{n}}{\sqrt{E(G_{n}^{2})}},\mathcal{N} \biggr)+ \biggl\Vert \frac{G_{n}}{\sqrt{E(G_{n}^{2})}} \biggr\Vert _{L^{4}( \Omega )} \biggl\Vert \frac{1}{f_{n}(X)} \biggr\Vert _{L^{4}(\Omega )} \biggl\Vert \frac{\sqrt{E(G_{n}^{2})}}{\sqrt{2\theta}}-f_{n}(X) \biggr\Vert _{L^{2}( \Omega )}\\& \qquad{}+C \sqrt{n\Delta _{n}^{3}} \\& \quad\leq C \biggl(\frac{1}{n}+\frac{1}{\sqrt{n\Delta _{n}}} \biggr)+C \sqrt{n \Delta _{n}^{3}} \\& \quad\leq C \biggl(\frac{1}{\sqrt{n\Delta _{n}}}+\sqrt{n\Delta _{n}^{3}} \biggr), \end{aligned}

where we used the fact that $$E |f_{n} (X )|^{-4} < \infty$$, which is a direct application of the technical result [7, Proposition 6.3]. The proof of (4.6) is thus complete. □
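To illustrate Theorem 4.1 (this Monte Carlo sketch is ours, not part of the paper), one can standardize the estimation errors as in (4.6) and check that they are approximately standard normal; all parameters, seeds, and sample sizes below are our choices, with $$n\Delta _{n}^{3}$$ kept small so the bias term is negligible.

```python
import numpy as np

# Monte Carlo sketch (ours): the standardized errors
#   sqrt(T / (2 theta)) * (theta_hat - theta)
# should be approximately N(0, 1) when dt -> 0 and n * dt^3 -> 0.
rng = np.random.default_rng(3)
theta, n, dt, reps = 1.0, 5000, 0.02, 400    # T = 100, n * dt^3 = 0.04
T = n * dt
a = np.exp(-theta * dt)
sd = np.sqrt((1.0 - a**2) / (2.0 * theta))

num = np.zeros(reps)   # accumulates sum X_{t_{i-1}} (X_{t_i} - X_{t_{i-1}})
den = np.zeros(reps)   # accumulates sum X_{t_{i-1}}^2
x = np.zeros(reps)     # all replications start at X_0 = 0
for i in range(n):     # exact OU recursion, vectorized across replications
    x_new = a * x + sd * rng.standard_normal(reps)
    num += x * (x_new - x)
    den += x * x
    x = x_new

theta_hat = -num / (dt * den)
errors = np.sqrt(T / (2.0 * theta)) * (theta_hat - theta)
print(errors.mean(), errors.std())   # roughly 0 and 1
```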


## References

1. Biermé, H., Bonami, A., Nourdin, I., Peccati, G.: Optimal Berry-Esseen rates on the Wiener space: the barrier of third and fourth cumulants. ALEA Lat. Am. J. Probab. Math. Stat. 9(2), 473–500 (2012)

2. Bishwal, J.P.: Rates of weak convergence of approximate minimum contrast estimators for the discretely observed Ornstein-Uhlenbeck process. Stat. Probab. Lett. 76(13), 1397–1409 (2006)

3. Bishwal, J.P.: Uniform rate of weak convergence of the minimum contrast estimator in the Ornstein-Uhlenbeck process. Methodol. Comput. Appl. Probab. 12(3), 323–334 (2010)

4. Bishwal, J.P.N., Bose, A.: Rates of convergence of approximate maximum likelihood estimators in the Ornstein-Uhlenbeck process. Comput. Math. Appl. 42(1–2), 23–38 (2001)

5. Cheridito, P., Kawaguchi, H., Maejima, M.: Fractional Ornstein-Uhlenbeck processes. Electron. J. Probab. 8, 1–14 (2003)

6. Dorogovcev, A.Ja.: The consistency of an estimate of a parameter of stochastic differential equation. Theory Probab. Math. Stat. 10, 73–82 (1976)

7. Douissi, S., Es-Sebaiy, K., Kerchev, G., Nourdin, I.: Berry-Esseen bounds of second moment estimators for Gaussian processes observed at high frequency. Electron. J. Stat. 16(1), 636–670 (2022). https://doi.org/10.1214/21-EJS1967

8. Es-Sebaiy, K., Al-Foraih, M., Alazemi, F.: Wasserstein bounds in the CLT of the MLE for the drift coefficient of a stochastic partial differential equation. Fractal Fract. 5, 187 (2021)

9. Es-Sebaiy, K., Viens, F.: Optimal rates for parameter estimation of stationary Gaussian processes. Stoch. Process. Appl. 129(9), 3018–3054 (2019)

10. Hu, Y., Nualart, D., Zhou, H.: Parameter estimation for fractional Ornstein-Uhlenbeck processes of general Hurst parameter. Stat. Inference Stoch. Process. 22(1), 111–142 (2019)

11. Kasonga, R.A.: The consistency of a nonlinear least squares estimator from diffusion processes. Stoch. Process. Appl. 30, 263–275 (1988)

12. Kim, Y.T., Park, H.S.: Optimal Berry-Esseen bound for an estimator of parameter in the Ornstein-Uhlenbeck process. J. Korean Stat. Soc. 46(3), 413–425 (2017)

13. Kloeden, P., Neuenkirch, A.: The pathwise convergence of approximation schemes for stochastic differential equations. LMS J. Comput. Math. 10, 235–253 (2007)

14. Kutoyants, Y.A.: Statistical Inference for Ergodic Diffusion Processes. Springer, Berlin (2004)

15. Liptser, R.S., Shiryaev, A.N.: Statistics of Random Processes: II Applications, 2nd edn. Applications of Mathematics. Springer, Berlin, Heidelberg, New York (2001)

16. Nourdin, I., Peccati, G.: Normal Approximations with Malliavin Calculus: From Stein’s Method to Universality. Cambridge Tracts in Mathematics, vol. 192. Cambridge University Press, Cambridge (2012)

17. Nourdin, I., Peccati, G.: The optimal fourth moment theorem. Proc. Am. Math. Soc. 143, 3123–3133 (2015)

18. Nualart, D.: The Malliavin Calculus and Related Topics. Springer, Berlin (2006)

19. Nualart, D., Zhou, H.: Total variation estimates in the Breuer-Major theorem. Ann. Inst. Henri Poincaré Probab. Stat. 57(2), 740–777 (2021)

## Funding

This project was funded by Kuwait Foundation for the Advancement of Sciences (KFAS) under project code: PR18-16SM-04.

## Author information


### Contributions

Investigation, K.E., F.A. and M.A.; Methodology, K.E., F.A. and M.A.; Writing—review and editing, K.E., F.A. and M.A. All authors have read and agreed to the published version of the manuscript.

### Corresponding author

Correspondence to Khalifa Es-Sebaiy.

## Ethics declarations

### Competing interests

The authors declare no competing interests. 