# The large deviation for the least squares estimator of nonlinear regression model based on WOD errors

## Abstract

Widely orthant dependent (WOD) random variables occupy an important place among dependence structures because of their intricate properties, so their behavior in different statistical models is a major part of our research interest. Based on WOD errors, we establish large deviation results for the least squares estimator in the nonlinear regression model, which extend the corresponding results for independent errors and some dependent errors.

## Introduction

Many researchers have studied probability limit theorems and their applications for independent random variables. In practice, however, most random variables are dependent, which motivates our interest in how dependent random variables behave in such settings.

One of the important dependence structures is the widely orthant dependence structure. The main purpose of the paper is to study the large deviation for the least squares estimator of the nonlinear regression model based on widely orthant dependent errors.

### Brief review

Consider the following nonlinear regression model:

$$X_{i}=f_{i}(\theta)+\xi_{i}, \quad i\geq1,$$
(1.1)

where $$\{X_{i}\}$$ is observed, $$\{f_{i}(\theta)\}$$ is a known sequence of continuous functions possibly nonlinear in $$\theta\in\Theta$$, Θ denotes a closed interval on the real line, and $$\{\xi_{i}\}$$ is a mean zero sequence of random errors. Denote

$$Q_{n}(\theta)=\frac{1}{2}\sum ^{n}_{i=1}\omega ^{2}_{i} \bigl(x_{i}-f_{i}(\theta) \bigr)^{2},$$
(1.2)

where $$\{\omega_{i}\}$$ is a known sequence of positive numbers. An estimator $$\theta_{n}$$ is said to be a least squares estimator of θ if it minimizes $$Q_{n}(\theta)$$ over $$\theta\in\Theta$$, i.e. $$Q_{n}(\theta_{n})=\inf_{\theta\in\Theta }Q_{n}(\theta)$$.

Note that $$Q(x_{1},x_{2},\ldots,x_{n};\theta)=Q_{n}(\theta)$$ is defined on $$\mathbf{R}^{n}\times\Theta$$, where Θ is compact. Furthermore, $$Q (x;\theta)$$, where $$x=(x_{1},x_{2},\ldots,x_{n})$$, is a Borel measurable function of x for any fixed $$\theta\in\Theta$$ and a continuous function of θ for any fixed $$x\in\mathbf{R}^{n}$$. Lemma 3.3 of Schmetterer shows that there exists a Borel measurable map $$\theta_{n}:\mathbf{R}^{n}\to\Theta$$ such that $$Q_{n}(\theta_{n})=\inf_{\theta\in\Theta}Q_{n}(\theta)$$. In the following, we will consider this version as the least squares estimator $$\theta_{n}$$.
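To make the definition of $$\theta_{n}$$ concrete, the following Python sketch minimizes $$Q_{n}(\theta)$$ over a closed interval by exhaustive grid search. Everything specific in it is an assumption made for illustration and does not come from the paper: the regression function `f` (chosen so that its slope in θ stays between 0.5 and 2.5), the Gaussian noise, the grid size, and all function names. A grid search is used only because it needs no smoothness or convexity of $$Q_{n}$$; it returns an approximate minimizer, not the measurable version discussed above.

```python
import math
import random


def f(i, theta):
    """Hypothetical regression function; its slope in theta lies in [0.5, 2.5]."""
    a_i = 1.0 + (i % 2)  # 2, 1, 2, 1, ... for i = 1, 2, ...
    return a_i * theta + 0.5 * math.sin(theta)


def q_n(theta, xs, weights):
    """Weighted least squares criterion Q_n(theta) from (1.2)."""
    return 0.5 * sum(
        (w * (x - f(i, theta))) ** 2
        for i, (x, w) in enumerate(zip(xs, weights), start=1)
    )


def least_squares_estimate(xs, weights, lo, hi, grid=801):
    """Approximate minimizer of Q_n over [lo, hi] by exhaustive grid search."""
    step = (hi - lo) / (grid - 1)
    candidates = [lo + k * step for k in range(grid)]
    return min(candidates, key=lambda t: q_n(t, xs, weights))


def simulate(theta0, n, noise_sd, seed=0):
    """Draw X_i = f_i(theta0) + xi_i with independent Gaussian errors (an assumption)."""
    rng = random.Random(seed)
    return [f(i, theta0) + rng.gauss(0.0, noise_sd) for i in range(1, n + 1)]
```

For example, `least_squares_estimate(simulate(0.7, 200, 1.0), [1.0] * 200, -2.0, 2.0)` returns a grid point of $$\Theta=[-2,2]$$ that should lie close to the assumed true value 0.7.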

Let $$\theta_{0}$$ be the true parameter and assume that $$\theta_{0}\in\Theta$$. Ivanov  established the following large deviation result for independent and identically distributed (i.i.d.) random variables.

### Theorem 1.1

Let $$\{\xi_{i},i\geq1\}$$ be i.i.d. with $$E|\xi_{i}|^{p}<\infty$$ for some $$p>2$$. Suppose that there exist some constants $$0< c_{1}< c_{2}<\infty$$ such that

$$c_{1}(\theta_{1}-\theta_{2})^{2} \leq\frac{1}{n}\sum^{n}_{i=1} \bigl(f_{i}(\theta_{1})-f_{i}( \theta_{2}) \bigr)^{2}\leq c_{2}(\theta _{1}-\theta_{2})^{2}, \quad \forall n\geq1,$$
(1.3)

for all $$\theta_{1},\theta_{2}\in\Theta$$. Then, for every $$\rho >0$$, we have

$$P \bigl(n^{1/2}|\theta_{n}- \theta_{0}|> \rho \bigr)\leq c\rho ^{-p},\quad\forall n\geq1,$$
(1.4)

where c is a positive constant independent of n and ρ.

Hu also obtained the result (1.4) and applied it to martingale differences, φ-mixing sequences, and negatively associated (NA, in short) sequences. In addition, Hu proved the following large deviation result:

$$P \bigl(n^{1/2}|\theta_{n}- \theta_{0}|> \rho \bigr)\leq cn^{1-p/2}\rho^{-p},$$
(1.5)

under the condition that $$\sup_{n\geq1}E|\xi_{n}|^{p}<\infty$$ for some $$1< p\leq2$$, and applied it to martingale differences, φ-mixing sequences, NA sequences, and weakly stationary linear processes. Recently, Yang and Hu obtained some large deviation results based on ρ̃-mixing, asymptotically almost negatively associated, negatively orthant dependent, and $$L_{p}$$-mixingale random errors. For more details on the nonlinear regression model, one can refer to Ibragimov and Has’minskii, Ivanov and Leonenko, Ivanov, and so on.

Inspired by the above literature, we will establish the large deviation results based on widely orthant dependent errors.

### Concept of widely orthant dependence structure

In this section, we will present the widely orthant dependence structure, which was introduced by Wang et al. .

### Definition 1.1

For the random variables $$\{X_{n}, n \geq1\}$$, if there exists a finite real sequence $$\{f_{U}(n),n\geq 1 \}$$ satisfying for each $$n \geq1$$ and for all $$x_{i}\in\mathbf{R}$$, $$1 \leq i\leq n$$,

$$P(X_{1}>x_{1},X_{2}>x_{2}, \ldots,X_{n}>x_{n}) \leq f_{U}(n)\prod ^{n}_{i=1}P(X_{i}>x_{i}),$$
(1.6)

then we say that the $$\{X_{n},n\geq 1\}$$ are widely upper orthant dependent (WUOD, in short); if there exists a finite real sequence $$\{f_{L}(n),n \geq 1\}$$ satisfying for each $$n\geq1$$ and for all $$x_{i}\in\mathbf{R}$$, $$1 \leq i\leq n$$,

$$P(X_{1}\leq x_{1},X_{2}\leq x_{2},\ldots,X_{n} \leq x_{n}) \leq f_{L}(n)\prod^{n}_{i=1}P(X_{i} \leq x_{i}),$$
(1.7)

then we say that the $$\{X_{n}, n \geq 1\}$$ are widely lower orthant dependent (WLOD, in short); if they are both WUOD and WLOD, then we say that the $$\{X_{n}, n \geq 1\}$$ are widely orthant dependent (WOD, in short), and $$f_{U}(n)$$, $$f_{L}(n)$$, $$n \geq1$$ are called dominating coefficients.

An array $$\{X_{ni},i\geq1,n\geq1\}$$ of random variables is called row-wise WOD if, for every $$n\geq1$$, $$\{X_{ni},i\geq1\}$$ is a sequence of WOD random variables.
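For a finite discrete random vector, the smallest constants that make (1.6) and (1.7) hold for one fixed n can be computed exactly, which gives a feel for what the dominating coefficients measure. The Python sketch below is an illustration under assumptions that are not in the paper: the vector is given as a dictionary from points to exact `Fraction` probabilities, and the helper names are hypothetical. Because tail probabilities only change at support points, it is enough to scan thresholds taken from the support (plus one value below it for the upper tails).

```python
from fractions import Fraction
from itertools import product


def _sup_ratio(pmf, thresholds, in_tail):
    """Largest ratio (joint tail) / (product of marginal tails) over the thresholds."""
    n = len(thresholds)
    best = Fraction(1)  # the ratio equals 1 when every tail event has probability 1
    for xs in product(*thresholds):
        joint = sum(p for pt, p in pmf.items()
                    if all(in_tail(pt[i], xs[i]) for i in range(n)))
        marginals = Fraction(1)
        for i in range(n):
            marginals *= sum(p for pt, p in pmf.items() if in_tail(pt[i], xs[i]))
        if marginals > 0:  # a zero product forces a zero joint tail, so nothing to check
            best = max(best, joint / marginals)
    return best


def dominating_coefficients(pmf):
    """Smallest (f_U, f_L) for a finite discrete vector given as {point: probability}."""
    n = len(next(iter(pmf)))
    support = [sorted({pt[i] for pt in pmf}) for i in range(n)]
    # Upper tails {X_i > x}: also allow a threshold below the support (probability 1).
    upper = [[s[0] - 1] + s for s in support]
    f_u = _sup_ratio(pmf, upper, lambda v, x: v > x)
    # Lower tails {X_i <= x}: the largest support point already gives probability 1.
    f_l = _sup_ratio(pmf, support, lambda v, x: v <= x)
    return f_u, f_l
```

For two independent fair coins both coefficients equal 1, while for the perfectly positively dependent pair $$X_{1}=X_{2}$$ with a fair coin both equal 2, so positive dependence is absorbed by a coefficient larger than 1.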

As mentioned above, Wang et al. first introduced the concept of WOD random variables, and their properties and applications have been studied subsequently. For instance, the examples provided by Wang et al. show that WOD random variables include some common negatively dependent random variables, some positively dependent random variables, and others; the same work also investigated the uniform asymptotics for the finite-time ruin probability of a new dependent risk model with a constant interest rate. He et al. established the asymptotic lower bounds of precise large deviations with non-negative and dependent random variables. Chen et al. studied the uniform asymptotics for the finite-time ruin probabilities of two types of non-standard bidimensional renewal risk models with constant interest forces and diffusion generated by Brownian motions. The Bernstein-type inequality for WOD random variables and its applications were studied by Shen. Wang et al. investigated the complete convergence for WOD random variables and gave its applications to nonparametric regression models, and so forth.

As is well known, the class of WOD random variables contains END random variables, NOD random variables, NSD random variables, NA random variables, and independent random variables as special cases. Hence, it is meaningful to extend the results of Yang and Hu  to WOD errors.

Throughout this paper, let $$\{\xi_{i}, i \geq1\}$$ be a sequence of WOD random variables with dominating coefficients $$f_{U}(n)$$, $$f_{L}(n)$$, $$n \geq1$$. Denote $$f(n)=\max \{f_{U}(n),f_{L}(n) \}$$. Let C denote a positive constant, which may vary in different places. Let $$\lfloor x \rfloor$$ be the integer part of x.

The main results and their proofs are presented in Section 3 and for the convenience of the reader, some useful lemmas relating to the proofs are listed in Section 2.

## Preliminary lemmas

In this section, we provide some important lemmas that will be used to prove the main results of the paper. The first one is a basic property of WOD random variables, which was established by Wang et al.

### Lemma 2.1

Let $$\{X_{n}, n \geq1\}$$ be a sequence of WOD random variables.

(i) If $$\{h_{n}(\cdot), n \geq1\}$$ are all non-decreasing (or all non-increasing), then $$\{h_{n}(X_{n}), n \geq1\}$$ are still WOD.

(ii) For each $$n\geq1$$ and any $$s\in\mathbf{R}$$,

$$E\exp \Biggl\{ s\sum^{n}_{i=1}X_{i} \Biggr\} \leq f(n)\prod^{n}_{i=1}E\exp \{sX_{i}\}.$$
(2.1)
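As a brief illustration of how (2.1) is typically used (a standard Chernoff-type step, added here for orientation and not taken from the source), combining it with Markov's inequality gives, for any $$t\in\mathbf{R}$$ and $$s>0$$,

$$P \Biggl(\sum^{n}_{i=1}X_{i}\geq t \Biggr)\leq e^{-st}E\exp \Biggl\{ s\sum^{n}_{i=1}X_{i} \Biggr\} \leq f(n)e^{-st}\prod^{n}_{i=1}E\exp \{sX_{i}\},$$

so the dependence enters the bound only through the factor $$f(n)$$.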

The next lemma is very useful to prove the main results of the paper, which can be found in Hu .

### Lemma 2.2

Let $$\{\Omega,\mathscr{F},P\}$$ be a probability space, $$[T_{1},T_{2}]$$ be a closed interval on the real line. Assume that $$V(\theta)=V(\omega,\theta)$$ ($$\theta\in[T_{1},T_{2}]$$, $$\omega\in\Omega$$) is a stochastic process such that $$V(\omega,\theta)$$ is continuous for all $$\omega\in\Omega$$. If there exist numbers $$\alpha>0$$, $$r>0$$ and $$C=C(T_{1},T_{2})< \infty$$ such that

$$E\bigl|V(\theta_{1})-V(\theta_{2})\bigr|^{r}\leq C| \theta_{1}-\theta_{2}|^{1+\alpha },\quad\forall \theta_{1},\theta_{2}\in[T_{1},T_{2}],$$

then for any $$\epsilon>0$$, $$a>0$$, $$\theta_{0},\theta_{0}+\epsilon\in [T_{1},T_{2}]$$, $$\gamma\in(2,2+\alpha)$$, one has

$$P \Bigl(\sup_{\theta_{0}\leq\theta_{1},\theta_{2}\leq\theta_{0}+\epsilon }\bigl|V(\theta_{1})-V( \theta_{2})\bigr|\geq a \Bigr)\leq\frac{8C}{(\alpha -\gamma+2)(\alpha-\gamma+3)} \biggl(\frac{8\gamma}{\gamma-2} \biggr)^{r}\frac{\epsilon^{\alpha+1}}{a^{r}}.$$
(2.2)

The following are the Marcinkiewicz-Zygmund type inequality and Rosenthal-type inequality for WOD random variables, which play an important role in the proof.

### Lemma 2.3

(cf. Wang et al. )

Let $$p\geq1$$ and $$\{X_{n},n\geq1\}$$ be a sequence of WOD random variables with $$EX_{n}=0$$ and $$E|X_{n}|^{p}<\infty$$ for each $$n\geq1$$. Then there exist positive constants $$C_{1}(p)$$ and $$C_{2}(p)$$ depending only on p such that, for $$1\leq p\leq2$$,

$$E \Biggl\vert \sum_{i=1}^{n}X_{i} \Biggr\vert ^{p}\leq \bigl[C_{1}(p)+C_{2}(p)f(n) \bigr]\sum^{n}_{i=1}E|X_{i}|^{p},$$
(2.3)

and for $$p>2$$,

$$E \Biggl\vert \sum^{n}_{i=1}X_{i} \Biggr\vert ^{p}\leq C_{1}(p)\sum ^{n}_{i=1}E|X_{i}|^{p}+C_{2}(p)f(n) \Biggl(\sum^{n}_{i=1}E|X_{i}|^{2} \Biggr)^{p/2}.$$
(2.4)
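To see the orders of magnitude in (2.3) and (2.4), consider the illustrative special case (an assumption made only for this example) in which $$\sup_{i\geq1}E|X_{i}|^{p}\leq M$$ for a constant $$M<\infty$$. For $$1\leq p\leq2$$, (2.3) gives

$$E \Biggl\vert \sum_{i=1}^{n}X_{i} \Biggr\vert ^{p}\leq \bigl[C_{1}(p)+C_{2}(p)f(n) \bigr]nM.$$

For $$p>2$$, Jensen's inequality yields $$EX_{i}^{2}\leq (E|X_{i}|^{p} )^{2/p}\leq M^{2/p}$$, so (2.4) gives

$$E \Biggl\vert \sum^{n}_{i=1}X_{i} \Biggr\vert ^{p}\leq C_{1}(p)nM+C_{2}(p)f(n)n^{p/2}M.$$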

## Main results and their proofs

Based on the useful inequalities in Section 2, we now study the large deviation results for the least squares estimator of the nonlinear regression model based on WOD errors.

### Theorem 3.1

Consider the model (1.1). Assume that there exist positive constants $$c_{1}$$, $$c_{2}$$, $$c_{3}$$, $$c_{4}$$ such that

$$c_{1}|\theta_{1}-\theta_{2}| \leq \bigl|f_{i}(\theta_{1})-f_{i}(\theta _{2})\bigr|\leq c_{2}|\theta_{1}- \theta_{2}|,\quad \forall\theta_{1},\theta _{2} \in\Theta,i\geq1,$$
(3.1)

and

$$c_{3}\leq\omega_{i}\leq c_{4}, \quad\forall i\geq1.$$
(3.2)

Let $$\{\xi_{i}, i\geq1\}$$ be a sequence of mean zero WOD random variables with $$E|\xi_{i}|^{p}<\infty$$ for some $$p>2$$. Denote

$$\Gamma_{p,n}=\sum^{n}_{i=1}E| \xi_{i}|^{p}, \quad n\geq1,$$
(3.3)

and

$$\Delta_{p,n}= \Biggl(\sum^{n}_{i=1} \bigl(E|\xi_{i}|^{p} \bigr)^{2/p} \Biggr)^{p/2},\quad n\geq1.$$
(3.4)

Then there exists a positive constant $$C(p)$$ such that

$$P \bigl(n^{1/2}|\theta_{n}- \theta_{0}|>\rho \bigr)\leq C(p) \bigl(\Gamma_{p,n}+f(n) \Delta_{p,n} \bigr)n^{-p/2}\rho^{-p}$$
(3.5)

for every $$\rho>0$$ and all $$n\geq1$$.
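Two elementary observations, added here as a sketch and not part of the original statement, help in reading (3.5). First, since $$p/2>1$$, the map $$t\mapsto t^{p/2}$$ is superadditive on $$[0,\infty)$$, so

$$\Gamma_{p,n}=\sum^{n}_{i=1} \bigl( \bigl(E|\xi_{i}|^{p} \bigr)^{2/p} \bigr)^{p/2}\leq \Biggl(\sum^{n}_{i=1} \bigl(E|\xi_{i}|^{p} \bigr)^{2/p} \Biggr)^{p/2}=\Delta_{p,n}.$$

Together with $$f(n)\geq1$$, the right-hand side of (3.5) is therefore at most $$2C(p)f(n)\Delta_{p,n}n^{-p/2}\rho^{-p}$$. Second, under the additional assumption $$\sup_{i\geq1}E|\xi_{i}|^{p}\leq M<\infty$$ (with M introduced only for this illustration), one has $$\Delta_{p,n}\leq n^{p/2}M$$, and hence

$$P \bigl(n^{1/2}|\theta_{n}-\theta_{0}|>\rho \bigr)\leq 2C(p)Mf(n)\rho^{-p},$$

which has the form of (1.4) whenever $$f(n)$$ is bounded.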

### Proof

Denote

$$\Psi_{n}(\theta_{1},\theta_{2})= \frac{1}{n}\sum^{n}_{i=1}\omega ^{2}_{i} \bigl(f_{i}(\theta_{1})-f_{i}( \theta_{2}) \bigr)^{2},\qquad V_{n}(\theta)=\frac{1}{n^{1/2}} \sum^{n}_{i=1}\xi_{i} \bigl(f_{i}(\theta )-f_{i}(\theta_{0}) \bigr)$$

and

$$U_{n}(\theta)=\frac{V_{n}(\theta)}{n^{1/2}\Psi_{n}(\theta,\theta_{0})},\quad \theta\ne\theta_{0}.$$

Without loss of generality, we assume that $$\omega_{i}=1$$ for all i. The general case can be obtained similarly in view of (3.2). It follows from (3.1) that

$$c^{2}_{1}(\theta_{1}-\theta_{2})^{2} \leq \Psi_{n}(\theta_{1},\theta_{2})\leq c^{2}_{2}(\theta_{1}-\theta_{2})^{2}$$
(3.6)

for all $$\theta_{1},\theta_{2}\in\Theta$$ and $$n\geq1$$. Denote $$A_{n\epsilon}=[|\theta_{n}-\theta_{0}|>\epsilon]$$. If $$\omega\in A_{n\epsilon}$$, then $$\theta_{n}\ne\theta_{0}$$ and

\begin{aligned} \sum^{n}_{i=1}\xi^{2}_{i} \geq&\sum^{n}_{i=1} \bigl(X_{i}-f_{i}( \theta _{n}) \bigr)^{2} \\ =& \sum^{n}_{i=1} \bigl(X_{i}-f_{i}( \theta_{0}) \bigr)^{2}+ 2\sum^{n}_{i=1} \bigl(X_{i}-f_{i}(\theta_{0}) \bigr) \bigl(f_{i}(\theta _{0})-f_{i}( \theta_{n}) \bigr) \\ &{}+ \sum^{n}_{i=1} \bigl(f_{i}( \theta_{0})-f_{i}(\theta_{n}) \bigr)^{2} \\ =&\sum^{n}_{i=1}\xi^{2}_{i}-2nU_{n}( \theta_{n})\Psi_{n}(\theta_{n},\theta _{0})+n\Psi_{n}(\theta_{n},\theta_{0}), \end{aligned}

which implies that

$$\Psi_{n}(\theta_{n},\theta_{0}) \bigl(1-2U_{n}(\theta_{n}) \bigr)\leq0.$$
(3.7)
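The first inequality in the chain above is the only place where the minimizing property of $$\theta_{n}$$ enters. As a short check, with $$\omega_{i}=1$$ as assumed and $$X_{i}-f_{i}(\theta_{0})=\xi_{i}$$ from (1.1),

$$\sum^{n}_{i=1} \bigl(X_{i}-f_{i}(\theta_{n}) \bigr)^{2}=2Q_{n}(\theta_{n})\leq2Q_{n}(\theta_{0})=\sum^{n}_{i=1} \bigl(X_{i}-f_{i}(\theta_{0}) \bigr)^{2}=\sum^{n}_{i=1}\xi^{2}_{i}.$$

The cross term is rewritten through the identity

$$\sum^{n}_{i=1}\xi_{i} \bigl(f_{i}(\theta_{n})-f_{i}(\theta_{0}) \bigr)=n^{1/2}V_{n}(\theta_{n})=nU_{n}(\theta_{n})\Psi_{n}(\theta_{n},\theta_{0}),$$

and cancelling $$\sum^{n}_{i=1}\xi^{2}_{i}$$ on both sides and dividing by n leaves (3.7).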

Noting that $$\theta_{n}\ne\theta_{0}$$, we have by (3.6) that $$\Psi _{n}(\theta_{n},\theta_{0})>0$$, which together with (3.7) shows that $$U_{n}(\theta_{n})\geq1/2$$. Thus, $$A_{n\epsilon}=[|\theta _{n}-\theta_{0}|>\epsilon]\subset [U_{n}(\theta_{n})\geq1/2]$$, and, for any $$\epsilon>0$$,

$$P\bigl(|\theta_{n}-\theta_{0}|>\epsilon\bigr)\leq P \Bigl(\sup_{|\theta-\theta_{0}|>\epsilon}U_{n}(\theta) \geq 1/2 \Bigr).$$
(3.8)

Putting $$\epsilon=\rho n^{-1/2}$$ in (3.8), we have

\begin{aligned} P \bigl(n^{1/2}|\theta_{n}-\theta_{0}|>\rho \bigr) \leq& P \Bigl(\sup_{|\theta-\theta_{0}|>\rho }\bigl|U_{n}(\theta)\bigr|\geq 1/2 \Bigr) \\ &{}+P \Bigl(\sup_{\rho n^{-1/2}< |\theta-\theta_{0}| \leq\rho }\bigl|U_{n}( \theta)\bigr|\geq1/2 \Bigr) \end{aligned}
(3.9)

for every $$\rho>0$$. It follows from (3.6) again that

\begin{aligned} \sup_{|\theta-\theta_{0}|>\rho}\frac{|V_{n}(\theta)|}{n^{1/2}\Psi _{n}(\theta,\theta_{0})} =& \sup_{|\theta-\theta_{0}|>\rho} \frac {|V_{n}(\theta)|}{n^{1/2}\Psi^{1/2}_{n}(\theta,\theta_{0})\Psi ^{1/2}_{n}(\theta,\theta_{0})} \\ \leq& \sup_{|\theta-\theta_{0}|>\rho}\frac{|V_{n}(\theta)|}{n^{1/2}\Psi ^{1/2}_{n}(\theta,\theta_{0})\rho c_{1}}. \end{aligned}
(3.10)

Hence,

\begin{aligned} P \Bigl(\sup_{|\theta-\theta_{0}|>\rho}\bigl|U_{n}(\theta)\bigr|\geq 1/2 \Bigr) =&P \biggl(\sup_{|\theta-\theta_{0}|>\rho}\frac {|V_{n}(\theta)|}{n^{1/2}\Psi_{n}(\theta,\theta_{0})}\geq 1/2 \biggr) \\ \leq& P \biggl(\sup_{|\theta-\theta_{0}|>\rho}\frac{|V_{n}(\theta )|}{n^{1/2}\Psi^{1/2}_{n}(\theta,\theta_{0})} \geq \frac{c_{1}\rho}{2} \biggr). \end{aligned}
(3.11)

Cauchy’s inequality yields

$$\biggl(\frac{|V_{n}(\theta)|}{n^{1/2}\Psi^{1/2}_{n}(\theta,\theta _{0})} \biggr)^{2}= \Biggl( \frac{1}{n}\sum^{n}_{i=1} \xi_{i}\frac {f_{i}(\theta)-f_{i}(\theta_{0})}{\Psi^{1/2}_{n}(\theta,\theta _{0})} \Biggr)^{2}\leq \frac{1}{n} \sum^{n}_{i=1}\xi^{2}_{i}, \quad \forall\theta\ne\theta_{0}.$$
(3.12)

Noting that $$p>2$$, we have by Minkowski’s inequality

$$\Biggl(E \Biggl(\sum^{n}_{i=1} \xi^{2}_{i} \Biggr)^{p/2} \Biggr)^{2/p} \leq \sum^{n}_{i=1} \bigl(E| \xi_{i}|^{p} \bigr)^{2/p}.$$
(3.13)

Hence, we can obtain by Markov’s inequality, (3.11)-(3.13) and $$f(n)\geq1$$ (from the definition of WOD random variables)

\begin{aligned} P \Bigl(\sup_{|\theta-\theta_{0}|>\rho}\bigl|U_{n}(\theta)\bigr|\geq1/2 \Bigr) \leq& P \Biggl(\frac{1}{n}\sum^{n}_{i=1} \xi^{2}_{i}\geq \biggl(\frac {1}{2}c_{1} \rho \biggr)^{2} \Biggr) \\ \leq& \biggl(\frac{4}{nc^{2}_{1}\rho^{2}} \biggr)^{p/2}E \Biggl(\sum ^{n}_{i=1}\xi^{2}_{i} \Biggr)^{p/2} \\ \leq& \biggl(\frac{4}{nc^{2}_{1}\rho^{2}} \biggr)^{p/2} \Biggl(\sum ^{n}_{i=1} \bigl(E|\xi_{i}|^{p} \bigr)^{2/p} \Biggr)^{p/2} \\ \leq& C_{1}(p)n^{-p/2}\rho^{-p} \Delta_{p,n} \\ \leq& C_{1}(p)n^{-p/2}\rho^{-p}f(n) \Delta_{p,n}. \end{aligned}
(3.14)

For $$m=0,1,2,\dots,\lfloor n^{1/2}\rfloor$$, denote $$\theta(m)=\theta_{0}+\frac{\rho}{n^{1/2}}+\frac{m\rho}{\lfloor n^{1/2}\rfloor}$$, $$\rho_{m}=\theta(m)-\theta_{0}(>0)$$. It follows from (3.6) again that

\begin{aligned} \sup_{\rho_{m} \leq\theta-\theta_{0} \leq\rho_{m+1}}\bigl|U_{n}(\theta)\bigr| \leq& \sup _{\rho_{m} \leq\theta-\theta_{0} \leq \rho_{m+1}}\frac{|V_{n}(\theta)|}{n^{1/2}c_{1}^{2}(\theta-\theta _{0})^{2}} \\ \leq& \sup_{\rho_{m} \leq\theta-\theta_{0} \leq \rho_{m+1}}\frac{|V_{n}(\theta)|}{n^{1/2}c^{2}_{1}\rho^{2}_{m}}, \end{aligned}
(3.15)

and thus

\begin{aligned} P \biggl(\sup_{\rho n^{-1/2}\leq\theta-\theta_{0}\leq\rho }\bigl|U_{n}(\theta)\bigr|\geq \frac{1}{2} \biggr) \leq& \sum^{\lfloor n^{1/2}\rfloor-1}_{m=0}P \biggl(\sup_{\rho_{m}\leq\theta-\theta _{0}\leq\rho_{m+1}}\bigl|U_{n}(\theta)\bigr|\geq \frac{1}{2} \biggr) \\ \leq& \sum^{\lfloor n^{1/2}\rfloor-1}_{m=0}P \biggl(\sup _{\rho _{m}\leq \theta-\theta_{0}\leq\rho_{m+1}}\bigl|V_{n}(\theta)\bigr|\geq \frac{1}{2}c^{2}_{1} \rho^{2}_{m} n^{1/2} \biggr). \end{aligned}
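The grid $$\theta(m)$$ behind the sum above can be checked numerically. The following Python sketch builds the points $$\theta(m)$$ and the offsets $$\rho_{m}$$ for given $$\theta_{0}$$, ρ, and n; the function name `partition` and the sample values in the usage note are illustrative assumptions, not taken from the source.

```python
import math


def partition(theta0, rho, n):
    """Return the grid theta(m) and offsets rho_m for m = 0, ..., floor(n**0.5)."""
    k = math.isqrt(n)  # integer part of n**0.5
    thetas = [theta0 + rho / math.sqrt(n) + m * rho / k for m in range(k + 1)]
    rhos = [t - theta0 for t in thetas]
    return thetas, rhos
```

With `partition(0.0, 2.0, 50)` the grid has 8 points, starts at $$\rho n^{-1/2}$$, has constant spacing $$\rho/\lfloor n^{1/2}\rfloor$$, and its last point exceeds ρ, so the intervals $$[\rho_{m},\rho_{m+1}]$$ cover $$[\rho n^{-1/2},\rho]$$.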

Noting that

$$\sup_{\rho_{m} \leq\theta-\theta_{0} \leq \rho_{m+1}}\bigl|V_{n}(\theta)\bigr|\leq\bigl|V_{n} \bigl( \theta(m) \bigr)\bigr|+\sup_{\theta(m)\leq \theta_{1},\theta_{2}\leq\theta(m+1)}\bigl|V_{n}( \theta_{2})-V_{n}(\theta_{1})\bigr|,$$

we have

\begin{aligned}& P \biggl(\sup_{\rho_{m}\leq\theta-\theta_{0}\leq\rho _{m+1}}\bigl|V_{n}(\theta)\bigr|\geq \frac{1}{2}c^{2}_{1}\rho^{2}_{m} n^{1/2} \biggr) \\& \quad\leq P \biggl(\bigl|V_{n} \bigl(\theta(m) \bigr)\bigr|\geq \frac{1}{4}c_{1}^{2}\rho^{2}_{m} n^{1/2} \biggr) \\& \qquad{}+P \biggl(\sup_{\theta(m)\leq\theta_{1},\theta_{2}\leq \theta(m+1)}\bigl|V_{n}( \theta_{2})-V_{n}(\theta_{1})\bigr|\geq \frac{1}{4}c^{2}_{1}\rho^{2}_{m} n^{1/2} \biggr). \end{aligned}
(3.16)

In view of the definition of $$V_{n}(\theta)$$, it is easy to check that

$$V_{n} \bigl(\theta(m) \bigr)=\frac{1}{n^{1/2}}\sum ^{n}_{i=1}\xi_{i} \bigl(f_{i} \bigl(\theta (m) \bigr)-f_{i}(\theta_{0}) \bigr),\qquad V_{n}(\theta_{2})-V_{n}(\theta_{1})= \frac{1}{n^{1/2}}\sum^{n}_{i=1}\xi _{i} \bigl(f_{i}(\theta_{2})-f_{i}( \theta_{1}) \bigr).$$

By Markov’s inequality, (2.4) in Lemma 2.3, (3.1), and Hölder’s inequality, we get

\begin{aligned}& P \biggl(\bigl|V_{n} \bigl(\theta(m) \bigr)\bigr|\geq\frac{1}{4}c_{1}^{2} \rho_{m}^{2} n^{1/2} \biggr) \\& \quad\leq \biggl(\frac{4}{c_{1}^{2}\rho_{m}^{2} n^{1/2}} \biggr)^{p} E\bigl|V_{n} \bigl(\theta(m) \bigr)\bigr|^{p} \\& \quad\leq \biggl(\frac{4}{c_{1}^{2}\rho_{m}^{2} n^{1/2}} \biggr)^{p}\frac{1}{n^{p/2}} \Biggl\{ C_{1}(p)\sum^{n}_{i=1}E| \xi _{i}|^{p}\bigl|f_{i} \bigl(\theta(m) \bigr)-f_{i}(\theta_{0})\bigr|^{p} \\& \qquad{}+C_{2}(p)f(n) \Biggl(\sum^{n}_{i=1} E \xi^{2}_{i} \bigl(f_{i} \bigl(\theta(m) \bigr)-f_{i}(\theta _{0}) \bigr)^{2} \Biggr)^{p/2} \Biggr\} \\& \quad\leq \biggl(\frac{4c_{2}}{c_{1}^{2}\rho_{m}^{2} n^{1/2}} \biggr)^{p}\frac{1}{n^{p/2}}\bigl| \theta(m)-\theta_{0}\bigr|^{p} \Biggl\{ C_{1}(p)\sum ^{n}_{i=1}E|\xi_{i}|^{p} +C_{2}(p)f(n) \Biggl(\sum^{n}_{i=1}E \xi_{i}^{2} \Biggr)^{p/2} \Biggr\} \\& \quad\leq \biggl(\frac{4c_{2}}{c_{1}^{2}\rho_{m}^{2} n^{1/2}} \biggr)^{p}\frac{1}{n^{p/2}}\bigl| \theta(m)-\theta_{0}\bigr|^{p} \Biggl\{ C_{1}(p)\sum ^{n}_{i=1}E|\xi_{i}|^{p} +C_{2}(p)f(n) \Biggl(\sum^{n}_{i=1} \bigl(E|\xi_{i}|^{p} \bigr)^{2/p} \Biggr)^{p/2} \Biggr\} \\& \quad\leq C_{3}(p)\rho_{m}^{-p}n^{-p} \bigl(\Gamma_{p,n}+f(n)\Delta_{p,n} \bigr). \end{aligned}
(3.17)

On the other hand,

\begin{aligned}& E\bigl|V_{n}(\theta_{2})-V_{n}(\theta_{1})\bigr|^{p} \\& \quad\leq \frac{1}{n^{p/2}} \Biggl\{ C_{4}(p)\sum ^{n}_{i=1}E|\xi_{i}|^{p}\bigl|f_{i}( \theta _{2})-f_{i}(\theta_{1})\bigr|^{p} +C_{5}(p)f(n) \Biggl(\sum^{n}_{i=1}E \xi^{2}_{i} \bigl(f_{i}(\theta_{2})-f_{i}( \theta _{1}) \bigr)^{2} \Biggr)^{p/2} \Biggr\} \\& \quad\leq\frac{c_{2}^{p}}{n^{p/2}} \Biggl(C_{4}(p)\sum ^{n}_{i=1}E|\xi _{i}|^{p}+C_{5}(p)f(n) \Biggl(\sum^{n}_{i=1}E\xi_{i}^{2} \Biggr)^{p/2} \Biggr)|\theta_{2}-\theta_{1}|^{p} \\& \quad\leq\frac{C_{6}(p)}{n^{p/2}} \Biggl(\sum^{n}_{i=1}E| \xi_{i}|^{p}+f(n) \Biggl(\sum^{n}_{i=1} \bigl(E|\xi_{i}|^{p} \bigr)^{2/p} \Biggr)^{p/2} \Biggr)|\theta _{2}-\theta_{1}|^{p} \\& \quad=\frac{C_{6}(p)}{n^{p/2}} \bigl(\Gamma_{p,n}+f(n)\Delta_{p,n} \bigr)|\theta _{1}-\theta_{2}|^{p} \\& \quad=:C(n,p)|\theta_{1}-\theta_{2}|^{p}. \end{aligned}
(3.18)

For all $$\theta_{1},\theta_{2} \in\Theta$$ and $$n\geq1$$, applying Lemma 2.2 with $$r=1+\alpha=p$$, $$C=C(n,p)$$, $$\epsilon=\rho/\lfloor n^{1/2} \rfloor$$, $$a=\frac{1}{4}c_{1}^{2}\rho_{m}^{2}n^{1/2}$$, and $$\gamma\in (2,p+1)$$, we can obtain

\begin{aligned}& P \biggl(\sup_{\theta(m)\leq\theta_{1}, \theta_{2}\leq\theta (m+1)}\bigl|V_{n}(\theta_{2})-V_{n}( \theta_{1})\bigr|\geq\frac{1}{4}c^{2}_{1} \rho^{2}_{m} n^{1/2} \biggr) \\& \quad=P \biggl(\sup_{\theta(m)\leq\theta_{1}, \theta_{2}\leq\theta (m)+\rho/\lfloor n^{1/2} \rfloor}\bigl|V_{n}( \theta_{2})-V_{n}(\theta_{1})\bigr| \geq \frac{1}{4}c^{2}_{1}\rho^{2}_{m} n^{1/2} \biggr) \\& \quad\leq \frac{8C_{6}(p)n^{-p/2}(\Gamma_{p,n}+f(n)\Delta_{p,n})}{(p+1-\gamma )(p+2-\gamma)} \biggl(\frac{8\gamma}{\gamma-2} \biggr)^{p} \biggl(\frac {\rho}{\lfloor n^{1/2} \rfloor} \biggr)^{p} \biggl(\frac{4}{c^{2}_{1}\rho^{2}_{m}n^{1/2}} \biggr)^{p} \\& \quad\leq C_{7}(p)\rho^{p} \bigl( \Gamma_{p,n}+f(n)\Delta_{p,n} \bigr)n^{-3p/2} \rho_{m}^{-2p}. \end{aligned}
(3.19)

Noting that $$\rho_{0}=\rho n^{-1/2}$$, $$\rho_{m}>m\rho n^{-1/2}$$, $$p>2$$, we have by (3.16), (3.17), and (3.19) that

\begin{aligned}& P \biggl(\sup_{\rho n^{-1/2} \leq\theta-\theta_{0} \leq \rho}\bigl|U_{n}(\theta)\bigr|\geq \frac{1}{2} \biggr) \\& \quad\leq\sum_{m=0}^{\lfloor n^{1/2} \rfloor -1} \bigl\{ C_{3}(p)\rho_{m}^{-p}n^{-p} \bigl( \Gamma_{p,n}+f(n)\Delta_{p,n} \bigr) +C_{7}(p) \rho_{m}^{-2p}n^{-3p/2}\rho^{p} \bigl( \Gamma_{p,n}+f(n)\Delta _{p,n} \bigr) \bigr\} \\& \quad\leq C_{3}(p)n^{-p/2}\rho^{-p} \bigl( \Gamma_{p,n}+f(n)\Delta _{p,n} \bigr)+C_{7}(p)n^{-p/2} \rho^{-p} \bigl(\Gamma_{p,n}+f(n)\Delta_{p,n} \bigr) \\& \qquad{}+n^{-p/2}\rho^{-p} \bigl(\Gamma_{p,n}+f(n) \Delta_{p,n} \bigr)\sum_{m=1}^{\lfloor n^{1/2} \rfloor -1} \biggl(\frac{C_{3}(p)}{m^{p}}+\frac{C_{7}(p)}{m^{2p}} \biggr) \\& \quad\leq C_{8}(p)n^{-p/2}\rho^{-p} \bigl(\Gamma_{p,n}+f(n)\Delta_{p,n} \bigr). \end{aligned}
(3.20)
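The last inequality in (3.20) only requires the sum over m to be bounded uniformly in n. As a short check (added for clarity, not part of the original argument), since $$p>2$$ we have $$m^{-2p}\leq m^{-p}\leq m^{-2}$$ for $$m\geq1$$, and therefore

$$\sum_{m=1}^{\lfloor n^{1/2} \rfloor-1} \biggl(\frac{C_{3}(p)}{m^{p}}+\frac{C_{7}(p)}{m^{2p}} \biggr)\leq \bigl(C_{3}(p)+C_{7}(p) \bigr)\sum_{m=1}^{\infty}\frac{1}{m^{2}}= \bigl(C_{3}(p)+C_{7}(p) \bigr)\frac{\pi^{2}}{6},$$

so one admissible choice is $$C_{8}(p)= (C_{3}(p)+C_{7}(p) ) (1+\pi^{2}/6 )$$.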

Similarly, we have

$$P \biggl(\sup_{\rho n^{-1/2} \leq\theta_{0}-\theta\leq \rho}\bigl|U_{n}(\theta)\bigr| \geq \frac{1}{2} \biggr) \leq C_{9}(p)n^{-p/2} \rho^{-p} \bigl(\Gamma_{p,n}+f(n)\Delta_{p,n} \bigr).$$
(3.21)

Therefore, the desired result (3.5) follows from (3.9), (3.14), (3.20), and (3.21) immediately. This completes the proof of the theorem. □

Inspired by Theorem 3.1, we will consider the case $$p\in(1,2]$$ and establish the following result.

### Theorem 3.2

Consider the model (1.1). Let conditions (3.1) and (3.2) of Theorem 3.1 hold, and let $$E|\xi_{i}|^{p}<\infty$$ for some $$p\in (1,2]$$. Denote

$$\Lambda_{p,n}=\sum_{i=1}^{n}E| \xi_{i}|^{p},\quad n\geq1.$$
(3.22)

Then there exists a positive constant $$C(p)$$ such that

$$P \bigl(n^{1/2}|\theta_{n}- \theta_{0}|>\rho \bigr)\leq C(p)f(n) \Lambda_{p,n}n^{-p/2} \rho^{-p}$$
(3.23)

for every $$\rho>0$$ and all $$n\geq1$$.
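As an illustration of the rate in (3.23), suppose in addition that $$\sup_{i\geq1}E|\xi_{i}|^{p}\leq M<\infty$$ (an assumption made only for this example). Then $$\Lambda_{p,n}\leq nM$$, and (3.23) becomes

$$P \bigl(n^{1/2}|\theta_{n}-\theta_{0}|>\rho \bigr)\leq C(p)Mf(n)n^{1-p/2}\rho^{-p}.$$

For $$p=2$$ the factor $$n^{1-p/2}$$ equals 1, so the bound is of order $$f(n)\rho^{-2}$$; for $$p\in(1,2)$$ it is informative only for ρ large relative to $$(f(n)n^{1-p/2})^{1/p}$$.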

### Proof

Similar to the above proof, we have by the $$C_{r}$$ inequality, (3.1), and (3.6)

\begin{aligned} \biggl\vert \frac{V_{n}(\theta)}{n^{1/2}\Psi_{n}^{1/2}(\theta,\theta _{0})} \biggr\vert ^{p} =& \Biggl\vert \frac{1}{n}\sum^{n}_{i=1} \xi_{i}\frac {f_{i}(\theta)-f_{i}(\theta_{0})}{\Psi_{n}^{1/2}(\theta,\theta_{0})} \Biggr\vert ^{p} \\ \leq& \frac{1}{n^{p}}n^{p-1}\sum^{n}_{i=1}| \xi_{i}|^{p}\frac{|f_{i}(\theta )-f_{i}(\theta_{0})|^{p}}{\Psi_{n}^{p/2}(\theta,\theta_{0})} \\ \leq& \frac{C_{1}(p)}{n}\sum^{n}_{i=1}| \xi_{i}|^{p},\quad\forall\theta\neq\theta_{0}, \end{aligned}
(3.24)

which implies that

\begin{aligned} P \biggl(\sup_{|\theta-\theta_{0}|>\rho}\bigl|U_{n}(\theta)\bigr|\geq \frac{1}{2} \biggr) \leq& P \Biggl(\frac{C_{1}(p)}{n}\sum ^{n}_{i=1}|\xi_{i}|^{p} \geq \biggl(\frac{1}{2}c_{1}\rho \biggr)^{p} \Biggr) \\ \leq& \biggl(\frac{2}{c_{1}\rho} \biggr)^{p}\frac{C_{1}(p)}{n}\sum ^{n}_{i=1}E|\xi_{i}|^{p} \\ \leq& C_{2}(p)n^{-1}\rho^{-p} \Lambda_{p,n}. \end{aligned}
(3.25)

Similar to the proof of (3.17), we see by Markov’s inequality, (2.3) in Lemma 2.3, and (3.1) that

\begin{aligned}& P \biggl(\bigl|V_{n} \bigl(\theta(m) \bigr)\bigr|\geq\frac{1}{4}c_{1}^{2} \rho_{m}^{2}n^{1/2} \biggr) \\& \quad\leq \biggl( \frac{4c_{2}}{c_{1}^{2}\rho_{m}^{2}n^{1/2}} \biggr)^{p}\frac {1}{n^{p/2}}\bigl|\theta(m)- \theta_{0}\bigr|^{p} \bigl(C_{3}(p)+C_{4}(p)f(n) \bigr)\Lambda _{p,n} \\& \quad\leq C_{5}(p) \bigl(1+f(n) \bigr) \rho_{m}^{-p}n^{-p}\Lambda_{p,n}. \end{aligned}
(3.26)

On the other hand, for all $$\theta_{1}$$, $$\theta_{2}$$ and $$n\geq1$$, we have

$$E\bigl|V_{n}(\theta_{2})-V_{n}( \theta_{1})\bigr|^{p} \leq\frac{C_{6}(p)}{n^{p/2}} \bigl(1+f(n) \bigr) \Lambda_{p,n}|\theta_{1}-\theta_{2}|^{p} =: C(n,p)|\theta_{1}-\theta_{2}|^{p}.$$
(3.27)

In view of the proof of (3.19) and noting that $$f(n)\geq1$$, we have

\begin{aligned}& P \biggl(\sup_{\theta(m)\leq\theta_{1}, \theta_{2}\leq\theta(m+1)}\bigl|V_{n}(\theta_{2})-V_{n}( \theta_{1})\bigr|\geq \frac{1}{4}c_{1}^{2} \rho_{m}^{2}n^{1/2} \biggr) \\& \quad \leq \frac{8C_{6}(p)n^{-p/2}(1+f(n))\Lambda_{p,n}}{(p+1-\gamma)(p+2-\gamma )} \biggl(\frac{8\gamma}{\gamma-2} \biggr)^{p} \biggl(\frac{\rho }{\lfloor n^{1/2}\rfloor} \biggr)^{p} \biggl(\frac{4}{c_{1}^{2}\rho _{m}^{2}n^{1/2}} \biggr)^{p} \\& \quad\leq C_{7}(p)\rho^{p} f(n) \Lambda_{p,n}n^{-3p/2}\rho_{m}^{-2p}. \end{aligned}
(3.28)

Proceeding as in the proof of (3.20), we obtain

\begin{aligned}& P \biggl(\sup_{\rho n^{-1/2}\leq\theta-\theta_{0}\leq\rho}\bigl|U_{n}(\theta)\bigr|\geq \frac {1}{2} \biggr) \\& \quad\leq \sum^{\lfloor n^{\frac{1}{2}}\rfloor-1}_{m=0} \bigl\{ C_{5}(p) \bigl(1+f(n) \bigr)\rho _{m}^{-p}n^{-p} \Lambda_{p,n} +C_{7}(p)\rho^{p} \bigl(1+f(n) \bigr) \Lambda_{p,n}n^{-3p/2}\rho_{m}^{-2p} \bigr\} \\& \quad\leq C_{5}(p) \bigl(1+f(n) \bigr)n^{-p/2} \Lambda_{p,n}\rho^{-p} +C_{7}(p) \bigl(1+f(n) \bigr)n^{-p/2}\Lambda_{p,n}\rho^{-p} \\& \qquad{}+n^{-p/2}\rho^{-p}\Lambda_{p,n} \bigl(1+f(n) \bigr) \sum^{\lfloor n^{\frac{1}{2}}\rfloor-1}_{m=1} \biggl( \frac{C_{5}(p)}{m^{p}}+ \frac {C_{7}(p)}{m^{2p}} \biggr) \\& \quad\leq C_{8}(p)n^{-p/2} f(n) \Lambda_{p,n}\rho^{-p}, \end{aligned}
(3.29)

and, similarly,

$$P \biggl(\sup_{\rho n^{-1/2}\leq\theta_{0}-\theta\leq\rho }\bigl|U_{n}(\theta)\bigr| \geq\frac{1}{2} \biggr)\leq C(p)n^{-p/2} f(n) \Lambda_{p,n} \rho^{-p}.$$
(3.30)

Therefore, the desired result (3.23) follows from (3.9), (3.25), (3.29), and (3.30) immediately. The proof is completed. □

## References

1. Schmetterer, L: Introduction to Mathematical Statistics. Springer, Berlin (1974)

2. Ivanov, AV: An asymptotic expansion for the distribution of the least squares estimator of the nonlinear regression parameter. Theory Probab. Appl. 21(3), 557-570 (1976)

3. Hu, SH: The rate of convergence for the least squares estimator in nonlinear regression model with dependent errors. Sci. China Ser. A 45(2), 137-146 (2002)

4. Hu, SH: Consistency for the least squares estimator in nonlinear regression model. Stat. Probab. Lett. 67(2), 183-192 (2004)

5. Yang, WZ, Hu, SH: Large deviation for a least squares estimator in a nonlinear regression model. Stat. Probab. Lett. 91, 135-144 (2014)

6. Ibragimov, IA, Has’minskii, RZ: Statistical Estimation: Asymptotic Theory. Springer, New York (1981). Translated by Samuel Kotz

7. Ivanov, AV, Leonenko, NN: Statistical Analysis of Random Fields. Kluwer Academic Publishers, Dordrecht/Boston/London (1989)

8. Ivanov, AV: Asymptotic Theory of Nonlinear Regression. Kluwer Academic Publishers, Dordrecht/Boston/London (1997)

9. Wang, K, Wang, Y, Gao, Q: Uniform asymptotics for the finite-time ruin probability of a new dependent risk model with a constant interest rate. Methodol. Comput. Appl. Probab. 15(1), 109-124 (2013)

10. He, W, Cheng, DY, Wang, YB: Asymptotic lower bounds of precise large deviations with nonnegative and dependent random variables. Stat. Probab. Lett. 83, 331-338 (2013)

11. Chen, Y, Wang, L, Wang, YB: Uniform asymptotics for the finite-time ruin probabilities of two kinds of nonstandard bidimensional risk models. J. Math. Anal. Appl. 401(1), 114-129 (2013)

12. Shen, AT: Bernstein-type inequality for widely dependent sequence and its application to nonparametric regression models. Abstr. Appl. Anal. 2013, Article ID 862602 (2013)

13. Wang, XJ, Xu, C, Hu, TC, Volodin, A, Hu, SH: On complete convergence for widely orthant-dependent random variables and its applications in nonparametrics regression models. Test 23(3), 607-629 (2014)

## Acknowledgements

This work was supported by the National Natural Science Foundation of China (11501004, 11501005, 11526033), the Natural Science Foundation of Anhui Province (1508085J06), the Students Science Research Training Program of Anhui University (KYXL2014017), and the Research Teaching Model Curriculum of Anhui University (xjyjkc1407).

The authors are most grateful to the editor and anonymous referees for careful reading of the manuscript and valuable suggestions, which helped in improving an earlier version of this paper.

## Author information


### Corresponding author

Correspondence to Xuejun Wang.

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
