Research Article
Bounds for Tail Probabilities of the Sample Variance
Journal of Inequalities and Applications volume 2009, Article number: 941936 (2009)
Abstract
We provide bounds for tail probabilities of the sample variance. The bounds are expressed in terms of Hoeffding functions and are the sharpest known. They are designed with applications in mind, in auditing as well as in the processing of environmental data.
1. Introduction and Results
Let be a random sample of independent identically distributed observations. Throughout we write
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ1_HTML.gif)
for the mean, variance, and the fourth central moment of , and assume that
. Some of our results hold only for bounded random variables. In such cases without loss of generality we assume that
. Note that
is a natural condition in audit applications.
The sample variance of the sample
is defined as
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ2_HTML.gif)
where is the sample mean,
. We can rewrite (1.2) as
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ3_HTML.gif)
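The definitions (1.2) and (1.3) are rendered as images above. Assuming the standard definitions (sample variance with divisor n-1 and its pairwise U-statistic representation), a minimal numeric check that the two forms agree; the data values are illustrative, not from the paper:

```python
# Check numerically that the sample variance equals its pairwise
# (U-statistic) representation: s^2 = (1/(n(n-1))) * sum_{i<j} (x_i - x_j)^2.
from itertools import combinations

def sample_variance(xs):
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def pairwise_form(xs):
    n = len(xs)
    return sum((a - b) ** 2 for a, b in combinations(xs, 2)) / (n * (n - 1))

xs = [0.1, 0.4, 0.35, 0.8, 0.62]   # arbitrary illustrative data
assert abs(sample_variance(xs) - pairwise_form(xs)) < 1e-12
```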
We are interested in deviations of the statistic from its mean
, that is, in bounds for the tail probabilities of the statistic
,
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ4_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ5_HTML.gif)
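The statistic in (1.4)-(1.5) measures the deviation of the sample variance from its mean, which equals the population variance since the sample variance is unbiased. A brute-force check of this unbiasedness by exact enumeration; the two-point distribution and the sample size are illustrative choices, not taken from the paper:

```python
# Exact check (by enumerating all samples) that E[s^2] = sigma^2,
# i.e. the deviation statistic in (1.4)-(1.5) has mean zero.
from itertools import product

values = {0.0: 0.3, 1.0: 0.7}          # toy distribution: P(X = v)
mu = sum(v * p for v, p in values.items())
sigma2 = sum((v - mu) ** 2 * p for v, p in values.items())

n = 3
E_s2 = 0.0
for sample in product(values, repeat=n):
    prob = 1.0
    for v in sample:
        prob *= values[v]
    mean = sum(sample) / n
    s2 = sum((v - mean) ** 2 for v in sample) / (n - 1)
    E_s2 += prob * s2           # accumulate E[s^2] exactly

assert abs(E_s2 - sigma2) < 1e-12
```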
The paper is organized as follows. In the introduction we give a description of bounds, some comments, and references. In Section 2 we obtain sharp upper bounds for the fourth moment. In Section 3 we give proofs of all facts and results from the introduction.
If , then the range of interest in (1.5) is
, where
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ6_HTML.gif)
The restriction on the range of
in (1.4) (resp.,
in (1.5) in cases where the condition
is fulfilled) is natural. Indeed,
for
, due to the obvious inequality
. Furthermore, in the case of
we have
for
since
(see Proposition 2.3 for a proof of the latter inequality).
The asymptotic (as ) properties of
(see Section 3 for proofs of (1.7) and (1.8)) can be used to test the quality of bounds for tail probabilities. Under the condition
the statistic
is asymptotically normal provided that
is not a Bernoulli random variable symmetric around its mean. Namely, if
, then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ7_HTML.gif)
If (which happens if and only if
is a Bernoulli random variable symmetric around its mean), then asymptotically
has
type distribution, that is,
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ8_HTML.gif)
where is a standard normal random variable, and
is the standard normal distribution function.
Let us recall the known bounds for the tail probabilities of the sample variance (see (1.19)–(1.21)). We need notation related to certain functions going back to Hoeffding [1]. Let and
. Write
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ9_HTML.gif)
For we define
. For
we set
. Note that our notation for the function
is slightly different from the traditional one. Let
. Introduce as well the function
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ10_HTML.gif)
and for
. One can check that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ11_HTML.gif)
All our bounds are expressed in terms of the function . Using (1.11), it is easy to replace them by bounds expressed in terms of the function
, and we omit the related formulations.
Let and
. Assume that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ12_HTML.gif)
Let be a Bernoulli random variable such that
and
. Then
and
. The function
is related to the generating function (the Laplace transform) of binomial distributions since
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ13_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ14_HTML.gif)
where are independent copies of
. Note that (1.14) is an obvious corollary of (1.13). We omit the elementary calculations leading to (1.13). In a similar way
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ15_HTML.gif)
where is a Poisson random variable with parameter
.
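The relations (1.13)–(1.15) tie the Hoeffding functions to Laplace transforms of binomial and Poisson distributions. Behind them is the classical Chernoff-type device: for a Binomial(n, p) variable B, P(B >= x) <= inf over t > 0 of exp(-tx) E exp(tB). A numeric sketch of this device (the grid search and the parameters are illustrative, not a quote of the paper's formulas):

```python
# Chernoff-type bound for a binomial tail, compared against the exact tail.
import math

def binom_tail(n, p, k):
    # exact P(B >= k) for B ~ Binomial(n, p)
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def chernoff_bound(n, p, x):
    # inf_t e^{-tx} (1 - p + p e^t)^n, approximated by a grid search over t
    best = 1.0                              # t -> 0 gives the trivial bound 1
    for i in range(1, 2001):
        t = i * 0.01
        best = min(best, math.exp(-t * x) * (1 - p + p * math.exp(t)) ** n)
    return best

n, p, k = 30, 0.2, 12                       # illustrative parameters
exact = binom_tail(n, p, k)
bound = chernoff_bound(n, p, k)
assert exact <= bound <= 1.0                # the bound is valid and nontrivial
```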
The functions and
satisfy a version of the Central Limit Theorem. Namely, for given
and
we have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ16_HTML.gif)
(we omit the elementary calculations leading to (1.16)). Furthermore, we have [1]
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ17_HTML.gif)
and we also have [2]
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ18_HTML.gif)
Using the notation introduced above, we can recall the known results (see [2, Lemma ]). Let
be the integer part of
. Assume that
. If
is known, then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ19_HTML.gif)
The right-hand side of (1.19) is an increasing function of (see Section 3 for a short proof of (1.19) as a corollary of Theorem 1.1). If
is unknown but
is known, then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ20_HTML.gif)
Using the obvious estimate , the bound (1.20) follows from (1.19). In cases where neither
nor
is known, we have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ21_HTML.gif)
as it follows from (1.19) using the obvious bound .
Let us note that the known bounds (1.19)–(1.21) are the best possible within an approach based on analysis of the variance, on the use of exponential functions, and on an inequality of Hoeffding (see (3.3)) which allows one to reduce the problem to the estimation of tail probabilities for sums of independent random variables. Our improvement comes from a careful analysis of the fourth moment, which turns out to be rather involved; see Section 2. Briefly, the results of this paper are the following: we prove a general bound involving ,
, and the fourth moment
; this general bound implies all other bounds, in particular a new precise bound involving
and
; we provide as well bounds for lower tails
; we compare the bounds analytically, mostly when
is sufficiently large.
From the mathematical point of view the sample variance is one of the simplest nonlinear statistics. Known bounds for tail probabilities are designed with linear statistics in mind, possibly also for dependent observations. See the seminal paper of Hoeffding [1] published in JASA. For further developments see Talagrand [3], Pinelis [4, 5], Bentkus [6, 7], Bentkus et al. [8, 9], and so forth. Our intention is to develop tools useful in the setting of nonlinear statistics, using the sample variance as a test statistic.
Theorem 1.1 extends and improves the known bounds (1.19)–(1.21). We can derive (1.19)–(1.21) from this theorem since we can estimate the fourth moment via various combinations of
and
using the boundedness assumption
.
Theorem 1.1.
Let and
.
If and
, then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ22_HTML.gif)
with
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ23_HTML.gif)
If and
, then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ24_HTML.gif)
with
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ25_HTML.gif)
Both bounds and
are increasing functions of
,
and
.
Remark 1.2.
In order to derive upper confidence bounds we need only estimates of the upper tail (see [2]). To estimate the upper tail the condition
is sufficient. The lower tail
has a different type of behavior since to estimate it we indeed need the assumption that
is a bounded random variable.
For Theorem 1.1 implies the known bounds (1.19)–(1.21) for the upper tail of
. It also implies the bounds (1.26)–(1.29) for the lower tail. The lower tail has a somewhat more complicated structure (cf. (1.26)–(1.29) and their counterparts (1.19)–(1.21) for the upper tail).
If is known, then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ26_HTML.gif)
One can show (we omit the details) that the bound is not an increasing function of
. A slightly rougher inequality
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ27_HTML.gif)
has the monotonicity property since is an increasing function of
. If
is known, then using the obvious inequality
, the bound (1.27) yields
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ28_HTML.gif)
If we have no information about and
, then using
, the bound (1.27) implies
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ29_HTML.gif)
The bounds above do not cover the situation where both and
are known. To formulate a related result we need additional notation. In the case of
we use the notation
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ30_HTML.gif)
In view of the well-known upper bound for the variance of
, we can partition the set
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ31_HTML.gif)
of possible values of and
into a union
of three subsets
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ32_HTML.gif)
and ; see Figure 1.
Theorem 1.3.
Write . Assume that
.
The upper tail of the statistic satisfies
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ33_HTML.gif)
with , where
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ34_HTML.gif)
and where one can write
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ35_HTML.gif)
The lower tail of satisfies
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ36_HTML.gif)
with , where
, and
is defined by (1.34).
The bounds above are obtained using the classical transform ,
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ37_HTML.gif)
of survival functions (cf. definitions (1.13) and (1.14) of the related Hoeffding functions). The bounds expressed in terms of Hoeffding functions have a simple analytical structure and are easy to compute numerically.
All our upper and lower bounds satisfy a version of the Central Limit Theorem. Namely, if we consider an upper bound, say (resp., a lower bound
) as a function of
, then there exist limits
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ38_HTML.gif)
with some positive and
. The values of
and
can be used to compare the bounds—the larger these constants, the better the bound. To prove (1.38) it suffices to note that with
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ39_HTML.gif)
The Central Limit Theorem in the form of (1.7) restricts the ranges of possible values of and
. Namely, using (1.7) it is easy to see that
and
have to satisfy
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ40_HTML.gif)
We provide the values of these constants for all our bounds and give their numerical values in the following two cases.
(i) is a random variable uniformly distributed in the interval
. The moments of this random variable satisfy
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ41_HTML.gif)
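The interval in case (i) and the moment values (1.41) are rendered as images above. Assuming, for illustration only, that the interval is the unit interval [0,1], the central moments are mu = 1/2, sigma^2 = 1/12, and the fourth central moment E(X - 1/2)^4 = 1/80; a quick numeric check:

```python
# Central moments of X ~ Uniform[0,1] via the midpoint rule
# (the unit interval is an assumption made for this sketch).
def central_moment(k, steps=100_000):
    # midpoint-rule approximation of \int_0^1 (x - 1/2)^k dx
    h = 1.0 / steps
    return sum(((i + 0.5) * h - 0.5) ** k for i in range(steps)) * h

assert abs(central_moment(2) - 1 / 12) < 1e-8   # variance
assert abs(central_moment(4) - 1 / 80) < 1e-8   # fourth central moment
```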
For defined by (1.41), we give the constants
and
as
.
(ii) is uniformly distributed in
, and in this case
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ42_HTML.gif)
We give the constants and
with
defined by (1.42) as
.
We have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ43_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ44_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ45_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ46_HTML.gif)
When calculating the constants in (1.44) and (1.46) we choose . The quantity
in (1.43) and (1.45) is defined by (1.34).
Conclusions
Our new bounds provide a substantial improvement of the known bounds. However, from the asymptotic point of view these bounds still seem rather crude. To improve the bounds further, new methods and approaches are needed. Some preliminary computer simulations show that in applications where is finite and the random variables have small means and variances (as in auditing, where a typical value of
is
), the asymptotic behavior has little bearing on the behavior for small
. Therefore bounds specially designed to cover the case of finite
have to be developed.
2. Sharp Upper Bounds for the Fourth Moment
Recall that we consider bounded random variables such that , and that we write
and
. In Lemma 2.1 we provide an optimal upper bound for the fourth moment of
given a shift
, a mean
, and a variance
. The maximizers of the fourth moment are either Bernoulli or trinomial random variables. It turns out that their distributions, say
, are of the following three types (i)–(iii):
(i)a two point distribution such that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ47_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ48_HTML.gif)
(ii)a family of three point distributions depending on such that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ49_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ50_HTML.gif)
where we write
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ51_HTML.gif)
notice that (2.4) supplies a three-point probability distribution only in cases where the inequalities and
hold;
(iii)a two point distribution such that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ52_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ53_HTML.gif)
Note that the point in (2.2)–(2.7) satisfies
and that the probability distribution
has mean
and variance
.
Introduce the set
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ54_HTML.gif)
Using the well-known bound valid for
, it is easy to see that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ55_HTML.gif)
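A standard fact used in this kind of argument (not a quote of the image above): for a [0,1]-valued random variable, sigma^2 <= mu(1 - mu), since E X^2 <= E X. A seeded brute-force check over random discrete distributions supported in [0,1]:

```python
# Verify sigma^2 <= mu(1 - mu) for [0,1]-valued random variables,
# over seeded random discrete distributions (a standard fact).
import random

rng = random.Random(0)
for _ in range(100):
    support = [rng.random() for _ in range(5)]   # points in [0,1]
    w = [rng.random() for _ in range(5)]
    total = sum(w)
    probs = [x / total for x in w]               # normalized probabilities
    mu = sum(v * p for v, p in zip(support, probs))
    var = sum((v - mu) ** 2 * p for v, p in zip(support, probs))
    assert var <= mu * (1 - mu) + 1e-12
```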
Let . We represent the set
as a union
of three subsets setting
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ56_HTML.gif)
and , where
and
are given in (2.5). Let us mention the following properties of the regions.
(a)If , then
since for such
obviously
for all
. The set
is a one-point set. The set
is empty.
(b)If , then
since for such
clearly
for all
. The set
is a one-point set. The set
is empty.
For all three regions
,
,
are nonempty sets. The sets
and
have only one common point
, that is,
.
Lemma 2.1.
Let . Assume that a random variable
satisfies
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ57_HTML.gif)
Then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ58_HTML.gif)
with a random variable satisfying (2.11) and defined as follows:
(i)if , then
is a Bernoulli random variable with distribution (2.2);
(ii)if , then
is a trinomial random variable with distribution (2.4);
(iii)if , then
is a Bernoulli random variable with distribution (2.7).
Proof.
Writing , we have to prove that if
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ59_HTML.gif)
then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ60_HTML.gif)
with . Henceforth we write
, so that
can assume only the values
,
,
with probabilities
,
,
defined in (2.2)–(2.7), respectively. The distribution
is related to the distribution
as
for all
.
Formally in our proof we do not need the description (2.17) of measures satisfying (2.15). However, the description helps to understand the idea of the proof. Let
and
. Assume that a signed measure
of subsets of
is such that the total variation measure
is a discrete measure concentrated in a three-point set
and
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ61_HTML.gif)
Then is a uniquely defined measure such that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ62_HTML.gif)
satisfy
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ63_HTML.gif)
We omit the elementary calculations leading to (2.17); they amount to solving systems of linear equations.
Let . Consider the polynomial
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ64_HTML.gif)
It is easy to check that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ65_HTML.gif)
The proofs of (i)–(iii) differ only in technical details. In all cases we find ,
, and
(depending on
,
and
) such that the polynomial
defined by (2.18) satisfies
for
, and such that the coefficient
in (2.18) vanishes,
. Using
, the inequality
is equivalent to
, which obviously leads to
. We note that the random variable
assumes the values from the set
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ66_HTML.gif)
Therefore we have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ67_HTML.gif)
which proves the lemma.
(i)Now . We choose
and
. In order to ensure
(cf. (2.19)) we have to take
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ68_HTML.gif)
If , then
for all
. The inequality
is equivalent to
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ69_HTML.gif)
To complete the proof we note that the random variable with
defined by (2.2) assumes its values in the set
. To find the distribution of
we use (2.17). Setting
in (2.17) we obtain
and
,
as in (2.2).
(ii)Now or, equivalently
and
. Moreover, we can assume that
since only for such
the region
is nonempty. We choose
and
. Then
for all
. In order to ensure
(cf. (2.19)) we have to take
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ70_HTML.gif)
By our construction . To find a distribution of
supported by the set
we use (2.17). It follows that
has the distribution defined in (2.4).
(iii)We choose and
. In order to ensure
(cf. (2.19)) we have to take
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ71_HTML.gif)
If , then
for all
. The inequality
is equivalent to
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ72_HTML.gif)
To conclude the proof we notice that the random variable with
given by (2.7) assumes values from the set
.
To prove Theorems 1.1 and 1.3 we apply Lemma 2.1 with . We provide the bounds of interest as Corollary 2.2. To prove the corollary it suffices to plug
in Lemma 2.1 and, using (2.2)–(2.7), to calculate
explicitly. We omit the related elementary, albeit cumbersome, calculations. The regions
,
, and
are defined in (1.32).
Corollary 2.2.
Let a random variable have mean
and variance
. Then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ73_HTML.gif)
Proposition 2.3.
Let . Then, with probability
, the sample variance satisfies
with
given by (1.6).
Proof.
Using the representation (1.3) of the sample variance as an -statistic, it suffices to show that the function
,
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ74_HTML.gif)
in the domain
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ75_HTML.gif)
satisfies . The function
is convex. To see this, it suffices to check that
restricted to straight lines is convex. Any straight line can be represented as
with some
. The convexity of
on
is equivalent to the convexity of the function
of the real variable
. It is clear that the second derivative
is nonnegative since
. Thus both
and
are convex.
Since both and
are convex, the function
attains its maximal value on the boundary of
. Moreover, the maximal value of
is attained on the set of extremal points of
In our case the set of extremal points is just the set of vertices of the cube
. In other words, the maximal value of
is attained when each of
is either
or
. Since
is a symmetric function, we can assume that the maximal value of
is attained when
and
with some
. Using (2.28), the corresponding value of
is
. Maximizing with respect to
we get
, if
is even, and
, if
is odd, which we can rewrite as the desired inequality
.
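The maximization step in the proof above can be checked by brute force. Over the vertices of the cube, the sample variance with k coordinates equal to 1 equals k(n-k)/(n(n-1)), which is maximized at k = floor(n/2); this agrees with the even/odd case distinction in the proof ((1.6) itself is rendered as an image above, so the closed form below is derived, not quoted). Assuming the standard definition of the sample variance:

```python
# Brute-force verification that over {0,1}^n the sample variance
# is maximized at k = floor(n/2) ones, with value k(n-k)/(n(n-1)).
from itertools import product

def sample_variance(xs):
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def max_s2_vertices(n):
    # maximum of s^2 over all vertices of the cube [0,1]^n
    return max(sample_variance(v) for v in product((0.0, 1.0), repeat=n))

for n in (2, 3, 4, 5, 6, 7):
    k = n // 2
    assert abs(max_s2_vertices(n) - k * (n - k) / (n * (n - 1))) < 1e-12
```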
3. Proofs
We use the following observation, which in the case of an exponential function goes back to Hoeffding [1, Section ]. Assume that we can represent a random variable, say
, as a weighted mixture of other random variables, say
, so that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ76_HTML.gif)
where are nonrandom numbers. Let
be a convex function. Then, using Jensen's inequality
, we obtain
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ77_HTML.gif)
Moreover, if random variables are identically distributed, then
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ78_HTML.gif)
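The mixture observation (3.1)-(3.2) can be illustrated on a finite probability space: if Z is a pointwise convex combination of random variables Y_i with nonnegative weights summing to 1, and f is convex, then E f(Z) <= sum_i p_i E f(Y_i) by Jensen's inequality applied pointwise. All numbers below are illustrative:

```python
# Finite-sample-space illustration of the Jensen/mixture inequality.
import math

omega_probs = [0.2, 0.5, 0.3]            # underlying probability space
Y = [[1.0, -2.0, 0.5],                   # Y_1 as a function on omega
     [0.0, 1.5, -1.0]]                   # Y_2
p = [0.6, 0.4]                           # mixture weights, sum to 1

# Z = p_1 Y_1 + p_2 Y_2, pointwise on omega
Z = [sum(pi * Yi[w] for pi, Yi in zip(p, Y)) for w in range(3)]

def expect(f, X):
    return sum(q * f(x) for q, x in zip(omega_probs, X))

f = math.exp                             # a convex function
lhs = expect(f, Z)
rhs = sum(pi * expect(f, Yi) for pi, Yi in zip(p, Y))
assert lhs <= rhs + 1e-12                # E f(Z) <= sum_i p_i E f(Y_i)
```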
One can specialize (3.3) to -statistics of the second order. Let
be a symmetric function of its arguments. For an i.i.d. sample
consider the
-statistic
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ79_HTML.gif)
Write
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ80_HTML.gif)
Then (3.3) yields
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ81_HTML.gif)
for any convex function . To see that (3.6) holds, let
be a permutation of
. Define
as (3.5) replacing the sample
by its permutation
. Then (see [1, Section
])
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ82_HTML.gif)
which means that admits a representation of type (3.1) with
and all
identically distributed, due to our symmetry and i.i.d. assumptions. Thus, (3.3) implies (3.6).
Using (1.3) we can write
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ83_HTML.gif)
with . By an application of (3.6) we derive
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ84_HTML.gif)
for any convex function , where
is a sum of i.i.d. random variables such that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ85_HTML.gif)
Consider the following three families of functions depending on parameters :
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ86_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ87_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ88_HTML.gif)
Any of the functions given by (3.11) dominates the indicator function
of the interval
. Therefore
. Combining this inequality with (3.9), we get
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ89_HTML.gif)
with being a sum of
i.i.d. random variables specified in (3.10). Depending on the choice of the family of functions
given by (3.11), the
in (3.14) is taken over
or
, respectively.
Proposition 3.1.
One has
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ90_HTML.gif)
If , then
.
Proof.
Let us prove (3.15). Using the i.i.d. assumption, we have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ91_HTML.gif)
Let us prove that . If
, then
. Using (3.15) we have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ92_HTML.gif)
which yields the desired bound for .
Proposition 3.2.
Let be a bounded random variable such that
with some nonrandom
. Then for any convex function
one has
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ93_HTML.gif)
where is a Bernoulli random variable such that
and
.
If for some
, and
,
, then (3.18) holds with
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ94_HTML.gif)
and a Bernoulli random variable such that
,
,
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ95_HTML.gif)
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_IEq409_HTML.gif)
.
Proof.
See [2, Lemmas and
].
Proof of Theorem 1.1.
The proof is based on a combination of Hoeffding's observation (3.6), applied to the representation (3.8) of as a
-statistic, with Chebyshev's inequality involving exponential functions, and with Proposition 3.2. Let us give more details. We have to prove (1.22) and (1.24).
Let us prove (1.22). We apply (3.14) with the family (3.13) of exponential functions . We get
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ96_HTML.gif)
By (3.10), the sum is a sum of
copies of a random variable, say
, such that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ97_HTML.gif)
We note that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ98_HTML.gif)
Indeed, the first two relations in (3.23) are obvious; the third one is implied by ,
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ99_HTML.gif)
and ; see Proposition 3.1.
Let stand for the class of random variables
satisfying (3.23). Taking into account (3.21), to prove (1.22) it suffices to check that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ100_HTML.gif)
where is a sum of
independent copies
of
. It is clear that the left-hand side of (3.25) is an increasing function of
. To prove (3.25), we apply Proposition 3.2. Conditioning
times on all random variables except one, we can replace all random variables
by Bernoulli ones. To find the distribution of the Bernoulli random variables we use (3.23). We get
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ101_HTML.gif)
where the sum on the right-hand side consists of independent copies of a Bernoulli random variable whose distribution is given by (1.23). Note that in (3.26) we have equality, by the choice of this Bernoulli distribution.
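For the reader's convenience, the standard computation behind such mean zero Bernoulli (two-point) random variables is the following; the values a, b are generic placeholders, not the paper's notation:

```latex
% A mean zero random variable X taking exactly two values a > 0 and
% -b < 0 must satisfy a P(X = a) - b P(X = -b) = 0, hence
\[
  \mathbb{P}\{X = a\} = \frac{b}{a+b}, \qquad
  \mathbb{P}\{X = -b\} = \frac{a}{a+b},
\]
% and its variance is
\[
  \mathbb{E}X^{2}
  \;=\; a^{2}\,\frac{b}{a+b} + b^{2}\,\frac{a}{a+b}
  \;=\; ab .
\]
```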
Using (3.26) we have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ102_HTML.gif)
To see that the third equality in (3.27) holds, it suffices to make a change of variable. The fourth equality holds by the definition (1.13) of the Hoeffding function, since the random variable involved is a mean zero Bernoulli one. The relation (3.27) proves (3.25) and hence (1.22).
A proof of (1.24) repeats the proof of (1.22) with the obvious replacements. The inequality in (3.23) has to be replaced by its counterpart, which holds due to our assumptions, and the corresponding probability is now given by (1.25).
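As an illustrative numerical sanity check (ours, not part of the paper), the following Python sketch compares the optimized exponential Chebyshev (Chernoff) bound with the exact tail of a sum of independent Bernoulli random variables; all function names are hypothetical.

```python
from math import comb, exp, log

def binom_tail(n, p, k):
    """Exact P(S >= k) for S ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def chernoff_bound(n, p, k):
    """Optimized exponential Chebyshev bound exp(-n * KL(a || p)), a = k/n."""
    a = k / n
    if a <= p:  # below the mean the optimization over h > 0 is vacuous
        return 1.0
    kl = a * log(a / p) + ((1 - a) * log((1 - a) / (1 - p)) if a < 1 else 0.0)
    return exp(-n * kl)

# The bound always dominates the exact tail probability:
print(binom_tail(50, 0.3, 25), chernoff_bound(50, 0.3, 25))
```

The bound is never smaller than the exact tail, and is nontrivial (below one) as soon as the threshold exceeds the mean.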
Proof.
The bound is an obvious corollary of Theorem 1.1: by Proposition 3.1 the required moment inequality holds, and therefore we can make the corresponding choice of parameter. Substituting this value into (1.22), we obtain (1.19).
Proof.
To prove (1.26), we make the same choice of parameter in (1.24); this choice is justified in the proof of (1.19).
To prove (1.27) we use (1.26). We have to prove that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ103_HTML.gif)
and that the right-hand side of (3.28) is an increasing function of its argument. By the definition of the Hoeffding function we have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ104_HTML.gif)
where the random variable involved is a Bernoulli one with prescribed mean and second moment. It is easy to check that its second atom and the corresponding probability are determined by these two moments. Therefore we can write
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ105_HTML.gif)
where the infimum is taken over the class of random variables satisfying the indicated moment conditions. Combining (3.29) and (3.30) we obtain
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ106_HTML.gif)
The representation (3.31) shows that its right-hand side is an increasing function of its argument. To conclude the proof of (1.27) we have to check that the right-hand sides of (3.28) and (3.31) are equal. Using (3.18) of Proposition 3.2, we can pass to a mean zero Bernoulli random variable assuming one positive and one negative value with positive probabilities. Consequently, we have
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ107_HTML.gif)
Using the definition of the Hoeffding function we see that the right-hand sides of (3.28) and (3.31) are equal.
Proof of Theorem 1.3.
We use Theorem 1.1. In the bounds of this theorem we substitute for the fourth moment the right-hand side of (2.27), where a bound of the required type is given. We omit the related elementary analytical manipulations.
Proof.
To describe the limiting behavior of the sample variance we use Hoeffding's decomposition. We can write
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ108_HTML.gif)
with kernels such that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ109_HTML.gif)
To derive (3.33), use the representation (3.8) of the sample variance as a U-statistic. The kernel functions are degenerate, that is, their conditional expectations given any single argument vanish. Therefore
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ110_HTML.gif)
with
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ111_HTML.gif)
It follows that in the nondegenerate case the statistic is asymptotically normal:
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ112_HTML.gif)
where the limit is a standard normal random variable. It is easy to see that the degenerate case occurs if and only if the observations are Bernoulli random variables symmetric around their mean. In this special case (3.33) turns into
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ113_HTML.gif)
where the random variables involved are i.i.d. Rademacher ones. It follows that
![](http://media.springernature.com/full/springer-static/image/art%3A10.1155%2F2009%2F941936/MediaObjects/13660_2009_Article_2039_Equ114_HTML.gif)
which completes the proof of (1.7) and (1.8).
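The nondegenerate/degenerate dichotomy can be illustrated numerically. The following Python sketch (ours, not the paper's) verifies, by exact enumeration over Bernoulli samples, the classical finite-sample formula Var(s^2) = mu4/n - sigma^4 (n - 3)/(n(n - 1)); the function names are hypothetical.

```python
from itertools import product
from statistics import variance  # sample variance with denominator n - 1

def exact_var_of_s2(n, p):
    """Exact Var(s^2) for i.i.d. X_i ~ Bernoulli(p), enumerating all 2^n samples."""
    e1 = e2 = 0.0
    for xs in product((0, 1), repeat=n):
        k = sum(xs)
        prob = p**k * (1 - p)**(n - k)
        s2 = variance(xs)
        e1 += prob * s2
        e2 += prob * s2 * s2
    return e2 - e1 * e1

def formula_var_of_s2(n, p):
    """Classical formula Var(s^2) = mu4/n - sigma^4 (n - 3) / (n (n - 1))."""
    q = 1 - p
    sigma2 = p * q
    mu4 = p * q * (q**3 + p**3)  # E (X - p)^4 for X ~ Bernoulli(p)
    return mu4 / n - sigma2**2 * (n - 3) / (n * (n - 1))

print(exact_var_of_s2(5, 0.3), formula_var_of_s2(5, 0.3))
```

For the symmetric case p = 1/2 one has mu4 = sigma^4, so n * Var(s^2) tends to zero, consistent with the degenerate Rademacher limit above.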
References
Hoeffding W: Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 1963, 58: 13–30. 10.2307/2282952
Bentkus V, van Zuijlen M: On conservative confidence intervals. Lithuanian Mathematical Journal 2003,43(2):141–160. 10.1023/A:1024210921597
Talagrand M: The missing factor in Hoeffding's inequalities. Annales de l'Institut Henri Poincaré B 1995,31(4):689–702.
Pinelis I: Optimal tail comparison based on comparison of moments. In High Dimensional Probability (Oberwolfach, 1996), Progress in Probability. Volume 43. Birkhäuser, Basel, Switzerland; 1998:297–314.
Pinelis I: Fractional sums and integrals of r-concave tails and applications to comparison probability inequalities. In Advances in Stochastic Inequalities (Atlanta, Ga, 1997), Contemporary Mathematics. Volume 234. American Mathematical Society, Providence, RI, USA; 1999:149–168.
Bentkus V: A remark on the inequalities of Bernstein, Prokhorov, Bennett, Hoeffding, and Talagrand. Lithuanian Mathematical Journal 2002,42(3):262–269. 10.1023/A:1020221925664
Bentkus V: On Hoeffding's inequalities. The Annals of Probability 2004,32(2):1650–1673. 10.1214/009117904000000360
Bentkus V, Geuze GDC, van Zuijlen M: Trinomial laws dominating conditionally symmetric martingales. Department of Mathematics, Radboud University Nijmegen; 2005.
Bentkus V, Kalosha N, van Zuijlen M: On domination of tail probabilities of (super)martingales: explicit bounds. Lithuanian Mathematical Journal 2006,46(1):3–54.
Acknowledgment
Figure 1 was produced by N. Kalosha. The authors thank him for the help. The research was supported by the Lithuanian State Science and Studies Foundation, Grant no. T-15/07.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cite this article
Bentkus, V., Van Zuijlen, M. Bounds for Tail Probabilities of the Sample Variance. J Inequal Appl 2009, 941936 (2009). https://doi.org/10.1155/2009/941936