# Approximate moments of extremes

## Abstract

Let $$M_{n, i}$$ be the ith largest of a random sample of size n from a cumulative distribution function F on $$\mathbb{R} = (-\infty, \infty)$$. Fix $$r \geq1$$ and let $$\mathbf{M}_{n} = ( M_{n, 1}, \ldots, M_{n, r} )^{\prime}$$. If there exist $$b_{n}$$ and $$c_{n} > 0$$ such that as $$n \rightarrow \infty$$, $$( M_{n, 1} - b_{n} ) / c_{n} \stackrel{\mathcal {L}}{\rightarrow} Y_{1} \sim G$$ say, a non-degenerate distribution, then as $$n \rightarrow\infty$$, $$\mathbf{Y}_{n} = ( \mathbf{M}_{n} - b_{n} {\mathbf{1}}_{r} ) / c_{n} \stackrel{\mathcal{L}}{\rightarrow} \mathbf{Y}$$, where for $$Z_{i} = -\log G ( Y_{i} )$$, $$\mathbf{Z} = ( Z_{1}, \ldots, Z_{r} )^{\prime}$$ has joint probability density function $$\exp ( -z_{r} )$$ on $$0 < z_{1} < \cdots< z_{r} < \infty$$ and $${\mathbf{1}}_{r}$$ is the r-vector of ones. The moments of Y are given for the three possible forms of G. First approximations for the moments of $$\mathbf{M}_{n}$$ are obtained when these exist.

## Introduction

For $$1 \leq i \leq n$$ let $$M_{n, i}$$ be the ith largest of a random sample of size n from a cumulative distribution function F on $$\mathbb{R}$$. Fix $$r \geq1$$ and let $$\mathbf{M}_{n} = ( M_{n, 1}, \ldots, M_{n, r} )^{\prime}$$ and $$\mathbf{Y}_{n} = ( Y_{n, 1}, \ldots, Y_{n, r} )^{\prime}$$, where

\begin{aligned} Y_{n, i} = ( M_{n, i} - b_{n, i} ) / c_{n, i}, \end{aligned}
(1)

and $$b_{n, i}, c_{n, i} > 0$$ are constants. There are three possible non-degenerate limits in distribution for $$\mathbf{Y}_{n}$$, say $$\mathbf{Y} = \mathbf{Y}_{I} = ( Y_{I, 1}, \ldots, Y_{I, r} )^{\prime}$$ for $$I = 1,2,3$$. We take $$b_{n, i} \equiv b_{n, 1} = b_{n}$$ say and $$c_{n, i}\equiv c_{n, 1} = c_{n}$$ say. The notation $$\mathbf{Y}_{I}$$ conflicts with $$\mathbf{Y}_{n}$$ but is simple and clear as dependence on n is always indicated by a subscript n.

The need for approximations for the moments of $$\mathbf{M}_{n}$$ based on the moments of Y arises in many applied areas. For example, in solutions of stochastic traveling salesman problems (Leipala [1]); modeling fire protection and insurance problems (Ramachandran [2]); modeling extreme wind gusts (Revfeim and Hessell [3]); determining failures of jack-ups under environmental loading (van de Graaf et al. [4]); and determining the asymptotic cost of algorithms and combinatorial structures such as tries, digital search trees, leader election, adaptive sampling, counting algorithms, trees related to the register function, compositions of integers, some structures represented by Markov chains (column-convex polyominoes, Carlitz compositions), and runs and numbers of distinct values of some multiplicity in sequences of geometrically distributed random variables (Louchard and Prodinger [5]).

The aim of this note is to provide approximations for the moments of $$\mathbf{M}_{n}$$. Most of the results presented are new to the best of our knowledge. The results are organized as follows. In Section 2, we briefly review the possible forms for the non-degenerate limit $$\mathbf{Y}_{I} = ( Y_{I, 1}, \ldots, Y_{I, r} )^{\prime}$$ for $$I = 1,2,3$$. For $$r=1$$ these forms are well known. For example, $$P ( Y_{1, 1} \leq x ) = \exp \{ -\exp (-x) \}$$ on $$\mathbb{R}$$. In Section 3, we give the moments $$m_{\mathbf{j}} (\mathbf{Y} ) = \mathbb{E} [\mathbf{Y}^{\mathbf{j}} ]$$ and $$\mu_{\mathbf{j}} (\mathbf{Y}) = \mathbb{E} \{ [ \mathbf{Y} - \mathbb {E} (\mathbf{Y}) ]^{\mathbf{j}} \}$$ for each of these forms. Here, $$\mathbf{y}^{\mathbf{j}} = \prod_{i=1}^{r} y_{i}^{j_{i}}$$. For example, letting $$\Gamma(\cdot)$$ denote the gamma function and $$\psi(z) = (d/dz) \log\Gamma(z)$$ the digamma function, we show that

\begin{aligned} \mathbb{E} \bigl[ Y_{I, r}^{j} \bigr] = \left \{ \textstyle\begin{array}{@{}l@{\quad}l} (-1)^{j} \Gamma^{(j)} (r) / (r-1) !, & I=1,\\ M_{r} (-j \alpha^{-1} ) \mbox{ for } r > j \alpha^{-1}, & I=2,\\ (-1)^{j} M_{r} (j \alpha^{-1} ), & I=3, \end{array}\displaystyle \right . \end{aligned}

where

\begin{aligned} M_{r} (t) = \Gamma(r+t) / (r - 1) ! \end{aligned}
(2)

for $$r + \Re (t) > 0$$, $$\Re(z)$$ denoting the real part of z, and $$\omega^{(j)} (\cdot)$$ denoting the jth derivative $$\omega^{(j)} (t) = d^{j} \omega(t) / d t^{j}$$. We also show that for $$1 \leq s \leq r$$, the covariance between $$Y_{I, s}$$ and $$Y_{I, r}$$ is

\begin{aligned} \operatorname{covar} ( Y_{I, s}, Y_{I, r} ) = \left \{ \textstyle\begin{array}{@{}l@{\quad}l} \psi^{(1)} (r), & I=1,\\ c_{s, r} (\alpha) \mbox{ for } r > 2 \alpha^{-1}, s > \alpha^{-1}, & I=2,\\ c_{s, r} (-\alpha), & I=3, \end{array}\displaystyle \right . \end{aligned}

where $$c_{s, r} (\alpha) = \Gamma (r-2\alpha^{-1} ) / \{ (s-1)! (s-\alpha^{-1} )_{r-s} \} - M_{r} (-\alpha^{-1} ) M_{s} (-\alpha^{-1} )$$, $$(x)_{j} = x (x+1) \cdots(x + j - 1)$$, and that the $$\mathbf{j.}$$th order cumulant of $$\mathbf{Y}_{1}$$ is

\begin{aligned} \kappa_{\mathbf{j}} ( \mathbf{Y}_{1} ) = (-1)^{\mathbf{j.}} \psi^{ ( \mathbf{j.} - 1 )} (r) \end{aligned}
(3)

for $$j_{r} \neq0$$, where $$\mathbf{j.} = \sum_{i=1}^{r} j_{i}$$. This is a remarkable result as (3) does not depend on j except via $$\mathbf{j.}$$. These results appear to be all new except for $$\mathbb{E} [ Y_{1, r} ]$$ and the moments of $$Y_{2, 1}$$, $$Y_{3, 1}$$.

In Section 4, we give $$b_{n, 1}$$, $$c_{n, 1}$$, and first approximations for $$\mathbf{M}_{n}$$ and its moments when these exist, for the most important classes of tail behavior for F. These classes cover all the examples we have come across. For $$r=1$$ these results extend those of McCord [6] and verify a conjecture of his concerning $$\mu_{2} ( M_{n, 1} )$$.

As well as the notation above, we use $$n_{1} = \log n$$, $$n_{2}=\log n_{1}$$, $$\langle x \rangle_{j} = x (x-1) \cdots(x - j + 1)$$, $$\operatorname{var} (Z) =$$ the variance of Z, $$J =$$ the Jacobian, $$\mathbb{C} =$$ the set of all complex numbers, $$\operatorname{corr} ( Z_{1}, Z_{2} ) =$$ the correlation between $$Z_{1}$$ and $$Z_{2}$$, $$\rho_{s, r} =$$ the correlation between the sth and rth components of a vector of variables, and $$\sup_{n} =$$ the supremum over all possible values of n. Also $$\alpha(x) \approx\beta(x)$$ means that $$\alpha(x)$$ and $$\beta (x)$$ are approximately equal.

## A brief review

Fisher and Tippett [7] showed that if $$Y_{n, 1}$$ of (1) has a non-degenerate limit in distribution, $$Y_{1}$$, then for some b and $$c > 0$$, the cumulative distribution function $$G (y)$$ of $$( Y_{1} - b ) /c$$ has only three possible forms, known as EV1, EV2, and EV3: $$Y_{1, 1} \sim G_{1}(y) = \exp \{ -\exp(-y) \}$$ on $$\mathbb{R}$$ or $$Y_{2, 1} \sim G_{2} (y) = G_{2} (y, \alpha) = \exp (-y^{-\alpha } )$$ on $$(0,\infty)$$, where $$\alpha> 0$$, or $$Y_{3, 1} \sim G_{3} (y) = G_{3} (y, \alpha) = \exp \{ -(-y)^{\alpha} \}$$ on $$(-\infty, 0)$$, where $$\alpha> 0$$. Note the representation

\begin{aligned} Y_{2, 1} = \exp ( Y_{1, 1} / \alpha ), \qquad Y_{3, 1} = -Y_{2, 1}^{-1} = -\exp ( -Y_{1, 1} / \alpha ). \end{aligned}
(4)

Since we may take $$c=1$$ and $$b=0$$ without loss of generality, this gives the representation (with $$b_{n} = b_{n, 1}$$, $$c_{n} = c_{n, 1}$$) $$M_{n, 1} = b_{n} + c_{n} (Y_{1} + o_{p} (1) )$$. So, for example, by Theorem 5.4 of Billingsley [8], $$\mathbb{E} [ M_{n, 1} ] = b_{n} + c_{n} \{ \mathbb{E} [Y_{1} ] + o (1) \}$$ when $$\mathbb{E} [ Y_{n, 1}^{2} ]$$ is bounded and $$\mu_{j} (M_{n, 1} ) = c_{n}^{j} \{ \mu_{j} (Y_{1} ) + o (1) \}$$ when $$\mathbb{E} [ \vert Y_{n, 1} \vert ^{j+\varepsilon} ]$$ is bounded for some $$\varepsilon> 0$$. Fisher and Tippett [7] also essentially showed that

\begin{aligned} \mathbb{E} \bigl[ \exp ( tY_{1, 1} ) \bigr] = \Gamma(1-t) \end{aligned}
(5)

for $$\Re (t) < 1$$ so that $$\mathbb{E} [ Y_{1, 1}^{j} ] = (-1)^{j} \Gamma^{(j)} (1)$$ for $$j \geq0$$, $$\mathbb{E} [ Y_{1, 1} ] = -\psi(1) = \gamma$$, Euler’s constant, and for $$j \geq2$$, $$\kappa_{j} ( Y_{1, 1} ) = (j-1) ! S_{j}$$, where $$S_{j} = \sum_{i=1}^{\infty} i^{-j}$$ since $$\log\Gamma(1-t) = \gamma t + \sum_{j=2}^{\infty} S_{j} t^{j} / j$$. For example, $$S_{2} = \pi^{2} / 6 = \psi^{(1)} (1) = 1.6449 \cdots$$.
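
These EV1 moment values are easy to check numerically; the following Monte Carlo sketch (seed and sample size are illustrative choices, not from the source) samples from $$G_{1}$$ by inversion:

```python
import numpy as np

# Sample from G1(y) = exp{-exp(-y)} by inverting the CDF: y = -log(-log(u)).
rng = np.random.default_rng(1)
u = rng.random(2_000_000)
y = -np.log(-np.log(u))

# E[Y_{1,1}] = gamma = 0.5772... and var[Y_{1,1}] = S_2 = pi^2/6 = 1.6449...
print(y.mean())   # close to Euler's constant
print(y.var())    # close to pi^2/6
```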

Gumbel [9], p.174, tabled $$\mu_{j}$$ and $$\kappa_{j}$$ for $$Y_{1, 1}$$ for $$j \leq6$$. From (4), (5) we obtain

\begin{aligned} \mathbb{E} \bigl[ Y^{t}_{2, 1} \bigr] = \Gamma ( 1 - t/ \alpha ) \end{aligned}

for $$\Re (t) < \alpha$$ and

\begin{aligned} \mathbb{E} \bigl[ ( -Y_{3, 1} )^{t} \bigr] = \Gamma ( 1 + t / \alpha ) \end{aligned}

for $$\Re (t) > -\alpha$$, as noted on pp.264 and 281 of Gumbel [9]. For example, $$\mathbb{E} [ Y_{2, 1} ] = \Gamma (1 - \alpha^{-1} )$$ if $$\alpha> 1$$, $$\operatorname{var} [ Y_{2, 1} ] = \Gamma (1-2 \alpha^{-1} ) - \Gamma (1 - \alpha^{-1} )^{2}$$ if $$\alpha> 2$$, $$\mathbb{E} [ Y_{3, 1} ] = -\Gamma (1 + \alpha ^{-1} )$$ and $$\operatorname{var} [Y_{3, 1} ] = \Gamma (1 + 2 \alpha^{-1} ) - \Gamma (1 + \alpha^{-1} )^{2}$$.
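
The representation (4) makes these Γ-moments easy to verify by simulation; a sketch with the illustrative choice $$\alpha = 3$$ (any $$\alpha > 2$$ keeps the moments used here finite):

```python
import math

import numpy as np

# EV1 samples by inversion, then EV2/EV3 samples via the representation (4).
alpha = 3.0
rng = np.random.default_rng(2)
y1 = -np.log(-np.log(rng.random(2_000_000)))
y2 = np.exp(y1 / alpha)         # Y_{2,1} = exp(Y_{1,1}/alpha)
y3 = -np.exp(-y1 / alpha)       # Y_{3,1} = -1/Y_{2,1}

print(y2.mean(), math.gamma(1 - 1/alpha))     # both near Gamma(1 - 1/alpha)
print((-y3).mean(), math.gamma(1 + 1/alpha))  # both near Gamma(1 + 1/alpha)
```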

If we have N observations distributed as $$Y_{n, 1}$$ we can estimate $$c_{n}$$, $$b_{n}$$, α by assuming a choice of I and using maximum likelihood or moment estimates on

\begin{aligned} P ( M_{n, 1} \leq x ) \approx G \bigl( ( x-b_{n} ) / c_{n} \bigr), \end{aligned}
(6)

where $$G=G_{I}$$. Gumbel [9] does this in Sections 6.2.5-6.2.6 for $$I=1$$ for the maximum likelihood method, and in Sections 7.3.1-7.3.2 for $$I=3$$ for the moment method. See also Kotz and Johnson [10], p.609. A better approach is to avoid assuming a choice of I (but still assuming that a non-degenerate limit exists) by replacing G in (6) by the generalized extreme value cumulative distribution function:

\begin{aligned} G (y) = \exp \bigl[ - \bigl\{ 1 - k (y - \xi)/ \sigma \bigr\} ^{1/k} \bigr], \end{aligned}
(7)

where $$\sigma>0$$, $$y < \xi+ \sigma/k$$ if $$k > 0$$, $$y > \xi + \sigma/k$$ if $$k < 0$$, σ denotes the scale parameter, ξ denotes the location parameter and k denotes the shape parameter. Furthermore, $$G (y) \rightarrow G_{1} (y)$$ as $$k \rightarrow0$$ if $$\xi=0$$, $$\sigma=1$$; $$G (y) = G_{2} (y)$$ if $$\xi= 1$$, $$\sigma= 1 / \alpha$$, $$k = -1 / \alpha$$; $$G (y) = G_{3} (y)$$ if $$\xi= -1$$, $$\sigma=k=1/\alpha$$. For $$n \leq100$$, estimation of ξ, σ, k using L-moments is more efficient than using maximum likelihood estimates: see Hosking et al. [11].
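
The stated reductions of (7) to $$G_{1}$$, $$G_{2}$$, $$G_{3}$$ can be spot-checked numerically; the function name and test values below are illustrative, not from the source:

```python
import math

# GEV cdf of (7); valid where 1 - k(y - xi)/sigma > 0.
def gev(y, xi, sigma, k):
    return math.exp(-(1 - k*(y - xi)/sigma)**(1/k))

y, alpha = 0.7, 2.0
# k -> 0 with xi = 0, sigma = 1 recovers G1:
assert abs(gev(y, 0.0, 1.0, 1e-9) - math.exp(-math.exp(-y))) < 1e-6
# xi = 1, sigma = 1/alpha, k = -1/alpha recovers G2:
assert abs(gev(y, 1.0, 1/alpha, -1/alpha) - math.exp(-y**(-alpha))) < 1e-12
# xi = -1, sigma = k = 1/alpha recovers G3 (argument < 0):
assert abs(gev(-y, -1.0, 1/alpha, 1/alpha) - math.exp(-y**alpha)) < 1e-12
print("GEV special cases verified")
```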

Gnedenko [12] gave necessary and sufficient conditions on F for $$Y_{n, 1} \stackrel{\mathcal{L}}{\rightarrow} Y_{I} \sim G_{I}$$ for $$I=1,2\mbox{ or }3$$. This is Theorem 1.6.2 of Leadbetter et al. [13]; appropriate choices of $$b_{n}$$ and $$a_{n} = c_{n}^{-1}$$ in each case are given by their Corollary 1.6.3. A choice of $$b_{n}$$, $$c_{n}$$ can be found by setting $$u_{n} = b_{n} + c_{n} x$$ in their result (p.13) so that

\begin{aligned} n \bigl[ 1-F ( u_{n} ) \bigr] \rightarrow t \in[0, \infty] \quad\mbox{if and only if}\quad P ( M_{n, 1} \leq u_{n} ) \rightarrow\exp(-t). \end{aligned}
(8)

They gave the examples in Table 1. (The logistic is from Stuart and Ord [14].) Recall $$n_{1} = \log n$$ and $$n_{2}=\log n_{1}$$. In Table 1,

\begin{aligned} B_{n} = ( 2n_{1} )^{1/2} \bigl\{ 1 - ( n_{2}+\log4 \pi ) / ( 4 n_{1} ) \bigr\} \end{aligned}
(9)

and $$B_{n}^{\prime} = \exp ( B_{n} )$$. Leadbetter et al. [13] also noted that no non-degenerate limit exists when F is a Poisson or geometric cumulative distribution function. The last three results in Table 1 are exact, not asymptotic, since if $$F = G_{I}$$, then

\begin{aligned} F(x)^{n} = \left \{ \textstyle\begin{array}{@{}l@{\quad}l} G_{1} ( x-n_{1} ), & \mbox{if } I=1,\\ G_{2} ( n^{-1/\alpha}x ), &\mbox{if } I=2,\\ G_{3} ( n^{1/\alpha} x ), & \mbox{if } I=3. \end{array}\displaystyle \right . \end{aligned}
(10)
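
The $$I=2$$ and $$I=3$$ cases of (10) are simple algebraic identities; a quick numerical spot check (the values of n, α, x are illustrative):

```python
import math

def G2(x, a): return math.exp(-x**(-a))      # Frechet-type, x > 0
def G3(x, a): return math.exp(-(-x)**a)      # Weibull-type, x < 0

n, a = 50, 2.5
assert abs(G2(1.7, a)**n - G2(n**(-1/a) * 1.7, a)) < 1e-12
assert abs(G3(-0.4, a)**n - G3(n**(1/a) * -0.4, a)) < 1e-12
print("max-stability identities for I = 2, 3 hold")
```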

For the convergence rate of $$Y_{n, 1}$$ to $$Y_{1}$$, see Balkema and de Haan [15] and references therein. For an estimate of α, see Dekkers and de Haan [16] and Dekkers et al. [17] and references therein.

We now consider extensions to $$M_{n, i}$$, where $$i \geq1$$. Leadbetter et al. [13], p.33, showed that (8) implies $$P ( M_{n, i} \leq u_{n} ) \rightarrow\exp(-t) \sum_{j=0}^{i-1} t^{j} / j!$$, so that if $$Y_{n, 1} \stackrel{\mathcal{L}}{\rightarrow} Y \sim G (x)$$ then for $$i \geq1$$ with $$b_{n, i} \equiv b_{n, 1}$$, $$c_{n, i} \equiv c_{n, 1}$$

\begin{aligned} Y_{n, i} \stackrel{\mathcal{L}}{\rightarrow} Y_{i, G} \sim G(x) \sum_{j=0}^{i-1} \bigl(-\log G(x) \bigr)^{j} / j!, \end{aligned}

that is,

\begin{aligned} Z_{n, i} = -\log G ( Y_{n, i} ) \stackrel{\mathcal {L}}{ \rightarrow} \operatorname{gamma} (i), \end{aligned}
(11)

where $$\operatorname{gamma}(\gamma)$$ has probability density function $$z^{\gamma-1} \exp(-z) / \Gamma(\gamma)$$ on $$(0,\infty)$$. By Theorem 2.3.1 in Leadbetter et al. [13], if $$Y_{n, 1} \stackrel{\mathcal{L}}{\rightarrow} Y_{1} \sim G$$ then $$\mathbf{Z}_{n} = ( Z_{n, 1}, \ldots, Z_{n, r} )$$ of (11) satisfies
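
The limit (11) can be illustrated by simulation when $$F = G_{1}$$, in which case $$b_{n} = n_{1}$$, $$c_{n} = 1$$ and $$Z_{n, i} = -\log G_{1} ( M_{n, i} - n_{1} ) = n \exp ( -M_{n, i} )$$; the sample sizes and seed below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, reps = 1000, 4, 5000
y = -np.log(-np.log(rng.random((reps, n))))   # reps samples of size n from G1
top = -np.sort(-y, axis=1)[:, :r]             # M_{n,1} > ... > M_{n,r} per row
z = n * np.exp(-top)                          # Z_{n,i}, approximately gamma(i)
print(z.mean(axis=0))                         # near [1, 2, 3, 4], the gamma(i) means
```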

\begin{aligned} P ( \mathbf{Z}_{n} \geq\mathbf{z} ) \rightarrow\sum p ( \mathbf{z}, { \mathbf{k}} ) \end{aligned}
(12)

summing over $$k_{1}, \ldots, k_{r} \geq0$$ with $$k_{1} + \cdots+ k_{i} \leq i - 1$$ for $$1 \leq i \leq r$$, where

\begin{aligned} p ( \mathbf{z}, \mathbf{k} ) = \exp ( -z_{r} ) \prod _{i=1}^{r} ( z_{i} - z_{i-1} )^{k_{i}} / k_{i} !, \end{aligned}

where $$z_{0} = 0$$, and the support is $$0 < z_{1} < z_{2} < \cdots< z_{r} < \infty$$. So,

\begin{aligned} \mathbf{Z}_{n} \stackrel{\mathcal{L}}{\rightarrow} \mathbf{Z} \mbox{ with joint probability density function } \exp ( -z_{r} ) \end{aligned}
(13)

on $$0 < z_{1} < \cdots<z_{r} < \infty$$, $$( G (Y_{n, 1} ), \ldots, G (Y_{n, r} ) ) \stackrel{\mathcal{L}}{\rightarrow} \mathbf{U}$$ say with joint probability density function $$( u_{1} \cdots u_{r-1} )^{-1}$$ on $$0 < u_{r} < \cdots< u_{1} < 1$$, and

\begin{aligned} \mathbf{Y}_{n} \stackrel{\mathcal{L}}{\rightarrow} \mathbf{Y}_{G} \end{aligned}
(14)

with joint probability density function $$G (y_{r} ) J$$ on $$-\infty< y_{r} < \cdots< y_{1} < \infty$$, where $$J= \vert \prod^{r}_{i=1} \partial z_{i} / \partial y_{i} \vert$$ for $$z_{i} = - \log G (y_{i} )$$. If G is the generalized extreme value cumulative distribution function of (7), this joint probability density function reduces to (5) of Tawn [18] and (2.2) of Smith [19]. For convergence rates in (14), see Omey and Rachev [20]. The results (12)-(14) are not given by Leadbetter et al. [13], although (14) is easily proven for $$r=2$$ from their Theorem 2.3.2. From (13), for $$1 \leq r_{1} < \cdots< r_{k}$$, the increments $$\{ Z_{r_{i}} - Z_{r_{i-1}} \}$$ are independently distributed as $$\{ \operatorname{gamma} (r_{i} - r_{i-1} ) \}$$. If $$F=G_{I}$$ for $$I=1, 2$$ or 3 then $$\stackrel{\mathcal {L}}{\rightarrow}$$ in (14) can be replaced by $$\stackrel{\mathcal{L}}{=}$$ with $$G=G_{I}$$. For $$r=1$$ this is just (10).

For $$1 \leq I \leq3$$, set

\begin{aligned} \mathbf{Y}_{I} = ( Y_{I, 1}, \ldots, Y_{I, r} )^{\prime} = \mathbf{Y}_{G} \end{aligned}
(15)

for $$G=G_{I}$$. By Section 14.20 of Stuart and Ord [14], for $$r \geq1$$ and $$M_{r}$$ of (2), $$\mathbb{E} [ \exp ( Y_{1, r} t ) ] = M_{r} (-t)$$, so $$\mathbb{E} [ Y_{1, r}^{j} ] = (-1)^{j} \Gamma^{(j)} (r) / (r-1)!$$ and $$\kappa_{j} ( Y_{1, r} ) = (-1)^{j} \psi^{(j-1)} (r)$$, which they table for $$j = 1, 2$$ and $$r = 1, 3, 5, 10$$.

## Moments for limits

From (13) or (14) we have the following lemma.

### Lemma 3.1

Note that $$\mathbf{Y}_{1}$$, $$\mathbf{Y}_{2}$$, and $$\mathbf{Y}_{3}$$ of (15) can be represented in terms of each other by $$Y_{2, i} = \exp ( Y_{1, i} / \alpha )$$ and $$Y_{3, i} = -Y_{2, i}^{-1} = -\exp ( -Y_{1, i} /\alpha )$$, $$1 \leq i \leq r$$.

### Theorem 3.1

Set $$Z_{i} = -\log G_{1} ( Y_{1, i} )$$. For $$\mathbf{t} \in{\mathbb{C}}^{r}$$,

\begin{aligned} \mathbb{E} \bigl[ \exp \bigl( \mathbf{t}^{\prime} \mathbf{Y}_{1} \bigr) \bigr] = \mathbb{E} \Biggl[ \prod_{i=1}^{r} Z_{i}^{-t_{i}} \Biggr] = B_{r} ( \mathbf{t} ), \end{aligned}
(16)

where $$B_{r}(\mathbf{t}) = \Gamma (r-T_{r} ) / \prod_{i=1}^{r-1} (i-T_{i} )$$ and $$T_{i} = \sum_{j=1}^{i} t_{j}$$ for $$i-\Re (T_{i} ) > 0$$, $$1 \leq i \leq r$$.

### Proof

By (13), $$\mathbf{Z}$$ has joint probability density function $$\exp (-z_{r} )$$ on $$0 < z_{1} < \cdots< z_{r} < \infty$$. Now use induction on r. □

Applying $$( \partial/ \partial\mathbf{t} )^{\mathbf{j}}$$ to (16) and setting $$\mathbf{t}=\mathbf{0}$$ gives an expression for $$m_{\mathbf{j}} (\mathbf{Y}_{1} ) = \mathbb{E} [ \prod_{i=1}^{r} Y_{1, i}^{j_{i}} ]$$. However, these are more easily found from the cumulants: taking logs of (16) yields the following remarkable result.

### Corollary 3.1

For $$j_{r} > 0$$ and $$\mathbf{j.} = \sum_{i=1}^{r} j_{i}$$,

\begin{aligned} \kappa_{\mathbf{j}} ( \mathbf{Y}_{1} ) = (-1)^{\mathbf{j.}} \psi^{ ( \mathbf{j.} - 1 )} (r). \end{aligned}
(17)

For example, for $$1 \leq s \leq r$$, $$\operatorname{covar} ( Y_{1, s}, Y_{1, r} ) = \psi^{(1)} (r)$$, so $$0 \leq \operatorname{var} [ Y_{1, s} - Y_{1, r} ] = \operatorname{var} [ Y_{1, s} ] - \operatorname{var} [ Y_{1, r} ]$$ and $$\operatorname{corr} ( Y_{1, s}, Y_{1, r} ) = \{ \psi^{(1)} (r) / \psi ^{(1)} (s) \}^{1/2}$$. By (6.3.6) of Abramowitz and Stegun [21], the right hand side of (17) is equal to $$\gamma_{\mathbf{j.}} - ( \mathbf{j.} - 1 )! \sum_{i=1}^{r-1} i^{-\mathbf{j.}}$$, where $$\gamma_{\mathbf{j.}} = (-1)^{\mathbf{j.}} \psi^{ ( \mathbf{j.} - 1 )} (1)$$. Also $$\gamma_{1} = -\psi(1) = \gamma= 0.57721 \cdots$$ and $$\gamma_{2} = \pi^{2} /6$$.
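
As a numerical sanity check of the covariance statement, (13) lets one simulate $$\mathbf{Z}$$ as cumulative sums of independent unit exponentials and set $$Y_{1, i} = -\log Z_{i}$$; here $$\psi^{(1)}(r)$$ is computed from $$\psi^{(1)}(r) = \pi^{2}/6 - \sum_{i=1}^{r-1} i^{-2}$$. The choices $$s=2$$, $$r=5$$ and the sample size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
s, r, reps = 2, 5, 1_000_000
z = np.cumsum(rng.exponential(size=(reps, r)), axis=1)  # Z_1 < ... < Z_r
y = -np.log(z)                                          # Y_{1,i} = -log Z_i
cov = np.cov(y[:, s-1], y[:, r-1])[0, 1]
psi1_r = np.pi**2/6 - sum(i**-2 for i in range(1, r))   # psi'(5) = 0.2213...
print(cov, psi1_r)                                      # approximately equal
```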

Lemma 3.1 and Theorem 3.1 imply the following result.

### Theorem 3.2

We have $$m_{\mathbf{j}} ( \mathbf{Y}_{2} ) = B_{r} (\mathbf{j} / \alpha )$$ and $$m_{\mathbf{j}} ( -\mathbf{Y}_{3} ) = B_{r} ( -\mathbf{j} / \alpha )$$. In fact, this holds with $$\mathbf{j}$$ replaced by $$\mathbf{t} \in{\mathbb{C}}^{r}$$. So, for $$M_{r}(t)$$ of (2) and $$V_{r}(t) = M_{r} (2t) - M_{r} (t)^{2}$$,

\begin{aligned} \mathbb{E} \bigl[ Y_{2, r}^{t} \bigr] = M_{r} (-t/ \alpha), \qquad\mathbb{E} \bigl[ ( -Y_{3, r} )^{t} \bigr] = M_{r} ( t / \alpha ), \end{aligned}

and

\begin{aligned} \operatorname{var} [ Y_{2, r} ] = V_{r} \bigl(- \alpha^{-1} \bigr),\qquad \operatorname{var} [ Y_{3, r} ] = V_{r} \bigl( \alpha^{-1} \bigr). \end{aligned}

Similarly, for $$1 \leq s \leq r$$,

\begin{aligned} \mathbb{E} \bigl[ Y_{2, s}^{j_{s}} Y_{2, r}^{j_{r}} \bigr] = C ( j_{s}/ \alpha, j_{r} / \alpha ),\qquad \mathbb{E} \bigl[ ( -Y_{3, s} )^{j_{s}} ( -Y_{3, r} )^{j_{r}} \bigr] = C (-j_{s} / \alpha, -j_{r} / \alpha ), \end{aligned}

where

\begin{aligned} C ( t_{s}, t_{r} ) = \Gamma ( r-t_{s}-t_{r} )\Big/ \Biggl\{ (s-1)! \prod_{j=s}^{r-1} ( j - t_{s} ) \Biggr\} \end{aligned}

and $$\operatorname{covar} ( Y_{I, s}, Y_{I, r} )$$ for $$I =2, 3$$ is $$c_{s, r} (\alpha)$$, $$c_{s, r} (-\alpha)$$, respectively, as given in the Introduction.

### Example 3.1

Set $$\alpha= 1$$. Then, in the notation of Section 1, $$\mathbb{E} [ Y_{2, r}^{j} ] = 1/ \langle r - 1 \rangle_{j}$$, $$\mathbb{E} [ ( -Y_{3, r} )^{j} ] = (r)_{j}$$, $$\operatorname{var} [ Y_{2, r} ] = (r-1)^{-2} (r-2)^{-1}$$ for $$3 \leq r$$, $$\operatorname{var} [ Y_{3, r} ] = r$$, $$\operatorname{covar} ( Y_{2, s}, Y_{2, r} ) = c_{s, r} (1) = \{ (s-1)(r-1)(r-2) \}^{-1}$$ for $$2 \leq s \leq r$$, $$3 \leq r$$ and $$\operatorname{covar} ( Y_{3, s}, Y_{3, r} ) = c_{s, r} (-1) = s$$ for $$1 \leq s \leq r$$, with corresponding correlations $$\rho_{s, r}(1) = \{ (s-2) / (r-2) \}^{1/2}$$ and $$\rho_{s, r} (-1) = (s/r)^{1/2}$$.
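
Two of the $$\alpha = 1$$ values above are easy to confirm by simulation, since then $$Y_{3, i} = -Z_{i}$$ with $$\mathbf{Z}$$ as in (13) (the choices $$s=3$$, $$r=6$$ and the sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
s, r, reps = 3, 6, 1_000_000
z = np.cumsum(rng.exponential(size=(reps, r)), axis=1)  # Z_1 < ... < Z_r
y3 = -z                                                 # alpha = 1: Y_{3,i} = -Z_i
print(y3[:, r-1].var())                                 # near var[Y_{3,r}] = r = 6
print(np.corrcoef(y3[:, s-1], y3[:, r-1])[0, 1])        # near (s/r)^{1/2} = 0.7071...
```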

### Example 3.2

Set $$\alpha= 1/2$$. Then $$\mathbb{E} [Y_{2, r} ] = 1 / \langle r-1 \rangle_{2}$$ for $$r \geq3$$, $$\mathbb{E} [ Y_{3, r} ] = -(r)_{2}$$, $$\operatorname{var} [ Y_{2, r} ] = 2 (2r - 5) / \{ \langle r-1 \rangle_{2} \langle r-1 \rangle_{4} \}$$ for $$r \geq5$$, and $$\operatorname{var} [ Y_{3, r} ] = 2 (r)_{2} (2r + 3)$$.

By (6.1.47) of Abramowitz and Stegun [21], as $$r \rightarrow \infty$$,

\begin{aligned} M_{r} (t) = r^{t} \bigl\{ 1+\langle t \rangle_{2} r^{-1} /2 +\langle t \rangle_{3} (3t-1)r^{-2} / 24 + O \bigl(r^{-3} \bigr) \bigr\} , \end{aligned}

so $$V_{r} (t) = r^{2t-1} \{ t^{2} + O (r^{-1} ) \}$$, $$\mathbb{E} [ Y_{2, r} ] = r^{-1/\alpha} \{ 1 + (\alpha^{-1} )_{2} r^{-1} / 2 + O (r^{-2} ) \}$$, $$\mathbb{E} [ Y_{3, r} ] = -r^{1/\alpha} \{ 1+\langle \alpha^{-1} \rangle_{2} r^{-1} / 2 + O (r^{-2} ) \}$$, $$\operatorname{var} [ Y_{2, r} ] = r^{-2/ \alpha-1} \{\alpha^{-2} + O (r^{-1} ) \}$$ and similarly for $$\mathbb{E} [ Y_{2, r}^{j} ]$$ and $$\mathbb{E} [Y_{3, r}^{j} ]$$.
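
The leading terms of this expansion of $$M_{r}(t)$$ can be verified directly with log-gamma arithmetic (the values of r and t are illustrative):

```python
import math

# M_r(t) = Gamma(r + t)/(r - 1)!, computed via lgamma to avoid overflow at large r.
def M(r, t):
    return math.exp(math.lgamma(r + t) - math.lgamma(r))

r, t = 200.0, 1.5
approx = r**t * (1 + t*(t - 1)/(2*r))   # first two terms of the expansion
print(M(r, t), approx)                  # agree to O(r^{-2}) relative error
```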

## Approximations for $$\mathbf{M}_{n}$$

Suppose

\begin{aligned} \mathbf{Y}_{n} = ( \mathbf{M}_{n} - b_{n} { \mathbf{1}}_{r} ) / c_{n} \stackrel{\mathcal{L}}{\rightarrow} \mathbf{Y} \end{aligned}
(18)

as $$n \rightarrow\infty$$. Then we can represent $$\mathbf{Y}_{n}$$ as $$\mathbf{Y} + o_{p} (1)$$, that is,

\begin{aligned} \mathbf{M}_{n} = b_{n} {\mathbf{1}}_{r} + c_{n} \bigl( \mathbf{Y} +o_{p} (1) \bigr). \end{aligned}
(19)

By Theorem 5.4 of Billingsley [8], as $$n \rightarrow\infty$$, $$\mathbb{E} [ g (\mathbf{Y}_{n} ) ] \rightarrow \mathbb{E} [ g (\mathbf{Y} ) ]$$ if $$g ( \mathbf{Y}_{n} )$$ is uniformly integrable, for example, if $$\sup_{n} \mathbb{E} [ \vert g ( \mathbf{Y}_{n} ) \vert ^{1 + \varepsilon} ] < \infty$$ for some $$\varepsilon> 0$$. So, if $$\sup_{n} \mathbb{E} [ \vert \mathbf{Y}_{n}^{\mathbf{k}} \vert ] < \infty$$ for some $$\mathbf{k} > \mathbf{j}$$ then

\begin{aligned} \mathbb{E} \bigl[ \mathbf{Y}_{n}^{\mathbf{j}} \bigr] = c_{n}^{-\mathbf{j.}} \mathbb{E} \bigl[ ( \mathbf{M}_{n} - b_{n} {\mathbf{1}}_{r} )^{\mathbf{j}} \bigr] \rightarrow \mathbb{E} \bigl[ \mathbf{Y}^{\mathbf{j}} \bigr] \end{aligned}

and

\begin{aligned} \mu_{\mathbf{j}} ( \mathbf{Y}_{n} ) = c_{n}^{-\mathbf{j.}} \mu_{\mathbf{j}} ( \mathbf{M}_{n} ) \rightarrow\mu_{\mathbf{j}} ( \mathbf{Y} ). \end{aligned}
(20)

Here, $$\mathbf{k} > \mathbf{j}$$ means $$k_{i} > j_{i}$$ for $$1 \leq i \leq r$$. If

\begin{aligned} b_{n} = B_{n} ( 1+\delta_{n} ), \quad\mbox{where } \delta_{n} \rightarrow0 \mbox{ and } C_{n}= c_{n}/B_{n} \rightarrow0 \end{aligned}
(21)

then (19) implies $$\mathbf{M}^{\mathbf{j}}_{n} = b_{n, \mathbf{j.}} + c_{n, \mathbf{j.}} ( \mathbf{j}^{\prime} \mathbf{Y} + o_{p} (1) )$$, where $$b_{n, \mathbf{j.}} = B_{n}^{\mathbf{j.}} ( 1 + \mathbf{j.} \delta_{n} )$$ and $$c_{n, \mathbf{j.}} = B_{n}^{\mathbf{j.}} C_{n}$$, so

\begin{aligned} \mathbb{E} \bigl[ \mathbf{M}_{n}^{\mathbf{j}} \bigr] = b_{n, \mathbf{j.}} + c_{n, \mathbf{j.}} \bigl( \mathbf{j}^{\prime} \mathbb{E} [ { \mathbf{Y}} ] + o (1) \bigr) \end{aligned}

provided $$( \mathbf{M}_{n}^{\mathbf{j}} - b_{n, \mathbf{j.}} ) / c_{n, \mathbf{j.}}$$ is uniformly integrable. McCord [6] provided methods of proving uniform integrability of such functions of $$\mathbf{Y}_{n}$$ for the case $$r=1$$. We shall not do so here. It is easy to show that there exists $$c_{n, r} < \infty$$ such that

\begin{aligned} \mathbb{E} \bigl[ \bigl\vert \mathbf{M}^{\mathbf{j}}_{n} \bigr\vert \bigr] \leq c_{n, r} \prod_{i=1}^{r} \mathbb{E} \bigl[ \vert X \vert ^{j_{i}} \bigr], \end{aligned}

where $$X \sim F (x)$$. For example, the reader can try this on (14.2) of Stuart and Ord [14]. It follows that $$\mathbb{E} [ \mathbf{M}_{n}^{\mathbf{j}} ]$$ exists if and only if $$\mathbb{E} [ X^{j_{0}} ]$$ exists, that is,

\begin{aligned} \mathbb{E} \bigl[ \vert X \vert ^{j_{0}} \bigr] < \infty, \end{aligned}
(22)

where $$j_{0} = \max \{ j_{i} : 1 \leq i \leq r \}$$. We conjecture that this condition is sufficient for $$\mathbf{Y}_{n}^{\mathbf{j}}$$ to be uniformly integrable. We shall see that McCord [6] proved this conjecture when $$r=1$$ for several large classes of F. Table 1 gave $$Y_{I}$$, $$b_{n}$$, $$c_{n}$$ for a number of choices of $$F(x)$$. In fact, each of these $$F(x)$$ may be replaced by the class of $$F(x)$$ with the same asymptotic tail behavior as $$x \rightarrow\infty$$. We now illustrate this and expand these classes for some important cases. Set $$b_{n} (X) = b_{n}$$, $$c_{n} (X)=c_{n}$$, $$M_{n}(X)=M_{n}$$, and $$Y_{n} (X)= Y_{n}$$.

### Theorem 4.1

Suppose

\begin{aligned} 1 - F(x) \approx(a/x)^{\alpha} \end{aligned}
(23)

as $$x \rightarrow\infty$$, where $$a > 0$$ and $$\alpha> 0$$. Then (18) holds with $$b_{n} = 0$$, $$c_{n}=an^{1/\alpha}$$, and $$Y = Y_{2}$$.

### Proof

Note that (8) holds with $$u_{n} =c_{n}x$$, $$t=x^{-\alpha}$$, so (14) gives (18). □

Note that (23) holds if

\begin{aligned} F \mbox{ has probability density function} \approx\alpha ( a/x )^{\alpha+1} a^{-1} \end{aligned}
(24)

as $$x \rightarrow\infty$$. This includes the Pareto, Cauchy and $$G_{2}$$ distributions.
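
A Monte Carlo sketch of Theorem 4.1 for the exact Pareto case $$1 - F(x) = x^{-\alpha}$$, $$x \geq 1$$ (so $$a = 1$$), comparing the mean of $$M_{n, 1} / n^{1/\alpha}$$ with $$\mathbb{E} [ Y_{2, 1} ] = \Gamma (1 - \alpha^{-1} )$$; the parameter values, seed and sample sizes are illustrative:

```python
import math

import numpy as np

alpha, n, reps = 3.0, 1000, 10_000
rng = np.random.default_rng(6)
x = rng.random((reps, n)) ** (-1/alpha)       # Pareto(alpha) samples via inversion
y = x.max(axis=1) / n**(1/alpha)              # (M_{n,1} - b_n)/c_n with b_n = 0
print(y.mean(), math.gamma(1 - 1/alpha))      # approximately equal
```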

For $$r=1$$, when (24) holds, Theorem 5.2 of McCord [6] proved

\begin{aligned} \mathbf{Y}_{n}^{\mathbf{j}} \mbox{ is uniformly integrable if (22) holds}. \end{aligned}
(25)

We conjecture this is true for all r.

### Theorem 4.2

Suppose $$X \sim F$$ on $$( -\infty, x_{0} )$$ with $$x_{0} < \infty$$ and

\begin{aligned} 1-F(x) \approx c ( x_{0} - x )^{\alpha} \end{aligned}
(26)

as $$x \uparrow x_{0}$$, where $$c > 0$$ and $$\alpha> 0$$. Then (18) holds with $$Y = Y_{3}$$, $$b_{n} = x_{0}$$, and $$c_{n} = (cn)^{-1/\alpha}$$.

### Proof

Note that $$u_{n} = b_{n} +c_{n} x$$ satisfies (8) with $$t = (-x)^{\alpha}$$, so (14) gives (18). □

Note that (26) holds if F has probability density function

\begin{aligned} f(x) \approx\alpha c ( x_{0} - x )^{\alpha-1} \end{aligned}
(27)

as $$x \uparrow x_{0}$$.

For $$r=1$$, Theorem 1 in McCord [6] proved (25) when (27) holds. We conjecture it is true for all r.

Note that (26) holds for $$F = G_{3}$$ with $$c = 1$$ and $$x_{0} = 0$$.

### Theorem 4.3

Suppose as $$x \rightarrow\infty$$,

\begin{aligned} 1 - F(x) \approx k x^{d} \exp(-x). \end{aligned}
(28)

Then (18) holds with $$Y = Y_{1}$$, $$c_{n} = 1$$, and $$b_{n}=n_{1} + dn_{2} +k_{1}$$, where $$k_{1} = \log k$$.

### Proof

Note that (8) holds with $$u_{n} = b_{n} + x$$ and $$t = \exp (-x)$$, so (14) gives (18). □

Note that (28) holds if F has probability density function $$f (x) \approx k x^{d} \exp(-x)$$ as $$x \rightarrow\infty$$. So, this covers the exponential ($$k=1$$, $$d=0$$), logistic ($$k=1$$, $$d=0$$) and gamma $$(\gamma)$$ ($$k=1/\Gamma(\gamma)$$, $$d = \gamma- 1$$) distributions.
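
A Monte Carlo sketch of Theorem 4.3 for the exponential case ($$k=1$$, $$d=0$$, so $$b_{n} = n_{1}$$, $$c_{n} = 1$$); seed and sample sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 1000, 10_000
m = rng.exponential(size=(reps, n)).max(axis=1)   # M_{n,1} for exponential F
print((m - np.log(n)).mean())                     # near Euler's constant 0.5772...
```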

### Theorem 4.4

Suppose as $$z \rightarrow\infty$$,

\begin{aligned} 1-F(x) \approx k z^{d} \exp(-z) \end{aligned}
(29)

for $$z = \{ (x - b) /c \}^{a}$$, where either $$a > 0$$ and $$c > 0$$, or $$c < 0$$ and a is a negative odd integer. Then (18) holds with $$Y=Y_{1}$$, $$c_{n} = c a^{-1} n_{1}^{1/a-1}$$, and $$b_{n} = b I (a < 0) +cn_{1}^{1/a} \{ 1+a^{-1} n_{1}^{-1} (dn_{2} + k_{1} ) \}$$, where $$I(A) = 1$$ if A is true, or $$I (A) = 0$$ if A is false.

### Proof

The cumulative distribution function of $$Z = \{ (X - b) / c \} ^{a}$$ satisfies (28). So, $$M_{n, i} = b +cM_{n, i} (Z)^{1/a}$$. Now apply Theorem 4.3. □

If $$a > 0$$ and $$c > 0$$, then $$z \rightarrow\infty$$ if and only if $$x \rightarrow\infty$$. But if $$a < 0$$ and $$c < 0$$, then $$X < b$$ with probability 1 and $$z \rightarrow\infty$$ if and only if $$x \uparrow b$$.

Note that (29) holds if F has probability density function

\begin{aligned} f(x) \approx kz^{d} \exp(-z) \partial z / \partial x = k a c^{-1} z^{d+1-1/a} \exp(-z) \end{aligned}
(30)

as $$z \rightarrow\infty$$. For $$F = \Phi$$, the unit normal cumulative distribution function, this holds with $$a=2$$, $$b=0$$, $$c=2^{1/2}$$, $$d= -1/2$$, $$k=(4 \pi)^{-1/2}$$, giving $$b_{n}$$, $$c_{n}$$ of (9). Another example is $$1 - F(x) \approx\exp(1/x)$$ as $$x \uparrow0$$: in this case $$k=1$$, $$d=b=0$$, and $$a = c = -1$$.

For $$r =1$$, $$d=0$$, $$a > 0$$, when (30) holds, Theorem 3 in McCord [6] proved (25). We conjecture it is true for all r. For this case, he conjectured (20) for $$j=2$$. So, his conjecture is true if $$\mathbb{E} [ X^{2} ] < \infty$$.

If $$a > 0$$ or $$b = 0$$ then (21) holds with

\begin{aligned} B_{n} = cn_{1}^{1/a} \quad\mbox{and}\quad \delta_{n} = a^{-1} n_{1}^{-1} ( dn_{2} + k_{1} ), \end{aligned}
(31)

so $$b_{n, \mathbf{j.}} = c^{\mathbf{j.}} n_{1}^{\mathbf{j.}/a} ( 1 + \mathbf{j.} \delta_{n} )$$ and $$c_{n, \mathbf{j.}} = a^{-1} c^{\mathbf{j.}} n_{1}^{\mathbf{j.}/a-1}$$.

### Theorem 4.5

Suppose $$Z = \exp(X)$$ for X of Theorem 4.4. If $$a > 1$$, (18) holds for $$Y_{n} (Z)$$ with $$Y = Y_{1}$$, $$b_{n} (Z) = \exp ( B_{n} ) ( 1+B_{n} \delta_{n} )$$ and $$c_{n} (Z) = \exp ( B_{n} ) c_{n}$$ for $$B_{n}$$, $$\delta _{n}$$ of (31). If $$a = 1$$, (18) holds for $$Y_{n} (Z)$$ with $$Y=Y_{2}$$ at $$\alpha = c^{-1}$$, $$b_{n} (Z) = 0$$, $$c_{n} (Z) = \exp ( b_{n} )$$, and $$b_{n} = c ( n_{1} + dn_{2} + k_{1} )$$.

### Proof

Note that $$M_{n, i} (Z) = \exp ( M_{n, i} ) = \exp ( b_{n} + c_{n} Y_{n, i} )$$. By (21), (31), if $$a > 1$$, $$B_{n} \delta_{n} \rightarrow 0$$ and $$c_{n} \rightarrow0$$, so $$M_{n, i} (Z) = \exp ( B_{n} ) ( 1+B_{n} \delta_{n} + c_{n} Y_{1, i} + o_{p} (c_{n} ) )$$. If $$a=1$$, $$c_{n} = c$$, so $$\exp ( c_{n} Y_{n, 1} ) = \exp ( c Y_{1} ) (1 + o_{p}(1) ) = Y_{2} + o_{p} (1)$$ at $$\alpha= c^{-1}$$. □

For the log-normal distribution this gives $$Y = Y_{1}$$, $$B_{n} = c_{n}^{-1} = ( 2n_{1} )^{1/2}$$, $$\delta_{n} = - ( n_{2} + \log4 \pi ) / ( 4 n_{1} )$$, so $$b_{n} (Z) = \exp ( B_{n} ) \{ 1 - (2n_{1} )^{-1/2} (n_{2} + \log4 \pi ) / 2 \}$$ and $$c_{n} (Z) = \exp ( B_{n} ) / \sqrt{2n_{1}}$$.

## References

1. Leipala, T: Solutions of stochastic traveling salesman problems. Eur. J. Oper. Res. 2, 291-297 (1978)
2. Ramachandran, G: Properties of extreme order statistics and their application to fire protection and insurance problems. Fire Saf. J. 5, 59-76 (1982)
3. Revfeim, KJA, Hessell, JWD: More realistic distributions for extreme wind gusts. Q. J. R. Meteorol. Soc. 110, 505-514 (1984)
4. van de Graaf, JW, Tromans, PS, Vanderschuren, L, Jukui, BH: Failure probability of a jack-up under environmental loading in the central North Sea. Mar. Struct. 9, 3-24 (1996)
5. Louchard, G, Prodinger, H: Asymptotics of the moments of extreme-value related distribution functions. Algorithmica 46, 431-467 (2006)
6. McCord, JR: On asymptotic moments of extreme statistics. Ann. Math. Stat. 35, 1738-1743 (1964)
7. Fisher, RA, Tippett, LHC: Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc. Camb. Philos. Soc. 24, 180-190 (1928)
8. Billingsley, P: Convergence of Probability Measures. Wiley, New York (1968)
9. Gumbel, EJ: Statistics of Extremes. Columbia University Press, New York (1958)
10. Kotz, S, Johnson, NL: Encyclopaedia of Statistical Sciences, vol. 2. Wiley, New York (1982)
11. Hosking, JRM, Wallis, JR, Wood, EF: Estimation of the generalized extreme value distribution by the method of probability weighted moments. Technometrics 27, 251-261 (1985)
12. Gnedenko, BV: Sur la distribution limite du terme maximum d’une série aléatoire. Ann. Math. 44, 423-453 (1943)
13. Leadbetter, MR, Lindgren, G, Rootzen, H: Extremes and Related Properties of Random Sequences and Processes. Springer, New York (1983)
14. Stuart, A, Ord, JK: Kendall’s Advanced Theory of Statistics, vol. 1, 5th edn. Griffin, London (1987)
15. Balkema, AA, de Haan, L: A convergence rate in extreme value theory. J. Appl. Probab. 27, 577-585 (1990)
16. Dekkers, ALM, de Haan, L: On the estimation of the extreme value index and large quantile estimation. Ann. Stat. 17, 1795-1832 (1989)
17. Dekkers, ALM, Einmahl, JHJ, de Haan, L: A moment estimator for the index of an extreme value distribution. Ann. Stat. 17, 1833-1855 (1989)
18. Tawn, JA: An extreme value theory model for dependent observations. J. Hydrol. 101, 227-250 (1988)
19. Smith, RL: Extreme value theory based on the r largest annual events. J. Hydrol. 86, 27-43 (1986)
20. Omey, E, Rachev, ST: Rates of convergence in multivariate extreme value theory. J. Multivar. Anal. 38, 36-50 (1991)
21. Abramowitz, M, Stegun, IA: Handbook of Mathematical Functions. Natl. Bur. of Standards, Washington (1964)

## Acknowledgements

The authors would like to thank the two referees for careful reading and comments, which improved the paper.

## Author information

### Corresponding author

Correspondence to Christopher S Withers.

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

CW derived the results for Sections 1, 2 and 3. SN derived the results for Section 4. All authors read and approved the final manuscript.
