
A class of small deviation theorems for the random fields on an m rooted Cayley tree

Journal of Inequalities and Applications 2012, 2012:1

https://doi.org/10.1186/1029-242X-2012-1

Received: 21 March 2011

Accepted: 4 January 2012

Published: 4 January 2012

Abstract

In this paper, we establish a class of strong deviation theorems for random fields relative to mth-order nonhomogeneous Markov chains indexed by an m rooted Cayley tree. As corollaries, we obtain the strong law of large numbers and the Shannon-McMillan theorem for mth-order nonhomogeneous Markov chains indexed by that tree.

2000 Mathematics Subject Classification: 60F15; 60J10.

Keywords

strong deviation theorem; m rooted Cayley tree; mth-order nonhomogeneous Markov chain; Shannon-McMillan theorem

1. Introduction

A tree is a graph $G = \{T, E\}$ which is connected and contains no circuits. Given any two vertices $\sigma \neq t \in T$, let $\overline{\sigma t}$ be the unique path connecting $\sigma$ and $t$. Define the graph distance $d(\sigma, t)$ to be the number of edges contained in the path $\overline{\sigma t}$.

Let $T_{C,N}$ be a Cayley tree. In this tree, the root (denoted by $o$) has only $N$ neighbors and all other vertices have $N + 1$ neighbors. Let $T_{B,N}$ be a Bethe tree, on which each vertex has $N + 1$ neighboring vertices. Both $T_{C,N}$ and $T_{B,N}$ are homogeneous trees. In this paper, we mainly consider an m rooted Cayley tree $\bar{T}_{C,N}$ (see Figure 1). It is formed from a Cayley tree $T_{C,N}$ by connecting the root $o$ to another vertex, denoted by the root $-1$, then connecting the root $-1$ to another vertex, denoted by the root $-2$, and continuing in the same way until the last vertex, denoted by the root $-(m-1)$, is connected. When the context permits, this type of tree is denoted simply by $T$.
Figure 1. An m rooted Cayley tree $\bar{T}_{C,2}$.

Let $\sigma, t$ ($\sigma, t \neq o, -1, -2, \ldots, -(m-1)$) be vertices of an m rooted Cayley tree $T$. Write $t \le \sigma$ if $t$ is on the unique path connecting $o$ to $\sigma$, and $|\sigma|$ for the number of edges on this path. For any two vertices $\sigma, t$ ($\sigma, t \neq o, -1, -2, \ldots, -(m-1)$) of the tree $T$, denote by $\sigma \wedge t$ the vertex farthest from $o$ satisfying $\sigma \wedge t \le \sigma$ and $\sigma \wedge t \le t$.

The set of all vertices at distance $n$ from the root $o$ is called the $n$-th generation of $T$ and is denoted by $L_n$. We say that $L_n$ is the set of all vertices on level $n$; in particular, the root $-1$ is on level $-1$ of $T$, the root $-2$ is on level $-2$ and, by analogy, the root $-(m-1)$ is on level $-(m-1)$. We denote by $T^{(n)}$ the subtree of an m rooted Cayley tree $T$ containing the vertices from level $-(m-1)$ (the root $-(m-1)$) to level $n$. Let $t$ ($t \neq o, -1, -2, \ldots, -(m-1)$) be a vertex of an m rooted Cayley tree $T$. The predecessor of the vertex $t$ is the vertex nearest to $t$ on the unique path from the root $-(m-1)$ to $t$. We denote the predecessor of $t$ by $1_t$, the predecessor of $1_t$ by $2_t$, and the predecessor of $(n-1)_t$ by $n_t$; we also say that $n_t$ is the $n$-th predecessor of $t$. Let $X^A = \{X_t, t \in A\}$ be a stochastic process indexed by a set $A$, denote by $|A|$ the number of vertices of $A$, and let $x^A$ be a realization of $X^A$.
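The level structure above can be made concrete with a short computation. The sketch below assumes, as the description of $T_{C,N}$ implies, that level $k \ge 0$ contains $N^k$ vertices, so that $|T^{(n)}| = (m-1) + \sum_{k=0}^{n} N^k$; the function names `level_size` and `subtree_size` are illustrative and do not come from the paper.

```python
def level_size(N: int, k: int) -> int:
    """Number of vertices on level k >= 0 (assumed to be N**k)."""
    return N ** k


def subtree_size(N: int, m: int, n: int) -> int:
    """|T^(n)|: the m-1 extra roots -1, ..., -(m-1) plus levels 0..n."""
    return (m - 1) + sum(level_size(N, k) for k in range(n + 1))


if __name__ == "__main__":
    N, m = 2, 3  # illustrative branching number and order
    for n in (1, 5, 10, 20):
        ratio = subtree_size(N, m, n - 1) / subtree_size(N, m, n)
        print(n, subtree_size(N, m, n), ratio)
```

For $N \ge 2$ the printed ratio $|T^{(n-1)}| / |T^{(n)}|$ approaches $1/N$, which is the normalization used later when sums over level $k$ are rewritten as sums over level $k-1$.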

Let $(\Omega, \mathcal{F})$ be a measurable space, and let $\{X_t, t \in T\}$ be a collection of random variables defined on $(\Omega, \mathcal{F})$ and taking values in $G = \{0, 1, \ldots, b-1\}$, where $b$ is a positive integer. Let $P$ be a general probability distribution on $(\Omega, \mathcal{F})$. We will call $P$ the random field on the tree $T$. Denote the distribution of $\{X_t, t \in T\}$ under the probability measure $P$ by
$$P(x^{T^{(n)}}) = P(X^{T^{(n)}} = x^{T^{(n)}}), \quad x^{T^{(n)}} \in G^{T^{(n)}}. \tag{1}$$
Let
$$f_n(\omega) = -\frac{1}{|T^{(n)}|} \ln P(X^{T^{(n)}}). \tag{2}$$

$f_n(\omega)$ is called the entropy density of $X^{T^{(n)}}$.

Let $Q$ be another probability measure on the measurable space $(\Omega, \mathcal{F})$, and let the distribution of $\{X_t, t \in T\}$ under $Q$ be
$$Q(x^{T^{(n)}}) = Q(X^{T^{(n)}} = x^{T^{(n)}}), \quad x^{T^{(n)}} \in G^{T^{(n)}}. \tag{3}$$
Let
$$h(P \mid Q) = \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \ln \frac{P(X^{T^{(n)}})}{Q(X^{T^{(n)}})}. \tag{4}$$

$h(P \mid Q)$ is called the sample divergence rate of $P$ relative to $Q$.

Remark 1 If $P = Q$, then $h(P \mid Q) = 0$. By the approach of Lemma 1 of Liu and Wang [1], one can also prove that $h(P \mid Q) \ge 0$, $P$-a.e.; hence, $h(P \mid Q)$ can be regarded as a measure of the Markov approximation of the arbitrary random field on $T$.

Definition 1 (see [2]) Let $G = \{0, 1, \ldots, b-1\}$ and let $P(y \mid x_1, x_2, \ldots, x_m)$ be a nonnegative function on $G^{m+1}$. Let
$$P = (P(y \mid x_1, x_2, \ldots, x_m)), \quad P(y \mid x_1, x_2, \ldots, x_m) \ge 0, \quad x_1, x_2, \ldots, x_m, y \in G.$$
If
$$\sum_{y \in G} P(y \mid x_1, x_2, \ldots, x_m) = 1,$$

then $P$ is called an m-order transition matrix.

Definition 2 (see [2]). Let $T$ be an m rooted Cayley tree, let $G = \{0, 1, \ldots, b-1\}$ be a finite state space, and let $\{X_t, t \in T\}$ be a collection of $G$-valued random variables defined on the measurable space $(\Omega, \mathcal{F})$. Let $Q$ be a probability measure on $(\Omega, \mathcal{F})$.

Let
$$q = (q(x_1, x_2, \ldots, x_m)), \quad x_1, x_2, \ldots, x_m \in G \tag{5}$$
be a distribution on $G^m$, and
$$Q_n = (q_n(y \mid x_1, x_2, \ldots, x_m)), \quad x_1, x_2, \ldots, x_m, y \in G, \quad n \ge 1 \tag{6}$$
be m-order transition matrices. For any vertex $t \in L_n$, $n \ge 1$, if
$$\begin{aligned}
&Q(X_t = y \mid X_{1_t} = x_1, X_{2_t} = x_2, \ldots, X_{m_t} = x_m \text{ and } X_\sigma \text{ for } \sigma \wedge t \le 1_t)\\
&\quad = Q(X_t = y \mid X_{1_t} = x_1, X_{2_t} = x_2, \ldots, X_{m_t} = x_m)\\
&\quad = q_n(y \mid x_1, x_2, \ldots, x_m), \quad x_1, x_2, \ldots, x_m, y \in G
\end{aligned} \tag{7}$$
and
$$Q(X_{-(m-1)} = x_1, \ldots, X_{-1} = x_{m-1}, X_o = x_m) = q(x_1, \ldots, x_{m-1}, x_m), \quad x_1, \ldots, x_m \in G, \tag{8}$$

then $\{X_t, t \in T\}$ is called a $G$-valued mth-order nonhomogeneous Markov chain indexed by an m rooted Cayley tree with the initial $m$-dimensional distribution (5) and m-order transition matrices (6) under the probability measure $Q$, or a $T$-indexed mth-order nonhomogeneous Markov chain under the probability measure $Q$.

We denote
$$o_m = \{o, -1, -2, \ldots, -(m-1)\}, \quad o^m = \{-1, -2, \ldots, -(m-1)\},$$
$$X_1^n(t) = \{X_{n_t}, \ldots, X_{2_t}, X_{1_t}\}, \quad X_0^n(t) = \{X_{n_t}, \ldots, X_{2_t}, X_{1_t}, X_t\},$$

and denote by $x_1^n(t)$ and $x_0^n(t)$ the realizations of $X_1^n(t)$ and $X_0^n(t)$, respectively.

Let $\{X_t, t \in T\}$ be an mth-order nonhomogeneous Markov chain indexed by an m rooted Cayley tree $T$ under the probability measure $Q$ defined above. It is easy to see that
$$Q(x^{T^{(n)}}) = Q(X^{T^{(n)}} = x^{T^{(n)}}) = q(x_{-(m-1)}, \ldots, x_o) \prod_{k=1}^{n} \prod_{t \in L_k} q_k(x_t \mid x_1^m(t)). \tag{9}$$

In the following, we always assume that $P(x^{T^{(n)}})$, $Q(x^{T^{(n)}})$, $q(x_1, \ldots, x_m)$, and $\{q_n(y \mid x_1, \ldots, x_m), n \ge 1\}$ are all positive.
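The product formula (9) can be checked by brute force on a very small tree. The following sketch assumes $m = 2$, $N = 2$, $b = 2$ and $n = 2$, so that $T^{(2)}$ has the eight vertices $-1$, $o$, two children of $o$ and four grandchildren; the initial distribution `q_init` and the kernel `q_k` are hypothetical choices made only for illustration. It confirms that the values $Q(x^{T^{(2)}})$ built from (9) are positive and sum to one.

```python
from itertools import product

b = 2  # number of states (illustrative)

# Hypothetical initial distribution q(x_{-1}, x_o) on G^2.
q_init = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}


def q_k(k, y, x2, x1):
    """Hypothetical kernel q_k(y | x2, x1): x2 is the state of the 2nd
    predecessor and x1 the state of the 1st predecessor of the vertex."""
    p1 = (1 + x2 + 2 * x1 + k) / (6 + k)  # probability of y = 1, in (0, 1)
    return p1 if y == 1 else 1.0 - p1


def Q_prob(x):
    """Q(x^{T^(2)}) from (9); x = (x_-1, x_o, x_a, x_b, x_aa, x_ab, x_ba, x_bb)."""
    xm1, xo, xa, xb, xaa, xab, xba, xbb = x
    prob = q_init[(xm1, xo)]
    for child in (xa, xb):  # level 1: predecessors are (-1, o)
        prob *= q_k(1, child, xm1, xo)
    for parent, kids in ((xa, (xaa, xab)), (xb, (xba, xbb))):
        for child in kids:  # level 2: predecessors are (o, parent)
            prob *= q_k(2, child, xo, parent)
    return prob


total = sum(Q_prob(x) for x in product(range(b), repeat=8))
print(total)  # close to 1 up to floating-point rounding
```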

There have been some works on limit theorems for tree-indexed stochastic processes. Benjamini and Peres [3] introduced the notion of tree-indexed Markov chains and studied recurrence and ray-recurrence for them. Berger and Ye [4] studied the existence of the entropy rate for some stationary random fields on a homogeneous tree. Pemantle [5] proved a mixing property and a weak law of large numbers for a PPG-invariant and ergodic random field on a homogeneous tree. Ye and Berger [6, 7], using Pemantle's result and a combinatorial approach, studied the Shannon-McMillan theorem with convergence in probability for a PPG-invariant and ergodic random field on a homogeneous tree. Yang and Liu [8] studied a strong law of large numbers for the frequency of occurrence of states for Markov chain fields on a Bethe tree (a particular case of tree-indexed Markov chain fields and PPG-invariant random fields). Yang [9] studied the strong law of large numbers for the frequency of occurrence of states and the Shannon-McMillan theorem for homogeneous Markov chains indexed by a homogeneous tree. Yang and Ye [10] studied the strong law of large numbers and the Shannon-McMillan theorem for nonhomogeneous Markov chains indexed by a homogeneous tree. Huang and Yang [11] studied the strong law of large numbers and the Shannon-McMillan theorem for Markov chains indexed by an infinite tree with uniformly bounded degree. Recently, Shi and Yang [12] also studied some limit properties of random transition probabilities for second-order nonhomogeneous Markov chains indexed by a tree. Peng et al. [13] studied a class of strong deviation theorems for random fields relative to homogeneous Markov chains indexed by a homogeneous tree. Shi and Yang [2] studied the strong law of large numbers and the Shannon-McMillan theorem for mth-order nonhomogeneous Markov chains indexed by an m rooted Cayley tree. Yang [14] also studied a class of small deviation theorems for sequences of N-valued random variables with respect to mth-order nonhomogeneous Markov chains.

In this paper, our main purpose is to extend Yang's [14] result to an m rooted Cayley tree. By introducing the sample divergence rate of any probability measure with respect to m th-order nonhomogeneous Markov measure on an m rooted Cayley tree, we establish a class of strong deviation theorems for the arbitrary random fields indexed by that tree with respect to m th-order nonhomogeneous Markov chains indexed by that tree. As corollaries, we obtain the strong law of large numbers and Shannon-McMillan theorem for m th-order nonhomogeneous Markov chains indexed by that tree.

2. Main Results

Before giving the main results, we begin with a lemma.

Lemma 1 Let $T$ be an m rooted Cayley tree and let $G = \{0, 1, \ldots, b-1\}$ be the finite state space. Let $\{X_t, t \in T\}$ be a collection of $G$-valued random variables defined on the measurable space $(\Omega, \mathcal{F})$. Let $P$ and $Q$ be two probability measures on $(\Omega, \mathcal{F})$, and let $\{X_t, t \in T\}$ be an mth-order nonhomogeneous Markov chain indexed by the tree $T$ under the probability measure $Q$. Let $\{g_n(y_1, \ldots, y_{m+1}), n \ge 1\}$ be a sequence of functions defined on $G^{m+1}$. Let $\mathcal{F}_n = \sigma(X^{T^{(n)}})$ $(n \ge 1)$. Set
$$F_n(\omega) = \sum_{k=1}^{n} \sum_{t \in L_k} g_k(X_0^m(t)) \tag{10}$$
and
$$t_n(\lambda, \omega) = \frac{e^{\lambda F_n(\omega)}}{\prod_{k=1}^{n} \prod_{t \in L_k} E_Q\left[e^{\lambda g_k(X_0^m(t))} \mid X_1^m(t)\right]} \cdot \frac{q(X_{-(m-1)}, \ldots, X_o) \prod_{k=1}^{n} \prod_{t \in L_k} q_k(X_t \mid X_1^m(t))}{P(X^{T^{(n)}})}, \tag{11}$$

where $E_Q$ denotes the expectation under the probability measure $Q$. Then $\{t_n(\lambda, \omega), \mathcal{F}_n, n \ge 1\}$ is a nonnegative martingale under the probability measure $P$.

Proof The proof is similar to that of Lemma 3 of Peng et al. [13], so it is omitted.

Theorem 1 Let $T$ be an m rooted Cayley tree, and let $\{X_t, t \in T\}$ be a collection of random variables taking values in $G = \{0, 1, \ldots, b-1\}$ defined on the measurable space $(\Omega, \mathcal{F})$. Let $P$ and $Q$ be two probability measures on $(\Omega, \mathcal{F})$ such that $\{X_t, t \in T\}$ is an mth-order nonhomogeneous Markov chain indexed by $T$ under $Q$. Let $h(P \mid Q)$ be defined by (4), and let $\{g_n(y_1, \ldots, y_{m+1}), n \ge 1\}$ be a sequence of functions defined on $G^{m+1}$. Let $c \ge 0$ be a constant. Set
$$D(c) = \{\omega : h(P \mid Q) \le c\}. \tag{12}$$
Assume that there exists $\alpha > 0$ such that, for all $i^m \in G^m$,
$$b_\alpha(i^m) = \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} E_Q\left[e^{\alpha |g_k(X_0^m(t))|} \mid X_1^m(t) = i^m\right] \le \tau. \tag{13}$$
Let
$$A_t = \frac{2\tau}{e^2 (t - \alpha)^2}, \tag{14}$$
where $0 < t < \alpha$. Then, when $0 \le c \le t^2 A_t$, we have
$$\limsup_{n\to\infty} \left| \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \right| \le 2\sqrt{c A_t}, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{15}$$
In particular,
$$\lim_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} = 0, \quad P\text{-a.e.}, \ \omega \in D(0). \tag{16}$$
Proof Let $t_n(\lambda, \omega)$ be defined by (11). By Lemma 1, $\{t_n(\lambda, \omega), \mathcal{F}_n, n \ge 1\}$ is a nonnegative martingale under the probability measure $P$. By Doob's martingale convergence theorem, we have
$$\lim_{n\to\infty} t_n(\lambda, \omega) = t_\infty(\lambda, \omega) < \infty, \quad P\text{-a.e.}$$
Hence,
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \ln t_n(\lambda, \omega) \le 0, \quad P\text{-a.e.} \tag{17}$$
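By (9), (10), (11) and (17), we have
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \left\{ \sum_{k=1}^{n} \sum_{t \in L_k} \left[ \lambda g_k(X_0^m(t)) - \ln E_Q\left[e^{\lambda g_k(X_0^m(t))} \mid X_1^m(t)\right] \right] - \ln \frac{P(X^{T^{(n)}})}{Q(X^{T^{(n)}})} \right\} \le 0, \quad P\text{-a.e.} \tag{18}$$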
By (4), (12) and (18),
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ \lambda g_k(X_0^m(t)) - \ln E_Q\left[e^{\lambda g_k(X_0^m(t))} \mid X_1^m(t)\right] \right\} \le c, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{19}$$
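This implies that
$$\begin{aligned}
&\limsup_{n\to\infty} \frac{\lambda}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\}\\
&\quad \le \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ \ln E_Q\left[e^{\lambda g_k(X_0^m(t))} \mid X_1^m(t)\right] - E_Q\left[\lambda g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} + c, \quad P\text{-a.e.}, \ \omega \in D(c).
\end{aligned} \tag{20}$$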
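Let $|\lambda| < t$. By the inequalities $\ln x \le x - 1$ $(x > 0)$ and $e^x - 1 - x \le \frac{x^2}{2} e^{|x|}$, and noticing that
$$\max\{x^2 e^{-hx}, x \ge 0\} = 4e^{-2}/h^2 \quad (h > 0), \tag{21}$$
we have
$$\begin{aligned}
&\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ \ln E_Q\left[e^{\lambda g_k(X_0^m(t))} \mid X_1^m(t)\right] - E_Q\left[\lambda g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\}\\
&\quad \le \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ E_Q\left[e^{\lambda g_k(X_0^m(t))} \mid X_1^m(t)\right] - 1 - E_Q\left[\lambda g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\}\\
&\quad \le \frac{\lambda^2}{2} \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} E_Q\left[ g_k^2(X_0^m(t)) e^{|\lambda| |g_k(X_0^m(t))|} \mid X_1^m(t) \right]\\
&\quad = \frac{\lambda^2}{2} \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} E_Q\left[ e^{\alpha |g_k(X_0^m(t))|} g_k^2(X_0^m(t)) e^{(|\lambda| - \alpha) |g_k(X_0^m(t))|} \mid X_1^m(t) \right]\\
&\quad \le \frac{\lambda^2}{2} \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} E_Q\left[ e^{\alpha |g_k(X_0^m(t))|} \frac{4e^{-2}}{(|\lambda| - \alpha)^2} \mid X_1^m(t) \right]\\
&\quad \le \frac{2\lambda^2 \tau}{e^2 (t - \alpha)^2}.
\end{aligned} \tag{22}$$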
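The elementary maximum $\max\{x^2 e^{-hx}, x \ge 0\} = 4e^{-2}/h^2$, attained at $x = 2/h$, is the step in the estimate above that produces the factor $4e^{-2}/(|\lambda| - \alpha)^2$. A minimal numerical check, using a plain grid search whose grid range and step count are arbitrary illustrative choices, could look as follows.

```python
import math


def phi(x: float, h: float) -> float:
    """The function x^2 * exp(-h x) whose maximum over x >= 0 is needed."""
    return x * x * math.exp(-h * x)


def closed_form(h: float) -> float:
    """Claimed maximum 4 e^{-2} / h^2, attained at x = 2 / h."""
    return 4.0 * math.exp(-2.0) / h ** 2


def grid_max(h: float, upper: float = 50.0, steps: int = 200_000) -> float:
    """Largest value of phi on an evenly spaced grid over [0, upper]."""
    return max(phi(upper * i / steps, h) for i in range(steps + 1))


if __name__ == "__main__":
    for h in (0.5, 1.0, 3.0):
        print(h, grid_max(h), closed_form(h))
```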
By (20) and (22), we have
$$\limsup_{n\to\infty} \frac{\lambda}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \le \lambda^2 A_t + c, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{23}$$
When $0 < \lambda < t < \alpha$, we have by (23)
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \le \lambda A_t + c/\lambda, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{24}$$
It is easy to see that when $0 < c < t^2 A_t$, the function $f(\lambda) = \lambda A_t + c/\lambda$ attains its smallest value $f(\sqrt{c/A_t}) = 2\sqrt{c A_t}$ at $\lambda = \sqrt{c/A_t}$. Letting $\lambda = \sqrt{c/A_t}$ in (24), we have
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \le 2\sqrt{c A_t}, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{25}$$
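The optimization step can be illustrated numerically. The sketch below evaluates $A_t = 2\tau/(e^2(t-\alpha)^2)$ and the minimizer $\lambda = \sqrt{c/A_t}$ of $f(\lambda) = \lambda A_t + c/\lambda$ for hypothetical values of $\tau$, $\alpha$, $t$ and $c$ chosen so that $0 < c < t^2 A_t$; under that condition the minimizer satisfies $\lambda < t$, which is what allows it to be substituted into the bound.

```python
import math


def A_t(tau: float, t: float, alpha: float) -> float:
    """A_t = 2 tau / (e^2 (t - alpha)^2), for 0 < t < alpha."""
    return 2.0 * tau / (math.e ** 2 * (t - alpha) ** 2)


def f(lam: float, A: float, c: float) -> float:
    """Right-hand side lam * A + c / lam of the bound, for lam > 0."""
    return lam * A + c / lam


def best_lambda(c: float, A: float) -> float:
    """Minimizer sqrt(c / A) of f over lam > 0."""
    return math.sqrt(c / A)


# Hypothetical parameter values, chosen only for illustration.
tau, alpha, t, c = 2.0, 1.0, 0.5, 0.1
A = A_t(tau, t, alpha)
lam_star = best_lambda(c, A)
print(A, lam_star, f(lam_star, A, c), 2.0 * math.sqrt(c * A))
```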
When $c = 0$, we have by (24)
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \le \lambda A_t, \quad P\text{-a.e.}, \ \omega \in D(0). \tag{26}$$
Letting $\lambda \to 0^+$ in (26), we obtain
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \le 0, \quad P\text{-a.e.}, \ \omega \in D(0). \tag{27}$$
Hence, (25) also holds for $c = 0$. When $-\alpha < -t < \lambda < 0$, by virtue of (23) it can be shown in a similar way that
$$\liminf_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \ge -2\sqrt{c A_t}, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{28}$$

Inequality (15) follows from (25) and (28), and (15) implies (16) immediately. This completes the proof of the theorem. □

Theorem 2 Let
$$H_t = \frac{2b}{e^2 (t - 1)^2}, \quad 0 < t < 1. \tag{29}$$
Let $f_n(\omega)$ be defined by (2). Under the conditions of Theorem 1, when $0 \le c \le t^2 H_t$, we have
$$\limsup_{n\to\infty} \left\{ f_n(\omega) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\} \le 2\sqrt{c H_t}, \quad P\text{-a.e.}, \ \omega \in D(c), \tag{30}$$
$$\liminf_{n\to\infty} \left\{ f_n(\omega) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\} \ge -2\sqrt{c H_t} - c, \quad P\text{-a.e.}, \ \omega \in D(c), \tag{31}$$
where $H(p_0, \ldots, p_{b-1})$ denotes the entropy of the distribution $(p_0, \ldots, p_{b-1})$, i.e.,
$$H(p_0, \ldots, p_{b-1}) = -\sum_{i=0}^{b-1} p_i \ln p_i.$$
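For reference, the entropy $H(p_0, \ldots, p_{b-1})$ that appears in the centering term is straightforward to evaluate. The helper below is a minimal sketch; the name `entropy` and the convention of skipping zero entries (which cannot occur under the paper's positivity assumption) are choices made here for illustration.

```python
import math


def entropy(p):
    """H(p_0, ..., p_{b-1}) = -sum_i p_i ln p_i, in nats.

    Zero entries are skipped, following the convention 0 ln 0 = 0.
    """
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


if __name__ == "__main__":
    print(entropy([0.5, 0.5]))        # entropy of a fair coin, ln 2
    print(entropy([0.7, 0.2, 0.1]))   # a non-uniform distribution on 3 states
```

The uniform distribution on $b$ states gives the largest value, $\ln b$.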
Proof In Theorem 1, let $g_k(y_1, \ldots, y_{m+1}) = -\ln q_k(y_{m+1} \mid y_1, \ldots, y_m)$ and $\alpha = 1$. Then
$$E_Q\left[e^{|g_k(X_0^m(t))|} \mid X_1^m(t) = i^m\right] = \sum_{j \in G} e^{|-\ln q_k(j \mid i^m)|} q_k(j \mid i^m) = \sum_{j \in G} \frac{q_k(j \mid i^m)}{q_k(j \mid i^m)} = b. \tag{32}$$
Hence, for all $i^m \in G^m$,
$$b_1(i^m) = \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} E_Q\left[e^{|g_k(X_0^m(t))|} \mid X_1^m(t) = i^m\right] \le b. \tag{33}$$
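The identity behind (32) holds for every positive probability row, whatever its entries: since $0 < q < 1$ gives $e^{|-\ln q|} = 1/q$, each of the $b$ terms equals one. A quick numerical illustration, using a randomly generated row that is purely hypothetical, is sketched below.

```python
import math
import random


def exp_abs_log_moment(row):
    """sum_j exp(|-ln q_j|) * q_j for one positive probability row."""
    return sum(math.exp(abs(-math.log(qj))) * qj for qj in row)


random.seed(0)
b = 5
weights = [random.random() + 0.1 for _ in range(b)]
row = [w / sum(weights) for w in weights]  # a positive distribution on b states
print(exp_abs_log_moment(row))  # equals b up to floating-point rounding
```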
Notice that
$$E_Q\left[-\ln q_k(X_t \mid X_1^m(t)) \mid X_1^m(t)\right] = -\sum_{j \in G} q_k(j \mid X_1^m(t)) \ln q_k(j \mid X_1^m(t)) = H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right]. \tag{34}$$
When $0 \le c \le t^2 H_t$, we have by (34), (29) and (15)
$$\limsup_{n\to\infty} \left\{ \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left(-\ln q_k(X_t \mid X_1^m(t))\right) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\} \le 2\sqrt{c H_t}, \quad P\text{-a.e.}, \ \omega \in D(c), \tag{35}$$
$$\liminf_{n\to\infty} \left\{ \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left(-\ln q_k(X_t \mid X_1^m(t))\right) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\} \ge -2\sqrt{c H_t}, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{36}$$
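By (35), (9) and $h(P \mid Q) \ge 0$,
$$\begin{aligned}
&\limsup_{n\to\infty} \left\{ f_n(\omega) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\}\\
&\quad \le \limsup_{n\to\infty} \left\{ -\frac{1}{|T^{(n)}|} \ln P(X^{T^{(n)}}) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left(-\ln q_k(X_t \mid X_1^m(t))\right) \right\}\\
&\qquad + \limsup_{n\to\infty} \left\{ \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left(-\ln q_k(X_t \mid X_1^m(t))\right) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\}\\
&\quad \le 2\sqrt{c H_t}, \quad P\text{-a.e.}, \ \omega \in D(c).
\end{aligned} \tag{37}$$
By (36), (9) and (12), we have
$$\begin{aligned}
&\liminf_{n\to\infty} \left\{ f_n(\omega) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\}\\
&\quad \ge \liminf_{n\to\infty} \left\{ -\frac{1}{|T^{(n)}|} \ln P(X^{T^{(n)}}) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left(-\ln q_k(X_t \mid X_1^m(t))\right) \right\}\\
&\qquad + \liminf_{n\to\infty} \left\{ \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left(-\ln q_k(X_t \mid X_1^m(t))\right) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\}\\
&\quad \ge -2\sqrt{c H_t} - c, \quad P\text{-a.e.}, \ \omega \in D(c).
\end{aligned} \tag{38}$$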

This completes the proof of this theorem. □

Corollary 1 Under the conditions of Theorem 2, we have
$$\lim_{n\to\infty} \left\{ f_n(\omega) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\} = 0, \quad P\text{-a.e.}, \ \omega \in D(0). \tag{39}$$
If $P \ll Q$, then
$$\lim_{n\to\infty} \left\{ f_n(\omega) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\} = 0, \quad P\text{-a.e.} \tag{40}$$
In particular, if $P = Q$,
$$\lim_{n\to\infty} \left\{ f_n(\omega) - \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right] \right\} = 0, \quad Q\text{-a.e.} \tag{41}$$

Proof Letting $c = 0$ in (30) and (31), (39) follows. If $P \ll Q$, then $h(P \mid Q) = 0$, $P$-a.e. (see [15], p. 121), i.e., $P(D(0)) = 1$. Hence, (40) follows from (39). In particular, if $P = Q$, then $h(P \mid Q) \equiv 0$. Hence, (41) follows from (40). □

Theorem 3 Under the conditions of Theorem 1, if $\{g_n(y_1, \ldots, y_{m+1}), n \ge 1\}$ is uniformly bounded, i.e., there exists $M > 0$ such that $|g_n(y_1, \ldots, y_{m+1})| \le M$, then when $c \ge 0$, we have
$$\limsup_{n\to\infty} \left| \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \right| \le M(c + 2\sqrt{c}), \quad P\text{-a.e.}, \ \omega \in D(c). \tag{42}$$
Proof By (20), (12) and the second line of (22), we have
$$\begin{aligned}
&\limsup_{n\to\infty} \frac{\lambda}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\}\\
&\quad \le \limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} E_Q\left[ e^{\lambda g_k(X_0^m(t))} - 1 - \lambda g_k(X_0^m(t)) \mid X_1^m(t) \right] + c, \quad P\text{-a.e.}, \ \omega \in D(c).
\end{aligned} \tag{43}$$
By the hypothesis of the theorem and the inequality $e^x - 1 - x \le |x|(e^{|x|} - 1)$, we have
$$e^{\lambda g_k(X_0^m(t))} - 1 - \lambda g_k(X_0^m(t)) \le |\lambda| M (e^{|\lambda| M} - 1). \tag{44}$$
By (43) and (44),
$$\limsup_{n\to\infty} \frac{\lambda}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \le |\lambda| M (e^{|\lambda| M} - 1) + c, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{45}$$
When $\lambda > 0$, we have by (45)
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \le M(e^{\lambda M} - 1) + c/\lambda, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{46}$$
Taking $\lambda = \frac{1}{M} \log(1 + \sqrt{c})$ and using the inequality
$$\log(1 + \sqrt{c}) \ge \frac{\sqrt{c}}{1 + \sqrt{c}}, \tag{47}$$
we have, when $c > 0$,
$$\limsup_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \le M\sqrt{c} + \frac{cM}{\log(1 + \sqrt{c})} \le M(2\sqrt{c} + c), \quad P\text{-a.e.}, \ \omega \in D(c). \tag{48}$$
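The choice $\lambda = \frac{1}{M}\log(1 + \sqrt{c})$ turns the bound $M(e^{\lambda M} - 1) + c/\lambda$ into $M\sqrt{c} + cM/\log(1 + \sqrt{c})$, and the logarithmic inequality $\log(1 + \sqrt{c}) \ge \sqrt{c}/(1 + \sqrt{c})$ then yields $M(2\sqrt{c} + c)$. The sketch below checks both steps numerically for a few hypothetical values of $c$ and $M$; the function names are illustrative.

```python
import math


def bound_at_chosen_lambda(c: float, M: float) -> float:
    """M (e^{lam M} - 1) + c / lam evaluated at lam = log(1 + sqrt(c)) / M."""
    lam = math.log(1.0 + math.sqrt(c)) / M
    return M * (math.exp(lam * M) - 1.0) + c / lam


def final_bound(c: float, M: float) -> float:
    """The simplified bound M (2 sqrt(c) + c)."""
    return M * (2.0 * math.sqrt(c) + c)


if __name__ == "__main__":
    for c in (0.01, 0.5, 1.0, 4.0, 100.0):
        for M in (0.5, 1.0, 3.0):
            print(c, M, bound_at_chosen_lambda(c, M), final_bound(c, M))
```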
When $\lambda < 0$, it follows from (45) that
$$\liminf_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \ge -M(e^{-\lambda M} - 1) + c/\lambda, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{49}$$
Taking $\lambda = -\frac{1}{M} \log(1 + \sqrt{c})$ in (49) and using (47), we have, when $c > 0$,
$$\liminf_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g_k(X_0^m(t)) - E_Q\left[g_k(X_0^m(t)) \mid X_1^m(t)\right] \right\} \ge -M\sqrt{c} - \frac{cM}{\log(1 + \sqrt{c})} \ge -M(2\sqrt{c} + c), \quad P\text{-a.e.}, \ \omega \in D(c). \tag{50}$$

In a similar way, it can be shown that (48) and (50) also hold when $c = 0$. By (48) and (50), (42) holds. This completes the proof of the theorem. □

Corollary 2 Under the conditions of Theorem 1, let $g(y_1, \ldots, y_{m+1})$ be any function defined on $G^{m+1}$. Let $M = \max g(y_1, \ldots, y_{m+1})$. Then when $c \ge 0$,
$$\limsup_{n\to\infty} \left| \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g(X_0^m(t)) - E_Q\left[g(X_0^m(t)) \mid X_1^m(t)\right] \right\} \right| \le M(c + 2\sqrt{c}), \quad P\text{-a.e.}, \ \omega \in D(c). \tag{51}$$

Proof Letting $g_n(y_1, \ldots, y_{m+1}) = g(y_1, \ldots, y_{m+1})$, $n \ge 1$, in Theorem 3, this corollary follows.

In the following, let
$$I_k(x) = \begin{cases} 1, & x = k, \\ 0, & x \neq k. \end{cases}$$
Let $S_{T^{(n)} \setminus o^m}(i_1, \ldots, i_m)$ be the number of $(i_1, \ldots, i_m)$ in the collection $\{X_0^{m-1}(t), t \in T^{(n)} \setminus o^m\}$, that is,
$$S_{T^{(n)} \setminus o^m}(i_1, \ldots, i_m) = \sum_{k=0}^{n} \sum_{t \in L_k} I_{i_1}(X_{(m-1)_t}) \cdots I_{i_m}(X_t), \tag{52}$$
and let $S_{T^{(n)} \setminus o_m}(i_1, \ldots, i_m, i_{m+1})$ be the number of $(i_1, \ldots, i_m, i_{m+1})$ in the collection $\{X_0^m(t), t \in T^{(n)} \setminus o_m\}$, that is,
$$S_{T^{(n)} \setminus o_m}(i_1, \ldots, i_m, i_{m+1}) = \sum_{k=1}^{n} \sum_{t \in L_k} I_{i_1}(X_{m_t}) \cdots I_{i_{m+1}}(X_t). \tag{53}$$
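The two counting statistics can be computed directly from a realization on a finite subtree. The sketch below assumes $m = 2$, $N = 2$ and $n = 2$, encodes the tree by a hypothetical predecessor map `parent`, and uses an arbitrary illustrative configuration `x`; it counts the $m$-tuples of (52) over levels $0$ to $n$ and the $(m+1)$-tuples of (53) over levels $1$ to $n$.

```python
from collections import Counter

# Hypothetical tree T^(2) with m = 2 and N = 2: vertex -> predecessor.
parent = {"o": "-1", "a": "o", "b": "o",
          "aa": "a", "ab": "a", "ba": "b", "bb": "b"}
level = {"-1": -1, "o": 0, "a": 1, "b": 1,
         "aa": 2, "ab": 2, "ba": 2, "bb": 2}


def chain(t, length):
    """Vertices ((length-1)-th predecessor, ..., 1st predecessor, t)."""
    path = [t]
    for _ in range(length - 1):
        path.append(parent[path[-1]])
    return tuple(reversed(path))


def tuple_counts(x, length, min_level):
    """Count state tuples along predecessor chains ending at level >= min_level."""
    return Counter(
        tuple(x[v] for v in chain(t, length))
        for t in level
        if level[t] >= min_level
    )


# An arbitrary illustrative configuration on T^(2).
x = {"-1": 0, "o": 1, "a": 1, "b": 0, "aa": 1, "ab": 1, "ba": 0, "bb": 1}
m = 2
S_m = tuple_counts(x, m, 0)           # counts as in (52)
S_m_plus_1 = tuple_counts(x, m + 1, 1)  # counts as in (53)
print(dict(S_m))
print(dict(S_m_plus_1))
```

Summing the counts in `S_m` over all tuples gives $|T^{(n)}| - (m-1)$, the number of vertices on levels $0$ to $n$.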
Corollary 3 Let $\{X_t, t \in T\}$ be defined as before. Then for all $i_1, \ldots, i_{m+1} \in G$ and $c \ge 0$, we have
$$\limsup_{n\to\infty} \left| \frac{S_{T^{(n)} \setminus o^m}(i_1, \ldots, i_m)}{|T^{(n)}|} - \frac{1}{|T^{(n-1)}|} \sum_{l \in G} \sum_{k=0}^{n-1} \sum_{t \in L_k} I_l(X_{(m-1)_t}) I_{i_1}(X_{(m-2)_t}) \cdots I_{i_{m-1}}(X_t) \, q_{k+1}(i_m \mid l, i_1, \ldots, i_{m-1}) \right| \le c + 2\sqrt{c}, \quad P\text{-a.e.}, \ \omega \in D(c), \tag{54}$$
$$\limsup_{n\to\infty} \left| \frac{S_{T^{(n)} \setminus o_m}(i_1, \ldots, i_{m+1})}{|T^{(n)}|} - \frac{1}{|T^{(n-1)}|} \sum_{k=0}^{n-1} \sum_{t \in L_k} I_{i_1}(X_{(m-1)_t}) I_{i_2}(X_{(m-2)_t}) \cdots I_{i_m}(X_t) \, q_{k+1}(i_{m+1} \mid i_1, \ldots, i_m) \right| \le c + 2\sqrt{c}, \quad P\text{-a.e.}, \ \omega \in D(c). \tag{55}$$
Proof Let $g(y_1, \ldots, y_{m+1}) = I_{i_1}(y_2) \cdots I_{i_m}(y_{m+1})$ in Corollary 2. Then
$$\sum_{k=1}^{n} \sum_{t \in L_k} g(X_0^m(t)) = \sum_{k=1}^{n} \sum_{t \in L_k} I_{i_1}(X_{(m-1)_t}) \cdots I_{i_m}(X_t) = S_{T^{(n)} \setminus o^m}(i_1, \ldots, i_m) - I_{i_1}(X_{-(m-1)}) \cdots I_{i_m}(X_o), \tag{56}$$
and
$$\begin{aligned}
\sum_{k=1}^{n} \sum_{t \in L_k} E_Q\left[g(X_0^m(t)) \mid X_1^m(t)\right]
&= \sum_{k=1}^{n} \sum_{t \in L_k} \sum_{x_t \in G} g(X_1^m(t), x_t) \, q_k(x_t \mid X_1^m(t))\\
&= \sum_{k=1}^{n} \sum_{t \in L_k} \sum_{x_t \in G} I_{i_1}(X_{(m-1)_t}) \cdots I_{i_{m-1}}(X_{1_t}) I_{i_m}(x_t) \, q_k(x_t \mid X_1^m(t))\\
&= \sum_{k=1}^{n} \sum_{t \in L_k} I_{i_1}(X_{(m-1)_t}) \cdots I_{i_{m-1}}(X_{1_t}) \, q_k(i_m \mid X_1^m(t))\\
&= \sum_{l \in G} \sum_{k=1}^{n} \sum_{t \in L_k} I_l(X_{m_t}) I_{i_1}(X_{(m-1)_t}) \cdots I_{i_{m-1}}(X_{1_t}) \, q_k(i_m \mid l, i_1, \ldots, i_{m-1})\\
&= N \sum_{l \in G} \sum_{k=0}^{n-1} \sum_{t \in L_k} I_l(X_{(m-1)_t}) I_{i_1}(X_{(m-2)_t}) \cdots I_{i_{m-1}}(X_t) \, q_{k+1}(i_m \mid l, i_1, \ldots, i_{m-1}).
\end{aligned} \tag{57}$$

Noticing that $M = \max g(y_1, \ldots, y_{m+1}) = 1$ and $\lim_{n\to\infty} \frac{|T^{(n-1)}|}{|T^{(n)}|} = \frac{1}{N}$, (54) holds by (56), (57) and Corollary 2. Similarly, letting $g(y_1, \ldots, y_{m+1}) = I_{i_1}(y_1) \cdots I_{i_{m+1}}(y_{m+1})$, (55) follows.

Corollary 4 Let $\{X_t, t \in T\}$ be defined as before. Then
$$\lim_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left\{ g(X_0^m(t)) - E_Q\left[g(X_0^m(t)) \mid X_1^m(t)\right] \right\} = 0, \quad P\text{-a.e.}, \ \omega \in D(0), \tag{58}$$
$$\lim_{n\to\infty} \left\{ \frac{S_{T^{(n)} \setminus o^m}(i_1, \ldots, i_m)}{|T^{(n)}|} - \frac{1}{|T^{(n-1)}|} \sum_{l \in G} \sum_{k=0}^{n-1} \sum_{t \in L_k} I_l(X_{(m-1)_t}) I_{i_1}(X_{(m-2)_t}) \cdots I_{i_{m-1}}(X_t) \, q_{k+1}(i_m \mid l, i_1, \ldots, i_{m-1}) \right\} = 0, \quad P\text{-a.e.}, \ \omega \in D(0), \tag{59}$$
$$\lim_{n\to\infty} \left\{ \frac{S_{T^{(n)} \setminus o_m}(i_1, \ldots, i_{m+1})}{|T^{(n)}|} - \frac{1}{|T^{(n-1)}|} \sum_{k=0}^{n-1} \sum_{t \in L_k} I_{i_1}(X_{(m-1)_t}) I_{i_2}(X_{(m-2)_t}) \cdots I_{i_m}(X_t) \, q_{k+1}(i_{m+1} \mid i_1, \ldots, i_m) \right\} = 0, \quad P\text{-a.e.}, \ \omega \in D(0). \tag{60}$$

If $P = Q$, then the above equations hold $Q$-a.e.

Proof Letting $c = 0$ in Corollary 2 and Corollary 3, (58)-(60) follow from (51), (54) and (55). In particular, if $P = Q$, then $h(P \mid Q) = 0$, so (58)-(60) hold $P$-a.e. and hence $Q$-a.e.

Definition 3 Let $G = \{0, 1, \ldots, b-1\}$ be a finite state space and
$$Q_1 = (q(j \mid i^m)), \quad j \in G, \ i^m \in G^m \tag{61}$$
be an mth-order transition matrix. Define a stochastic matrix as follows:
$$\bar{Q}_1 = (q(j^m \mid i^m)), \quad i^m, j^m \in G^m, \tag{62}$$
where
$$q(j^m \mid i^m) = \begin{cases} q(j_m \mid i^m), & \text{if } j_v = i_{v+1}, \ v = 1, 2, \ldots, m-1, \\ 0, & \text{otherwise}. \end{cases} \tag{63}$$

Then $\bar{Q}_1$ is called an $m$-dimensional stochastic matrix determined by the mth-order transition matrix $Q_1$.

Lemma 2 (see [16]). Let $\bar{Q}_1$ be an $m$-dimensional stochastic matrix determined by the mth-order transition matrix $Q_1$. If the elements of $Q_1$ are all positive, that is,
$$Q_1 = (q(j \mid i^m)), \quad q(j \mid i^m) > 0, \quad j \in G, \ i^m \in G^m, \tag{64}$$

then $\bar{Q}_1$ is ergodic.

Theorem 4 Let $\{X_t, t \in T\}$ be defined as in Theorem 1. Let $S_{T^{(n)} \setminus o^m}(i_1, \ldots, i_m) = S_{T^{(n)} \setminus o^m}(i^m)$, $S_{T^{(n)} \setminus o_m}(i_1, \ldots, i_m, i_{m+1}) = S_{T^{(n)} \setminus o_m}(i^{m+1})$ and $f_n(\omega)$ be defined by (52), (53) and (2), respectively. Let $h(P \mid Q)$ and $D(c)$ be defined by (4) and (12), respectively. Let the mth-order transition matrices defined by (6) be independent of $n$, that is,
$$Q_n = Q_1 = (q(j \mid i^m)), \tag{65}$$
so that $\{X_t, t \in T\}$ is an mth-order homogeneous Markov chain indexed by the tree $T$ with the mth-order transition matrix $Q_1$ under the probability measure $Q$. Let the $m$-dimensional stochastic matrix $\bar{Q}_1$ determined by $Q_1$ be ergodic. Then for all $i_1, \ldots, i_{m+1} \in G$, we have
$$\lim_{n\to\infty} \frac{S_{T^{(n)} \setminus o^m}(i^m)}{|T^{(n)}|} = \pi(i^m), \quad P\text{-a.e.}, \ \omega \in D(0), \tag{66}$$
$$\lim_{n\to\infty} \frac{S_{T^{(n)} \setminus o_m}(i^{m+1})}{|T^{(n)}|} = \pi(i^m) \, q(i_{m+1} \mid i^m), \quad P\text{-a.e.}, \ \omega \in D(0), \tag{67}$$
$$\lim_{n\to\infty} f_n(\omega) = -\sum_{i^m \in G^m} \sum_{j \in G} \pi(i^m) \, q(j \mid i^m) \ln q(j \mid i^m), \quad P\text{-a.e.}, \ \omega \in D(0), \tag{68}$$

where $\{\pi(i^m), i^m \in G^m\}$ is the stationary distribution determined by $\bar{Q}_1$.
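The limits in Theorem 4 are explicit once the stationary distribution of $\bar{Q}_1$ is known. The following sketch builds the entries of $\bar{Q}_1$ from a positive kernel according to Definition 3, approximates $\pi$ by power iteration (an approach chosen here for simplicity, not taken from the paper), and evaluates the right-hand sides of (67) and (68). The kernel `q`, the sizes $b = 2$, $m = 2$ and the iteration count are hypothetical.

```python
import math
from itertools import product

b, m = 2, 2  # illustrative number of states and order


def q(j, i):
    """Hypothetical positive m-th order kernel q(j | i^m), with i = (i_1, i_2)."""
    p1 = (1 + i[0] + 2 * i[1]) / 5.0  # probability of j = 1, in (0, 1)
    return p1 if j == 1 else 1.0 - p1


states = list(product(range(b), repeat=m))  # the set G^m


def q_bar(j, i):
    """Entry q(j^m | i^m) of the m-dimensional stochastic matrix, as in (63)."""
    return q(j[-1], i) if j[:-1] == i[1:] else 0.0


def stationary(iters=5000):
    """Approximate the stationary distribution of Q-bar by power iteration."""
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iters):
        pi = {j: sum(pi[i] * q_bar(j, i) for i in states) for j in states}
    return pi


pi = stationary()
# Limiting frequencies of (m+1)-tuples, the right-hand side of (67).
freq = {i + (j,): pi[i] * q(j, i) for i in states for j in range(b)}
# Limiting entropy density, the right-hand side of (68).
entropy_rate = -sum(
    pi[i] * q(j, i) * math.log(q(j, i)) for i in states for j in range(b)
)
print(pi)
print(entropy_rate)
```

Because every entry of the kernel is positive, Lemma 2 guarantees that $\bar{Q}_1$ is ergodic, which is why the power iteration settles on a single distribution regardless of the starting vector.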

Proof Proof of (66). Let $k^m = (k_1, \ldots, k_m)$. If (65) holds, then we have by (63) and (52)
$$\begin{aligned}
&\sum_{l \in G} \sum_{k=0}^{n-1} \sum_{t \in L_k} I_l(X_{(m-1)_t}) I_{i_1}(X_{(m-2)_t}) \cdots I_{i_{m-1}}(X_t) \, q_{k+1}(i_m \mid l, i_1, \ldots, i_{m-1})\\
&\quad = \sum_{l \in G} \sum_{k=0}^{n-1} \sum_{t \in L_k} I_l(X_{(m-1)_t}) I_{i_1}(X_{(m-2)_t}) \cdots I_{i_{m-1}}(X_t) \, q(i_m \mid l, i_1, \ldots, i_{m-1})\\
&\quad = \sum_{l \in G} S_{T^{(n-1)} \setminus o^m}(l, i_1, \ldots, i_{m-1}) \, q(i_m \mid l, i_1, \ldots, i_{m-1})\\
&\quad = \sum_{k^m \in G^m} S_{T^{(n-1)} \setminus o^m}(k^m) \, q(i^m \mid k^m).
\end{aligned} \tag{69}$$
By (59) and (69), we have
$$\lim_{n\to\infty} \left\{ \frac{S_{T^{(n)} \setminus o^m}(i^m)}{|T^{(n)}|} - \frac{1}{|T^{(n-1)}|} \sum_{k^m \in G^m} S_{T^{(n-1)} \setminus o^m}(k^m) \, q(i^m \mid k^m) \right\} = 0, \quad P\text{-a.e.}, \ \omega \in D(0). \tag{70}$$
Multiplying (70) by $q(j^m \mid i^m)$, adding them together for $i^m \in G^m$, and using (70) once again, we have
$$\begin{aligned}
0 &= \sum_{i^m \in G^m} q(j^m \mid i^m) \lim_{n\to\infty} \left\{ \frac{S_{T^{(n)} \setminus o^m}(i^m)}{|T^{(n)}|} - \frac{1}{|T^{(n-1)}|} \sum_{k^m \in G^m} S_{T^{(n-1)} \setminus o^m}(k^m) \, q(i^m \mid k^m) \right\}\\
&= \lim_{n\to\infty} \left\{ \sum_{i^m \in G^m} \frac{S_{T^{(n)} \setminus o^m}(i^m)}{|T^{(n)}|} \, q(j^m \mid i^m) - \frac{S_{T^{(n+1)} \setminus o^m}(j^m)}{|T^{(n+1)}|} \right\}\\
&\quad + \lim_{n\to\infty} \left\{ \frac{S_{T^{(n+1)} \setminus o^m}(j^m)}{|T^{(n+1)}|} - \frac{1}{|T^{(n-1)}|} \sum_{k^m \in G^m} S_{T^{(n-1)} \setminus o^m}(k^m) \sum_{i^m \in G^m} q(j^m \mid i^m) \, q(i^m \mid k^m) \right\}\\
&= \lim_{n\to\infty} \left\{ \frac{S_{T^{(n+1)} \setminus o^m}(j^m)}{|T^{(n+1)}|} - \frac{1}{|T^{(n-1)}|} \sum_{k^m \in G^m} S_{T^{(n-1)} \setminus o^m}(k^m) \, q^{(2)}(j^m \mid k^m) \right\}, \quad P\text{-a.e.}, \ \omega \in D(0).
\end{aligned}$$
By induction, we have
$$\lim_{n\to\infty} \left\{ \frac{S_{T^{(n+N)} \setminus o^m}(j^m)}{|T^{(n+N)}|} - \frac{1}{|T^{(n-1)}|} \sum_{k^m \in G^m} S_{T^{(n-1)} \setminus o^m}(k^m) \, q^{(N+1)}(j^m \mid k^m) \right\} = 0, \quad P\text{-a.e.}, \ \omega \in D(0), \tag{71}$$
where $q^{(h)}(j^m \mid k^m)$ is the $h$-step transition probability determined by $\bar{Q}_1$. By ergodicity, we have
$$\lim_{N\to\infty} q^{(N+1)}(j^m \mid k^m) = \pi(j^m), \quad k^m \in G^m, \tag{72}$$

and $\sum_{k^m \in G^m} S_{T^{(n-1)} \setminus o^m}(k^m) = |T^{(n-1)}| - (m-1)$. Hence (66) follows from (71) and (72). By (66) and (60), (67) follows easily.

Proof of Equation 68. By (66) and (53), we have
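$$\begin{aligned}
&\sum_{k=1}^{n} \sum_{t \in L_k} H\left[q_k(0 \mid X_1^m(t)), \ldots, q_k(b-1 \mid X_1^m(t))\right]\\
&\quad = \sum_{k=1}^{n} \sum_{t \in L_k} H\left[q(0 \mid X_1^m(t)), \ldots, q(b-1 \mid X_1^m(t))\right]\\
&\quad = -\sum_{k=1}^{n} \sum_{t \in L_k} \sum_{j \in G} q(j \mid X_1^m(t)) \ln q(j \mid X_1^m(t))\\
&\quad = -\sum_{k=1}^{n} \sum_{t \in L_k} \sum_{j \in G} \sum_{i^m \in G^m} I_{i_1}(X_{m_t}) \cdots I_{i_m}(X_{1_t}) \, q(j \mid i^m) \ln q(j \mid i^m)\\
&\quad = -N \sum_{k=0}^{n-1} \sum_{t \in L_k} \sum_{j \in G} \sum_{i^m \in G^m} I_{i_1}(X_{(m-1)_t}) \cdots I_{i_m}(X_t) \, q(j \mid i^m) \ln q(j \mid i^m)\\
&\quad = -N \sum_{j \in G} \sum_{i^m \in G^m} S_{T^{(n-1)} \setminus o^m}(i^m) \, q(j \mid i^m) \ln q(j \mid i^m).
\end{aligned} \tag{73}$$

Noticing that $\lim_{n\to\infty} \frac{|T^{(n-1)}|}{|T^{(n)}|} = \frac{1}{N}$, (68) follows by (39), (73) and (66). □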

3. Shannon-McMillan Theorem

Theorem 5 Let $\{X_t, t \in T\}$ be a $G$-valued mth-order nonhomogeneous Markov chain indexed by an m rooted Cayley tree under the probability measure $Q$ with initial distribution (5) and mth-order transition matrices (6). Let $S_{T^{(n)} \setminus o^m}(i^m)$, $S_{T^{(n)} \setminus o_m}(i^{m+1})$ and $f_n(\omega)$ be defined as before. Let
$$Q_1 = (q(j \mid i^m)), \quad q(j \mid i^m) > 0, \quad i^m \in G^m, \ j \in G, \tag{74}$$
be another positive mth-order transition matrix. Let $\bar{Q}_1$ be the $m$-dimensional stochastic matrix determined by $Q_1$. If
$$\lim_{n\to\infty} q_n(j \mid i^m) = q(j \mid i^m), \quad i^m \in G^m, \ j \in G, \tag{75}$$
then
$$\lim_{n\to\infty} \frac{S_{T^{(n)} \setminus o^m}(i^m)}{|T^{(n)}|} = \pi(i^m), \quad P\text{-a.e.}, \ \omega \in D(0), \tag{76}$$
$$\lim_{n\to\infty} \frac{S_{T^{(n)} \setminus o_m}(i^{m+1})}{|T^{(n)}|} = \pi(i^m) \, q(i_{m+1} \mid i^m), \quad P\text{-a.e.}, \ \omega \in D(0), \tag{77}$$
$$\lim_{n\to\infty} f_n(\omega) = -\sum_{i^m \in G^m} \sum_{j \in G} \pi(i^m) \, q(j \mid i^m) \ln q(j \mid i^m), \quad P\text{-a.e.}, \ \omega \in D(0), \tag{78}$$

where $\{\pi(i^m), i^m \in G^m\}$ is the stationary distribution determined by $\bar{Q}_1$. In particular, if $P = Q$, then the above equations hold $Q$-a.e.

Proof By (59), (75), (52) and (66), (76) follows immediately. Similarly, by (60), (75) and (53), (77) follows. It follows from (75) and the Cesàro average that
$$\lim_{n\to\infty} \frac{1}{|T^{(n)}|} \sum_{k=1}^{n} \sum_{t \in L_k} \left| q_k(j \mid i^m) \ln q_k(j \mid i^m) - q(j \mid i^m) \ln q(j \mid i^m) \right| = 0, \quad i^m \in G^m, \ j \in G. \tag{79}$$
Notice that
| 1 | T ( n ) | k = 1 n t L k H [ q k ( 0 | X 1 m ( t ) ) , , q k ( 0 | X 1 m ( t ) ) ] - 1 | T ( n ) | k = 1 n t L k H [ q ( 0 | X 1 m ( t ) ) , , q ( 0 | X 1 m ( t ) ) ] | = | - 1 | T