Admissibility of simultaneous prediction for actual and average values in finite population

This paper studies the admissibility of simultaneous prediction of actual and average values of the regressand in the generalized linear regression model under the quadratic loss function. Necessary and sufficient conditions are derived for the simultaneous prediction to be admissible in classes of homogeneous and nonhomogeneous linear predictors, respectively.


Introduction
We begin with some notation. For an $m \times n$ matrix $A$, the symbols $M(A)$, $A'$, $A^-$ and $A^+$ denote the column space, the transpose, a generalized inverse and the Moore-Penrose inverse of $A$, respectively. For square matrices $B$ and $C$, $B \ge C$ means that $B - C$ is a symmetric nonnegative definite matrix, that is, $B - C \ge 0$. Let $\mathrm{rk}(A)$ be the rank of $A$ and $\mathrm{tr}(A)$ be the trace of $A$ when $A$ is a square matrix. $I$ is an identity matrix of appropriate order and $R^n$ denotes the set of $n$-dimensional real vectors.
Consider the following generalized linear regression model:

$$y = X\beta + \varepsilon, \quad E(\varepsilon) = 0, \quad \mathrm{Cov}(\varepsilon) = \Sigma, \tag{1}$$

where $y$ is the $n \times 1$ observable vector of the regressand, $X$ is the $n \times p$ observable matrix of regressors, $\beta$ is a $p \times 1$ unknown vector of regression coefficients, $\varepsilon$ is the $n \times 1$ vector of disturbances and $\Sigma \ge 0$. Suppose $\mathrm{rk}(X) \le p$. Given the matrix of regressors $X_0$ (which is associated with new observations), the relationship between the unobservable random vector $y_0$ and $X_0$ is

$$y_0 = X_0\beta + \varepsilon_0, \quad E(\varepsilon_0) = 0, \quad \mathrm{Cov}(\varepsilon_0) = \Sigma_0, \tag{2}$$

where $y_0$ is the $m \times 1$ vector of the regressand to be predicted, $X_0$ is the $m \times p$ matrix of prediction regressors, $\beta$ is the same as in model (1), $\varepsilon_0$ is the $m \times 1$ vector of prediction disturbances and $\Sigma_0 \ge 0$. Assume $y$ and $y_0$ are correlated and $\mathrm{Cov}(\varepsilon_0, \varepsilon) = V$. Suppose the finite population is composed of $y$ and $y_0$. Thus, combining models (1) and (2), we have

$$y_T = X_T\beta + \varepsilon_T, \quad E(\varepsilon_T) = 0, \quad \mathrm{Cov}(\varepsilon_T) = \Sigma_T = \begin{pmatrix} \Sigma & V' \\ V & \Sigma_0 \end{pmatrix}, \tag{3}$$

where $y_T = (y', y_0')'$, $X_T = (X', X_0')'$ and $\varepsilon_T = (\varepsilon', \varepsilon_0')'$.

For prediction in model (3), [1] obtained the best linear unbiased predictor (BLUP) of $y_0$. [2] considered the optimal Stein-rule prediction. Reference [3] investigated the admissibility of linear predictors with inequality constraints under the quadratic loss function. [4] reviewed the existing theory of minimum mean squared error (MSE) predictors and extended it based on the principle of equivariance. [5] derived the BLUP and the admissible predictor under the matrix loss function. Under the MSE loss function, the optimal predictor of $y_0$ is the conditional expectation $E(y_0 \mid X_0) = X_0\beta$, which relates naturally to plug-in estimators of $\beta$. [6] proposed the simple projection predictor (SPP) of $X_0\beta$ by plugging in the best linear unbiased estimator (BLUE) of $\beta$. The plug-in approach spawned a large literature on the derivation of combined prediction; see [7,8], etc.
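As a small numerical illustration of how models (1) and (2) combine into model (3), the following sketch stacks the regressor matrices and assembles the joint disturbance covariance from its blocks. The names `Sigma`, `Sigma0` and `stack_model` are assumptions of this sketch (standing for Cov(ε), Cov(ε_0) and a hypothetical helper), and the normal disturbances and all numerical values are chosen only for illustration.

```python
import numpy as np

def stack_model(X, X0, Sigma, Sigma0, V):
    """Assemble X_T and Cov(eps_T) for the combined model (3).

    V is Cov(eps_0, eps), an m x n matrix, so the upper-right block is V'.
    """
    X_T = np.vstack([X, X0])                          # (n + m) x p
    Sigma_T = np.block([[Sigma, V.T], [V, Sigma0]])   # (n + m) x (n + m)
    return X_T, Sigma_T

rng = np.random.default_rng(0)
n, m, p = 6, 3, 2
X, X0 = rng.normal(size=(n, p)), rng.normal(size=(m, p))

# Build a valid joint covariance first, then read off its blocks,
# so that Sigma, Sigma0 and V are mutually compatible.
G = rng.normal(size=(n + m, n + m))
joint = G @ G.T
Sigma, Sigma0, V = joint[:n, :n], joint[n:, n:], joint[n:, :n]

X_T, Sigma_T = stack_model(X, X0, Sigma, Sigma0, V)

# Simulate one finite population y_T = (y', y_0')' (normality is an assumption).
beta = np.array([1.0, -2.0])
eps_T = rng.multivariate_normal(np.zeros(n + m), Sigma_T)
y_T = X_T @ beta + eps_T
y, y0 = y_T[:n], y_T[n:]   # observed part and part to be predicted
```

Only `y`, `X` and `X0` would be available to the analyst; `y0` is kept here so that predictors can be compared against it in simulations.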
Generally, predictions are investigated either for $y_0$ or for $Ey_0$, one at a time. However, in fields such as medicine and economics, one sometimes wants to know the actual value of $y_0$ and its average value $Ey_0$ simultaneously. For example, in financial markets, some investors may want to know the actual profit while others are more interested in the average profit. To meet both requirements, the market manager should obtain the prediction of the actual profit and the prediction of the average profit at the same time, and can assign different weights to each to provide a more comprehensive combined prediction of the profit. Under these circumstances, we consider the following target function:

$$\delta = \lambda y_0 + (1 - \lambda)Ey_0,$$

where $\lambda \in [0, 1]$ is a non-stochastic scalar representing the preference between the prediction of the actual value and that of the average value of the studied variable. A prediction of $\delta$ is a simultaneous prediction of the actual and average values of $y_0$. Note that $\delta = y_0$ if $\lambda = 1$ and $\delta = Ey_0$ if $\lambda = 0$, so $\delta$ is a flexible target that includes both as special cases.
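The target function can be written out directly. The sketch below is only an illustration, with a hypothetical helper name `target` and made-up numbers; it uses $Ey_0 = X_0\beta$ from model (2) and shows the two boundary cases of $\lambda$.

```python
import numpy as np

def target(y0, X0, beta, lam):
    """Composite target: lam * y0 + (1 - lam) * E(y0), with E(y0) = X0 @ beta."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * y0 + (1.0 - lam) * (X0 @ beta)

# Illustrative values only.
X0 = np.array([[1.0, 0.0], [1.0, 2.0]])
beta = np.array([0.5, 1.5])
y0 = np.array([0.9, 3.1])            # actual (unobserved) values

delta_actual = target(y0, X0, beta, 1.0)    # lam = 1: the actual values y0
delta_average = target(y0, X0, beta, 0.0)   # lam = 0: the average values X0 @ beta
delta_mixed = target(y0, X0, beta, 0.5)     # equal weight on both
```

In practice neither `y0` nor `beta` is known, so `target` is useful only in simulations where predictors of $\delta$ are evaluated against the truth.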
Studies on the simultaneous prediction of the actual and average values of the studied variable (namely, prediction of $\delta$) have been carried out in the literature from various perspectives. The properties of predictors obtained by plugging in Stein-rule estimators have been studied in [9] and [10]. [11] investigated Stein-rule prediction for $\delta$ in the linear regression model when the error covariance matrix is positive definite yet unknown. References [12,13] and [14] considered predictors for $\delta$ in linear regression models with stochastic or non-stochastic linear constraints on the regression coefficients. The issues of simultaneous prediction in measurement error models have been addressed in [15] and [16]. [17] considered a matrix multiple of the classical forecast vector for the simultaneous prediction of $\delta$ and discussed its performance properties.
This paper studies the admissibility of simultaneous prediction of the actual and average values of the unobserved regressand in a finite population under the quadratic loss function. Admissibility is an interesting problem in statistical theory and has received much attention. [18,19] and [20] discussed the admissibility of predictions of $y_T$. [21,22] and [23] studied the admissibility of estimators of $\beta$. We discuss the admissible predictors of $\delta$ in the classes of homogeneous and nonhomogeneous linear predictors, respectively. Necessary and sufficient conditions for the simultaneous prediction to be admissible are provided.
The rest of this paper is organized as follows. In Sect. 2, we give some preliminaries. In Sect. 3, we obtain the homogeneous linear admissible simultaneous predictors for the actual and average values of the unobserved regressand. In Sect. 4, we derive the necessary and sufficient conditions for linear simultaneous prediction to be admissible in the class of nonhomogeneous linear predictors. Concluding remarks are given in Sect. 5.

Preliminaries
Suppose $d$ is a predictor of $\delta$ and denote by $R(d; \beta)$ the risk of $d$ under the quadratic loss function. Then for model (3),

$$R(d; \beta) = E(d - \delta)'(d - \delta).$$

Denote the classes of homogeneous and nonhomogeneous linear predictors, respectively, by

$$LH = \{Cy \mid C \text{ is an } m \times n \text{ matrix}\}$$

and

$$LN = \{Cy + u \mid C \text{ is an } m \times n \text{ matrix and } u \text{ is an } m \times 1 \text{ nonstochastic vector with } u \ne 0\}.$$
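A nonhomogeneous linear predictor is an adjustment of a homogeneous linear predictor by a constant vector. We study the admissibility of the prediction of $\delta$ in LH and LN. Throughout, $d \sim \delta$ means that $d$ is an admissible predictor of $\delta$ in the class under consideration, that is, no other predictor in the class has risk no larger than that of $d$ for every $\beta$ and strictly smaller for some $\beta$. Before the discussion begins, we present some preliminaries and basic results.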
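For a linear predictor $Cy + u$, the quadratic risk can be expanded in closed form. Writing $Cy + u - \delta = (CX - X_0)\beta + u + C\varepsilon - \lambda\varepsilon_0$ gives a squared-bias term plus $\mathrm{tr}(C\,\mathrm{Cov}(\varepsilon)\,C') - 2\lambda\,\mathrm{tr}(CV') + \lambda^2\,\mathrm{tr}(\mathrm{Cov}(\varepsilon_0))$. The sketch below evaluates this expansion numerically; the function name `quadratic_risk`, the names `Sigma` and `Sigma0` for the two disturbance covariances, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def quadratic_risk(C, u, X, X0, Sigma, Sigma0, V, beta, lam):
    """Closed-form risk E||Cy + u - delta||^2 for a linear predictor Cy + u."""
    bias = (C @ X - X0) @ beta + u                    # E(Cy + u - delta)
    spread = (np.trace(C @ Sigma @ C.T)               # variance of C eps
              - 2.0 * lam * np.trace(C @ V.T)         # cross term with lam * eps_0
              + lam ** 2 * np.trace(Sigma0))          # variance of lam * eps_0
    return float(bias @ bias + spread)

rng = np.random.default_rng(1)
n, m, p, lam = 5, 2, 3, 0.4
X, X0 = rng.normal(size=(n, p)), rng.normal(size=(m, p))

# A joint covariance of (eps', eps_0')' and its blocks.
G = rng.normal(size=(n + m, n + m))
Sigma_T = G @ G.T
Sigma, Sigma0, V = Sigma_T[:n, :n], Sigma_T[n:, n:], Sigma_T[n:, :n]

C = rng.normal(size=(m, n))
u = rng.normal(size=m)
beta = rng.normal(size=p)

risk_value = quadratic_risk(C, u, X, X0, Sigma, Sigma0, V, beta, lam)
```

Setting `u` to the zero vector gives the risk of the homogeneous predictor $Cy$; the only part of the risk that depends on $\beta$ is the squared bias.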

Lemma 2.3 Suppose $Cy$ is an arbitrary predictor of $\delta$ in model (3). Let

$$\tilde{C} = CX(X'T^+X)^-X'T^+ + \lambda VT^+[I - X(X'T^+X)^-X'T^+],$$

where $T = XX' + \Sigma$. Then under the quadratic loss function $R(\tilde{C}y; \beta) \le R(Cy; \beta)$ for every $\beta \in R^p$, and equality holds if and only if either of the following two conditions holds:
Proof By direct calculation, Note that Since the equality holds for every $\beta \in R^p$ if and only if Now we prove that conditions (1) and (2) are equivalent. First, as $C = \tilde{C}$, Then Therefore, the proof is complete.
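The improvement described in Lemma 2.3 can be checked numerically. The sketch below assumes that the improved coefficient matrix has the same form as the BLUP coefficient in Corollary 3.2 with $X_0$ replaced by $CX$, that $T$ is $XX'$ plus the covariance of $\varepsilon$ (called `Sigma` here), and that this covariance is positive definite with $X$ of full column rank; the Moore-Penrose inverse is used as one choice of generalized inverse. The helper names `c_tilde` and `variance_part` are hypothetical.

```python
import numpy as np

def c_tilde(C, X, Sigma, V, lam):
    """Assumed form: CX (X'T^+X)^- X'T^+ + lam V T^+ [I - X (X'T^+X)^- X'T^+]."""
    n = X.shape[0]
    T_pinv = np.linalg.pinv(X @ X.T + Sigma)
    P = X @ np.linalg.pinv(X.T @ T_pinv @ X) @ X.T @ T_pinv
    return C @ P + lam * V @ T_pinv @ (np.eye(n) - P)

def variance_part(C, Sigma, V, lam):
    """Part of the quadratic risk that depends on C but not on beta."""
    return float(np.trace(C @ Sigma @ C.T) - 2.0 * lam * np.trace(C @ V.T))

rng = np.random.default_rng(2)
n, m, p, lam = 6, 2, 3, 0.7
X = rng.normal(size=(n, p))

# Positive definite joint covariance; Sigma and V are its blocks.
G = rng.normal(size=(n + m, n + m))
joint = G @ G.T + np.eye(n + m)
Sigma, V = joint[:n, :n], joint[n:, :n]

C = rng.normal(size=(m, n))          # an arbitrary homogeneous predictor Cy
C_t = c_tilde(C, X, Sigma, V, lam)

# Same CX, hence the same bias for every beta ...
same_bias = np.allclose(C_t @ X, C @ X)
# ... and a variance part that is no larger, so the risk is no larger.
gap = variance_part(C, Sigma, V, lam) - variance_part(C_t, Sigma, V, lam)
```

Because `C_t @ X` equals `C @ X`, the two predictors share the bias term for every $\beta$, and `gap` is the whole risk difference; it is nonnegative under the stated assumptions.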

Lemma 2.4 If $C = \tilde{C}$, then the risk of $Cy$ under the quadratic loss function is
The lemma is easily proved by substitution of these equations.

Admissibility of homogeneous linear predictors
In this section, we derive the necessary and sufficient conditions for the admissibility of simultaneous prediction in the class of homogeneous linear predictors. The best linear unbiased predictor of $\delta$ is obtained. Examples are presented to give some admissible predictors.
Theorem 3.1 Let $l' = q'C$, where $C$ is a matrix and $l$, $q$ are vectors with appropriate dimensions. If $Cy \sim \delta$ under the quadratic loss function, then
Proof Since $Cy \sim \delta$ and $l' = q'C$, by Lemma 2.1, $l'y \sim q'\delta$ for any $1 \times m$ vector $q'$. Let $k$ be a real constant with $0 < k < 1$. Let The risk of $l_k'y$ is Since $l'y \sim q'y_0$ and by Lemma 2.3, Lemma 2.4 and $l' = q'C$, the risk of $l'y$ is and for any $k \in (0, 1)$, $R(l'y; \beta) \le R(l_k'y; \beta)$. This means that $R(l'y; \beta) - R(l_k'y; \beta) \le 0$. Dividing both sides of the above inequality by $1 - k$ and then letting $k \to 1$, we have

Theorem 3.2 For the model (3), Cy ∼ δ in LH under the quadratic loss function if and only if
Proof Necessity: (i) Condition (1) is shown in Lemma 2.3. (ii) Since $Cy \sim \delta$, we have $\eta'Cy \sim \eta'\delta$ for any $\eta \in R^m$ by Lemma 2.2. Letting $l' = \eta'C$ and $q' = \eta'$ in Theorem 3.1, we have To prove condition (2), by (4) it suffices to prove that $\lambda(CX - X_0)(X'T^+X)^-X'T^+V' + CXQX_0'$ is symmetric. Suppose, to the contrary, that $\lambda(CX - X_0)(X'T^+X)^-X'T^+V' + CXQX_0'$ is not a symmetric matrix. With this assumption and the fact that $CXQX'C'$ and $(CX - X_0)Q(CX - X_0)'$ are both symmetric positive semi-definite matrices, we have is not symmetric and hence is not symmetric. By Lemma 2.2, there exists an orthogonal matrix $P$ such that Let It is easy to prove that $H = \tilde{H}$. Then with Lemma 2.4 and (5), we have $R(Hy; \beta) = R(\tilde{H}y; \beta)$ which contradicts the admissibility of $Cy$. Thus, $-(CX - X_0)QX_0' + \lambda(CX - X_0)(X'T^+X)^-X'T^+V'$ is symmetric and then is symmetric. From (4), condition (2) is proved.
(b) $MX = X_0$. If $MX = CX$, then $R(My; \beta) = R(Cy; \beta)$. So we only need to discuss the case $MX \ne CX$. By Lemma 2.4 and for any $\beta$, we only need to consider the admissibility under the circumstance that It can be shown from (6) where $Z = (CX - X_0)(CX - X_0)' + (CX - X_0)(MX - X_0)'$. From (9) and condition (2), we have which means $My$ cannot be better than $Cy$. In summary, the proof is completed.
Remark 3.1 As $\delta = y_0$ when $\lambda = 1$ and $\delta = Ey_0$ when $\lambda = 0$, the necessary and sufficient conditions for predictors of $y_0$ and $Ey_0$ to be admissible follow directly from Theorem 3.2 and Corollary 3.1.

Corollary 3.2 For model (3) and under the quadratic loss function, the best linear unbiased predictors of $\delta$, $y_0$ and $Ey_0$ are

$$\hat{\delta}_{BLUP} = Cy = X_0(X'T^+X)^-X'T^+y + \lambda VT^+[y - X(X'T^+X)^-X'T^+y],$$

$$\hat{y}_{0,BLUP} = X_0(X'T^+X)^-X'T^+y + VT^+[y - X(X'T^+X)^-X'T^+y],$$

$$\widehat{Ey_0}_{BLUP} = X_0(X'T^+X)^-X'T^+y.$$

Proof Let $Cy + u$ be a linear unbiased predictor of $\delta$. Then $E(Cy + u) = E\delta$ gives $CX = X_0$ and $u = 0$. Letting $C = X_0(X'T^+X)^-X'T^+ + \lambda VT^+[I - X(X'T^+X)^-X'T^+]$, the corollary follows by verifying the conditions in Theorem 3.2.
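The BLUP coefficient matrix in Corollary 3.2 can be computed directly. The sketch below assumes that $T$ is $XX'$ plus the covariance of $\varepsilon$ (called `Sigma` here, taken positive definite), that $X$ has full column rank, and that the Moore-Penrose inverse serves as the generalized inverse; `blup_coefficient` is a hypothetical helper name and the data are simulated for illustration only.

```python
import numpy as np

def blup_coefficient(X, X0, Sigma, V, lam):
    """C = X0 (X'T^+X)^- X'T^+ + lam V T^+ [I - X (X'T^+X)^- X'T^+]."""
    n = X.shape[0]
    T_pinv = np.linalg.pinv(X @ X.T + Sigma)
    B = np.linalg.pinv(X.T @ T_pinv @ X) @ X.T @ T_pinv   # (X'T^+X)^- X'T^+
    return X0 @ B + lam * V @ T_pinv @ (np.eye(n) - X @ B)

rng = np.random.default_rng(3)
n, m, p = 7, 3, 2
X, X0 = rng.normal(size=(n, p)), rng.normal(size=(m, p))

G = rng.normal(size=(n + m, n + m))
joint = G @ G.T + np.eye(n + m)
Sigma, V = joint[:n, :n], joint[n:, :n]

C_avg = blup_coefficient(X, X0, Sigma, V, 0.0)    # lam = 0: BLUP of E(y0)
C_act = blup_coefficient(X, X0, Sigma, V, 1.0)    # lam = 1: BLUP of y0
C_mix = blup_coefficient(X, X0, Sigma, V, 0.3)    # BLUP of delta for lam = 0.3

# A predicted value for one simulated sample.
y = X @ np.array([1.0, 2.0]) + rng.multivariate_normal(np.zeros(n), Sigma)
delta_hat = C_mix @ y
```

Since the coefficient matrix is linear in $\lambda$, the BLUP of $\delta$ is the same weighted combination of the BLUPs of $y_0$ and $Ey_0$ that defines $\delta$ itself, and each of the three matrices satisfies the unbiasedness condition $CX = X_0$.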
We give some admissible predictors in the following examples.
Remark 3.2 Example 3.1 indicates that the unbiased predictors of $\delta$ are also unbiased predictors of $y_0$ and $Ey_0$, and that the unbiased predictors of $y_0$ and $Ey_0$ are also unbiased predictors of $\delta$, since $E\delta = Ey_0 = E(Ey_0) = X_0\beta$. However, the admissibility of these predictors differs from one studied variable to another.
Example 3.2 Suppose $V = 0$, $\Sigma > 0$ and $\mathrm{rk}(X) < p$ in model (3). Let $t_{\max}$ be the maximum eigenvalue of $X'X$, let $k > 0$ be a non-stochastic scalar and let $\theta = \frac{t_{\max} + k}{t_{\max}}$. Let where $B$ is an arbitrary $m \times n$ matrix. If $X_0 = \theta DX$, then by Corollary 3.1, without tedious calculations, $X_0(X'X + kI_p)^{-1}X'y \sim \delta$.
Remark 3.3 Denote by $\hat{\beta}_{ridge}$ the ridge estimator of $\beta$ in model (1) when $\Sigma > 0$ and $\mathrm{rk}(X) < p$. Example 3.2 indicates that, in particular linear regression models where $X_0$ and $X$ are suitably related, the ridge predictor $X_0\hat{\beta}_{ridge}$ can be used as an admissible predictor for $\delta$, and in particular for $y_0$ and $Ey_0$.
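The ridge predictor of Example 3.2 is simple to compute even when $X$ is rank deficient, because $X'X + kI_p$ is invertible for any $k > 0$. The sketch below only illustrates the computation of the predictor and of the scalar $\theta$; it does not check the condition $X_0 = \theta DX$ required for admissibility. The helper names `ridge_predictor` and `theta` and all data are assumptions for illustration.

```python
import numpy as np

def ridge_predictor(X, X0, y, k):
    """X0 (X'X + k I_p)^{-1} X'y for a ridge parameter k > 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    p = X.shape[1]
    beta_ridge = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
    return X0 @ beta_ridge

def theta(X, k):
    """(t_max + k) / t_max, with t_max the largest eigenvalue of X'X."""
    t_max = np.linalg.eigvalsh(X.T @ X).max()
    return (t_max + k) / t_max

rng = np.random.default_rng(4)
Z = rng.normal(size=(8, 2))
X = np.column_stack([Z, Z[:, 0]])    # third column repeats the first: rk(X) = 2 < p = 3
X0 = rng.normal(size=(3, 3))         # arbitrary prediction regressors for this sketch
y = rng.normal(size=8)
k = 0.5

prediction = ridge_predictor(X, X0, y, k)
theta_value = theta(X, k)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly, which is the usual numerically preferable choice.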

Admissibility of nonhomogeneous linear predictors
In this section, we investigate the admissibility of simultaneous prediction in the class of nonhomogeneous linear predictors and obtain necessary and sufficient conditions. The results show that the admissibility of simultaneous prediction in the class of nonhomogeneous linear predictors is based on the admissibility of simultaneous prediction in the class of homogeneous linear predictors.
Remark 4.1 Theorem 4.1 shows the relation between the admissible homogeneous and nonhomogeneous linear predictors, and indicates that the admissibility of the homogeneous linear predictor is the more fundamental property. To derive an admissible predictor $Cy + u$ in LN, we can first derive the admissible predictor $Cy$ in LH.

Concluding remarks
In this paper, we investigate the admissibility of linear prediction in the generalized linear regression model under the quadratic loss function. Predictions are based on a composite target function that allows one to predict the actual and average values of the unobserved regressand simultaneously, according to practical needs. Necessary and sufficient conditions for the simultaneous prediction to be admissible are obtained in the classes of homogeneous and nonhomogeneous linear predictors, respectively. Although the unbiased predictors of the composite target function are also unbiased predictors of the actual and average values of the unobserved regressand and vice versa, the admissibility of these predictors differs from one studied variable to another. Under some circumstances, the ridge predictor is admissible although it is biased. However, whether the admissible linear prediction is minimax under the quadratic loss function is unclear. Further research on the minimaxity of admissible simultaneous prediction is in progress.