This paper is concerned with parameter estimation in the linear regression model. To overcome the multicollinearity problem, two new classes of estimators, the almost unbiased ridge-type principal component estimator (AURPCE) and the almost unbiased Liu-type principal component estimator (AULPCE), are proposed. The mean squared error matrices of the proposed estimators are derived and compared, and some properties of the proposed estimators are discussed. Finally, a Monte Carlo simulation study is given to illustrate the performance of the proposed estimators.
1. Introduction
Consider the following multiple linear regression model:
$$y = X\beta + \varepsilon, \tag{1}$$
where $y$ is an $n\times 1$ vector of responses, $X$ is an $n\times p$ known design matrix of rank $p$, $\beta$ is a $p\times 1$ vector of unknown parameters, $\varepsilon$ is an $n\times 1$ vector of disturbances assumed to be distributed with mean vector $0$ and variance-covariance matrix $\sigma^2 I_n$, and $I_n$ is the identity matrix of order $n$.
According to the Gauss-Markov theorem, the ordinary least squares estimator (OLSE) of (1) is obtained as follows:
$$\hat{\beta} = (X'X)^{-1}X'y. \tag{2}$$
It has long been treated as the best estimator. However, many results have shown that the OLSE is no longer a good estimator when multicollinearity is present. To overcome this problem, many biased estimators have been proposed, such as the principal components regression estimator (PCRE) [1], the ridge estimator [2], the Liu estimator [3], the almost unbiased ridge estimator [4], and the almost unbiased Liu estimator [5].
In the hope that a combination of two different estimators might inherit the advantages of both, Kaçıranlar et al. [6] improved Liu's approach and introduced the restricted Liu estimator. Akdeniz and Erol [7] compared some biased estimators in linear regression in the mean squared error matrix (MSEM) sense. By combining the mixed estimator and the Liu estimator, Hubert and Wijekoon [8] obtained the two-parameter estimator, which is a general estimator including the OLSE, ridge estimator, and Liu estimator. Baye and Parker [9] proposed the r-k class estimator, which includes as special cases the PCRE, the ridge estimator, and the OLSE. Then, Kaçıranlar and Sakallıoğlu [10] proposed the r-d estimator, which is a generalization of the OLSE, PCRE, and Liu estimator. Based on the r-k estimator and r-d estimator, Xu and Yang [11] considered the restricted r-k estimator and restricted r-d estimator, and Wu and Yang [12] introduced the stochastic restricted r-k estimator and the stochastic restricted r-d estimator.
The primary aim of this paper is to introduce two new classes of estimators, one of which includes the OLSE, PCRE, and AURE as special cases while the other includes the OLSE, PCRE, and AULE as special cases, and thereby to provide alternative methods for overcoming multicollinearity in linear regression.
The paper is organized as follows. In Section 2, the new estimators are introduced. In Section 3, some properties of the new estimators are discussed. Then we give a Monte Carlo simulation in Section 4. Finally, some conclusions are given in Section 5.
2. The New Estimators
In the linear model given by (1), the almost unbiased ridge estimator (AURE) proposed by Singh et al. [4] and the almost unbiased Liu estimator (AULE) proposed by Akdeniz and Kaçıranlar [5] are defined as
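$$\hat{\beta}_{AU}(k) = \left(I - k^2 (S + kI)^{-2}\right)\hat{\beta}, \tag{3}$$
$$\hat{\beta}_{AU}(d) = \left(I - (1-d)^2 (S + I)^{-2}\right)\hat{\beta}, \tag{4}$$
respectively, where $k>0$, $0<d<1$, and $S=X'X$.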
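As a rough illustration of (3) and (4), the following NumPy sketch computes the OLSE, AURE, and AULE on a small synthetic design. The data, the seed, and the function names `olse`, `aure`, and `aule` are assumptions made for this example only; they do not come from the paper.

```python
import numpy as np

def olse(X, y):
    """Ordinary least squares estimate (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def aure(X, y, k):
    """Almost unbiased ridge estimator: (I - k^2 (S + kI)^{-2}) beta_hat."""
    p = X.shape[1]
    S = X.T @ X
    R = np.linalg.inv(S + k * np.eye(p))
    return (np.eye(p) - k**2 * R @ R) @ olse(X, y)

def aule(X, y, d):
    """Almost unbiased Liu estimator: (I - (1-d)^2 (S + I)^{-2}) beta_hat."""
    p = X.shape[1]
    S = X.T @ X
    R = np.linalg.inv(S + np.eye(p))
    return (np.eye(p) - (1 - d)**2 * R @ R) @ olse(X, y)

# Hypothetical data: 50 observations, 3 regressors (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.standard_normal(50)

b_ols = olse(X, y)
b_aure = aure(X, y, k=0.5)
b_aule = aule(X, y, d=0.5)
```

Both shrinkage matrices are symmetric with eigenvalues in (0, 1], so neither estimate can have a larger Euclidean norm than the OLSE.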
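Now consider the spectral decomposition of the matrix $S=X'X$, given as
$$X'X = \begin{pmatrix} T_r & T_{p-r} \end{pmatrix} \begin{pmatrix} \Lambda_r & 0 \\ 0 & \Lambda_{p-r} \end{pmatrix} \begin{pmatrix} T_r' \\ T_{p-r}' \end{pmatrix}, \tag{5}$$
where $\Lambda_r=\mathrm{diag}(\lambda_1,\ldots,\lambda_r)$, $\Lambda_{p-r}=\mathrm{diag}(\lambda_{r+1},\ldots,\lambda_p)$, and $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p>0$ are the ordered eigenvalues of $S$. The matrix $T=(T_r \,\vdots\, T_{p-r})_{p\times p}$ is orthogonal, with $T_r=(t_1,\ldots,t_r)$ consisting of its first $r$ columns and $T_{p-r}=(t_{r+1},\ldots,t_p)$ consisting of the remaining $p-r$ columns of $T$. Since $T_r'ST_r=\Lambda_r$, the PCRE of $\beta$ can be written as
$$\hat{\beta}_r = T_r (T_r'ST_r)^{-1} T_r'X'y = T_r \Lambda_r^{-1} T_r'X'y. \tag{6}$$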
The r-k class estimator proposed by Baye and Parker [9] and the r-d class estimator proposed by Kaçıranlar and Sakallıoğlu [10] are defined as
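$$\begin{aligned} \hat{\beta}_r(k) &= T_r (T_r'ST_r + kI_r)^{-1} T_r'X'y = T_r (\Lambda_r + kI_r)^{-1} T_r'X'y, \\ \hat{\beta}_r(d) &= T_r (T_r'ST_r + I_r)^{-1} \left(I_r + d\,(T_r'ST_r)^{-1}\right) T_r'X'y = T_r (\Lambda_r + I_r)^{-1} (I_r + d\Lambda_r^{-1}) T_r'X'y. \end{aligned} \tag{7}$$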
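Following Xu and Yang [11], the r-k class estimator and r-d class estimator can be rewritten as follows:
$$\begin{aligned} \hat{\beta}_r(k) &= T_rT_r'\,T(\Lambda + kI)^{-1}T'X'y = T_rT_r'\hat{\beta}(k), \\ \hat{\beta}_r(d) &= T_rT_r'\,T(\Lambda + I)^{-1}(I + d\Lambda^{-1})T'X'y = T_rT_r'\hat{\beta}(d), \end{aligned} \tag{8}$$
where $\hat{\beta}(k)=T(\Lambda+kI)^{-1}T'X'y=(S+kI)^{-1}X'y$ is the ridge estimator of Hoerl and Kennard [2] and $\hat{\beta}(d)=T(\Lambda+I)^{-1}(I+d\Lambda^{-1})T'X'y=(S+I)^{-1}(I+dS^{-1})X'y$ is the Liu estimator proposed by Liu [3].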
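The relationship between the PCRE, the r-k and r-d class estimators, and the projected OLSE, ridge, and Liu estimators can be checked numerically. The sketch below is only an illustration on assumed synthetic data; the variable names and the choice r = 2, k = 0.3, d = 0.6 are hypothetical.

```python
import numpy as np

def spectral(S):
    """Eigen-decomposition of symmetric S with eigenvalues in decreasing order."""
    lam, T = np.linalg.eigh(S)           # eigh returns eigenvalues in ascending order
    order = np.argsort(lam)[::-1]
    return lam[order], T[:, order]

# Hypothetical data (illustrative only).
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 4))
y = rng.standard_normal(40)

S, c = X.T @ X, X.T @ y                  # S = X'X and c = X'y
I = np.eye(4)
lam, T = spectral(S)
r, k, d = 2, 0.3, 0.6
Tr, lam_r = T[:, :r], lam[:r]
P = Tr @ Tr.T                            # projector T_r T_r'

# Estimators written in terms of the leading r eigenpairs.
pcre = Tr @ ((Tr.T @ c) / lam_r)
rk = Tr @ ((Tr.T @ c) / (lam_r + k))
rd = Tr @ ((1 + d / lam_r) / (lam_r + 1) * (Tr.T @ c))

# Full-space estimators: OLSE, ridge, and Liu.
ols = np.linalg.solve(S, c)
ridge = np.linalg.solve(S + k * I, c)
liu = np.linalg.solve(S + I, c + d * ols)

print(np.allclose(pcre, P @ ols), np.allclose(rk, P @ ridge), np.allclose(rd, P @ liu))
```

Each principal-component version equals the projector $T_rT_r'$ applied to the corresponding full-space estimator, which is the form that the new estimators build on.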
We now propose two new classes of estimators by combining the PCRE with the AURE and the AULE, namely, the almost unbiased ridge-type principal component estimator (AURPCE) and the almost unbiased Liu-type principal component estimator (AULPCE), as follows:
$$\hat{\beta}_{AU}(r,k) = T_rT_r'\left(I - k^2(S+kI)^{-2}\right)\hat{\beta} = T_rT_r'G_k\hat{\beta}, \tag{9}$$
$$\hat{\beta}_{AU}(r,d) = T_rT_r'\left(I - (1-d)^2(S+I)^{-2}\right)\hat{\beta} = T_rT_r'H_d\hat{\beta}, \tag{10}$$
respectively, where $G_k=I-k^2(S+kI)^{-2}$ and $H_d=I-(1-d)^2(S+I)^{-2}$.
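From the definition of the AURPCE, we can easily obtain the following.
If $r=p$, then $\hat{\beta}_{AU}(r,k)=\hat{\beta}_{AU}(k)$, the AURE.
If $k=0$ and $r=p$, then $\hat{\beta}_{AU}(r,k)=\hat{\beta}$, the OLSE.
If $k=0$, then $\hat{\beta}_{AU}(r,k)=\hat{\beta}_r=T_rT_r'\hat{\beta}$, the PCRE.
From the definition of the AULPCE, we can similarly obtain the following.
If $r=p$, then $\hat{\beta}_{AU}(r,d)=\hat{\beta}_{AU}(d)$, the AULE.
If $d=1$ and $r=p$, then $\hat{\beta}_{AU}(r,d)=\hat{\beta}$, the OLSE, since $H_1=I$.
If $d=1$, then $\hat{\beta}_{AU}(r,d)=T_rT_r'\hat{\beta}$, the PCRE.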
So the β^AU(r,k) could be regarded as a generalization of PCRE, OLSE, and AURE, while β^AU(r,d) could be regarded as a generalization of PCRE, OLSE, and AULE.
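The special cases can be spot-checked numerically. The following is a minimal sketch on assumed synthetic data; the function names `aurpce` and `aulpce` and the helper `_parts` are hypothetical. Note that $H_d$ reduces to the identity at $d=1$, which is the value used for the Liu-type checks here, while $G_k$ reduces to the identity at $k=0$.

```python
import numpy as np

def _parts(X, y, r):
    """Return S = X'X, the OLSE, and the projector T_r T_r'."""
    S = X.T @ X
    b = np.linalg.solve(S, X.T @ y)
    lam, T = np.linalg.eigh(S)
    Tr = T[:, np.argsort(lam)[::-1][:r]]   # eigenvectors of the r largest eigenvalues
    return S, b, Tr @ Tr.T

def aurpce(X, y, r, k):
    """T_r T_r' (I - k^2 (S + kI)^{-2}) beta_hat, as in (9)."""
    S, b, P = _parts(X, y, r)
    I = np.eye(S.shape[0])
    R = np.linalg.inv(S + k * I)
    return P @ (I - k**2 * R @ R) @ b

def aulpce(X, y, r, d):
    """T_r T_r' (I - (1-d)^2 (S + I)^{-2}) beta_hat, as in (10)."""
    S, b, P = _parts(X, y, r)
    I = np.eye(S.shape[0])
    R = np.linalg.inv(S + I)
    return P @ (I - (1 - d)**2 * R @ R) @ b

# Hypothetical data (illustrative only).
rng = np.random.default_rng(4)
X = rng.standard_normal((40, 4))
y = rng.standard_normal(40)
p = X.shape[1]
S, b_ols, P2 = _parts(X, y, 2)

print(np.allclose(aurpce(X, y, p, 0.0), b_ols))       # OLSE
print(np.allclose(aurpce(X, y, 2, 0.0), P2 @ b_ols))  # PCRE with r = 2
print(np.allclose(aulpce(X, y, p, 1.0), b_ols))       # OLSE
print(np.allclose(aulpce(X, y, 2, 1.0), P2 @ b_ols))  # PCRE with r = 2
```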
Furthermore, we can compute that the bias, dispersion matrix, and mean squared error matrix of the new estimators β^AU(r,k) are
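$$\begin{aligned} \mathrm{Bias}(\hat{\beta}_{AU}(r,k)) &= E(\hat{\beta}_{AU}(r,k)) - \beta = (T_rT_r'G_k - I)\beta, \\ D(\hat{\beta}_{AU}(r,k)) &= T_rT_r'G_k\,\mathrm{Cov}(\hat{\beta})\,G_k'T_rT_r' = \sigma^2 T_rT_r'G_kS^{-1}G_k'T_rT_r', \end{aligned} \tag{11}$$
$$\mathrm{MSEM}(\hat{\beta}_{AU}(r,k)) = \sigma^2 T_rT_r'G_kS^{-1}G_k'T_rT_r' + (T_rT_r'G_k - I)\beta\beta'(T_rT_r'G_k - I)', \tag{12}$$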
respectively.
In a similar way, we can get the MSEM of the β^AU(r,d) as follows:
$$\mathrm{MSEM}(\hat{\beta}_{AU}(r,d)) = \sigma^2 T_rT_r'H_dS^{-1}H_d'T_rT_r' + (T_rT_r'H_d - I)\beta\beta'(T_rT_r'H_d - I)'. \tag{13}$$
In particular, if we let r=p in (12) and (13), then we can get the MSEM of the AURE and AULE as follows:
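$$\begin{aligned} \mathrm{MSEM}(\hat{\beta}_{AU}(k)) &= \sigma^2 G_kS^{-1}G_k' + (G_k - I)\beta\beta'(G_k - I)', \\ \mathrm{MSEM}(\hat{\beta}_{AU}(d)) &= \sigma^2 H_dS^{-1}H_d' + (H_d - I)\beta\beta'(H_d - I)'. \end{aligned} \tag{14}$$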
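The MSEM expressions share one pattern: a variance term plus the outer product of the bias. A minimal sketch of that pattern is given below, assuming a synthetic $S$, a hypothetical true $\beta$, and $\sigma^2=1$; the helper names `G_k`, `H_d`, and `msem` are introduced here for illustration only.

```python
import numpy as np

def G_k(S, k):
    """Shrinkage matrix I - k^2 (S + kI)^{-2}."""
    I = np.eye(S.shape[0])
    R = np.linalg.inv(S + k * I)
    return I - k**2 * R @ R

def H_d(S, d):
    """Shrinkage matrix I - (1-d)^2 (S + I)^{-2}."""
    I = np.eye(S.shape[0])
    R = np.linalg.inv(S + I)
    return I - (1 - d)**2 * R @ R

def msem(S, beta, sigma2, r, M):
    """MSEM of T_r T_r' M beta_hat: variance term plus bias outer product."""
    p = S.shape[0]
    lam, T = np.linalg.eigh(S)
    Tr = T[:, np.argsort(lam)[::-1][:r]]
    A = Tr @ Tr.T @ M
    bias = (A - np.eye(p)) @ beta
    return sigma2 * A @ np.linalg.inv(S) @ A.T + np.outer(bias, bias)

# Hypothetical inputs (illustrative only).
rng = np.random.default_rng(5)
X = rng.standard_normal((50, 4))
S = X.T @ X
beta = np.array([1.0, -1.0, 0.5, 2.0])
sigma2, r, k, d = 1.0, 2, 0.5, 0.5

mse_aurpce = np.trace(msem(S, beta, sigma2, r, G_k(S, k)))   # scalar MSE of the AURPCE
mse_aulpce = np.trace(msem(S, beta, sigma2, r, H_d(S, d)))   # scalar MSE of the AULPCE
mse_aure = np.trace(msem(S, beta, sigma2, 4, G_k(S, k)))     # r = p gives the AURE
```

Taking the trace reduces each matrix to a scalar MSE, which is the quantity usually tabulated in simulation studies.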
3. Superiority of the Proposed Estimators
For the sake of convenience, we first list some notations, definitions, and lemmas needed in the following discussion. For a matrix $M$, the symbols $M'$, $M^+$, $\mathrm{rank}(M)$, $R(M)$, and $N(M)$ stand for the transpose, Moore-Penrose inverse, rank, column space, and null space of $M$, respectively. $M\ge 0$ means that $M$ is symmetric and nonnegative definite.
Lemma 1.
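Let $\mathbb{C}^{n\times p}$ be the set of $n\times p$ complex matrices, let $\mathbb{H}^{n\times n}$ be the subset of $\mathbb{C}^{n\times n}$ consisting of Hermitian matrices, and, for $L\in\mathbb{C}^{n\times p}$, let $L^*$, $\mathfrak{M}(L)$, and $\mathfrak{J}(L)$ stand for the conjugate transpose, the range, and the set of all generalized inverses of $L$, respectively. Let $D\in\mathbb{H}^{n\times n}$, let $a_1$ and $a_2\in\mathbb{C}^{n\times 1}$ be linearly independent, let $f_{ij}=a_i^*D^-a_j$, $i,j=1,2$, and, if $a_1\notin\mathfrak{M}(D)$, let $s=[a_1^*(I-DD^-)^*(I-DD^-)a_2]/[a_1^*(I-DD^-)^*(I-DD^-)a_1]$. Then $D+a_1a_1^*-a_2a_2^*\ge 0$ if and only if one of the following sets of conditions holds: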
where (U⋮v) is a subunitary matrix (U possibly absent), Δ a positive-definite diagonal matrix (occurring when U is present), and λ a positive scalar. Further, all expressions in (a), (b), and (c) are independent of the choice of D-∈𝔍(D).
Proof.
Lemma 1 is due to Baksalary and Trenkler [13].
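Let us consider the comparison between the AURPCE and the AURE and between the AULPCE and the AULE. From (12)–(14), we have
$$\begin{aligned} \Delta_1 &= \mathrm{MSEM}(\hat{\beta}_{AU}(k)) - \mathrm{MSEM}(\hat{\beta}_{AU}(r,k)) = D_1 + b_1b_1' - b_2b_2', \\ \Delta_2 &= \mathrm{MSEM}(\hat{\beta}_{AU}(d)) - \mathrm{MSEM}(\hat{\beta}_{AU}(r,d)) = D_2 + b_3b_3' - b_4b_4', \end{aligned} \tag{15}$$
where $D_1=\sigma^2(G_kS^{-1}G_k'-T_rT_r'G_kS^{-1}G_k'T_rT_r')$, $D_2=\sigma^2(H_dS^{-1}H_d'-T_rT_r'H_dS^{-1}H_d'T_rT_r')$, $b_1=(G_k-I)\beta$, $b_2=(T_rT_r'G_k-I)\beta$, $b_3=(H_d-I)\beta$, and $b_4=(T_rT_r'H_d-I)\beta$.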
Now, we will use Lemma 1 to discuss the differences Δ1 and Δ2 following Sarkar [14] and Xu and Yang [11]. Since
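$$\begin{aligned} S^{-1} &= (T_rT_r' + T_{p-r}T_{p-r}')\,S^{-1}\,(T_rT_r' + T_{p-r}T_{p-r}') \\ &= T_rT_r'S^{-1}T_rT_r' + T_{p-r}T_{p-r}'S^{-1}T_{p-r}T_{p-r}' + T_rT_r'S^{-1}T_{p-r}T_{p-r}' + T_{p-r}T_{p-r}'S^{-1}T_rT_r', \end{aligned} \tag{16}$$
we assume that $T_r'S^{-1}T_{p-r}=0$ and that $T_{p-r}'S^{-1}T_{p-r}$ is invertible; then
$$S^{-1} = T_rT_r'S^{-1}T_rT_r' + T_{p-r}T_{p-r}'S^{-1}T_{p-r}T_{p-r}'. \tag{17}$$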
These assumptions are reasonable: they are equivalent to requiring that the partitioned matrix
$$T'S^{-1}T = \begin{pmatrix} T_r' \\ T_{p-r}' \end{pmatrix} S^{-1} \begin{pmatrix} T_r & T_{p-r} \end{pmatrix} = \begin{pmatrix} T_r'S^{-1}T_r & T_r'S^{-1}T_{p-r} \\ T_{p-r}'S^{-1}T_r & T_{p-r}'S^{-1}T_{p-r} \end{pmatrix}$$
is block diagonal with an invertible second diagonal block.
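Because $T$ diagonalizes $S$, it also diagonalizes $S^{-1}$, so $T'S^{-1}T=\Lambda^{-1}$ whenever all eigenvalues are positive. The short check below illustrates the block structure on an assumed random full-rank design; the data and the name `Tq` (standing in for $T_{p-r}$) are hypothetical.

```python
import numpy as np

# Hypothetical full-rank design (illustrative only).
rng = np.random.default_rng(2)
X = rng.standard_normal((30, 5))
S = X.T @ X

lam, T = np.linalg.eigh(S)
order = np.argsort(lam)[::-1]           # decreasing eigenvalue order
lam, T = lam[order], T[:, order]

r = 2
Tr, Tq = T[:, :r], T[:, r:]             # Tq plays the role of T_{p-r}

Sinv = np.linalg.inv(S)
off_block = Tr.T @ Sinv @ Tq            # expected to vanish up to rounding
low_block = Tq.T @ Sinv @ Tq            # expected to equal diag(1 / lambda_{r+1..p})

print(np.abs(off_block).max())
print(np.allclose(low_block, np.diag(1.0 / lam[r:])))
```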
Theorem 2.
Suppose that $T_r'S^{-1}T_{p-r}=0$ and $T_{p-r}'S^{-1}T_{p-r}$ is invertible; then the AURPCE is superior to the AURE if and only if $\beta\in N(F)$, where $F=\sigma^{-1}(T_{p-r}'S^{-1}T_{p-r})^{-1/2}T_{p-r}'$.
Proof.
Since
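$$\begin{aligned} b_1 &= T_r\left[(I - k^2(\Lambda_r + kI)^{-2}) - I\right]T_r'\beta + T_{p-r}\left[(I - k^2(\Lambda_{p-r} + kI)^{-2}) - I\right]T_{p-r}'\beta, \\ b_2 &= T_r\left[(I - k^2(\Lambda_r + kI)^{-2}) - I\right]T_r'\beta - T_{p-r}T_{p-r}'\beta, \\ G_kS^{-1}G_k' &= T_r(I - k^2(\Lambda_r + kI)^{-2})T_r'S^{-1}T_r(I - k^2(\Lambda_r + kI)^{-2})T_r' \\ &\quad + T_{p-r}(I - k^2(\Lambda_{p-r} + kI)^{-2})T_{p-r}'S^{-1}T_{p-r}(I - k^2(\Lambda_{p-r} + kI)^{-2})T_{p-r}', \\ T_rT_r'G_kS^{-1}G_k'T_rT_r' &= T_rT_r'T_r(I - k^2(\Lambda_r + kI)^{-2})T_r'S^{-1}T_r(I - k^2(\Lambda_r + kI)^{-2})T_r'T_rT_r' \\ &= T_r(I - k^2(\Lambda_r + kI)^{-2})T_r'S^{-1}T_r(I - k^2(\Lambda_r + kI)^{-2})T_r', \end{aligned} \tag{18}$$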
then we have
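$$D_1 = \sigma^2 T_{p-r}(I - k^2(\Lambda_{p-r} + kI)^{-2})T_{p-r}'S^{-1}T_{p-r}(I - k^2(\Lambda_{p-r} + kI)^{-2})T_{p-r}'. \tag{19}$$
The Moore-Penrose inverse $D_1^+$ of $D_1$ is
$$D_1^+ = \sigma^{-2} T_{p-r}(I - k^2(\Lambda_{p-r} + kI)^{-2})^{-1}(T_{p-r}'S^{-1}T_{p-r})^{-1}(I - k^2(\Lambda_{p-r} + kI)^{-2})^{-1}T_{p-r}'. \tag{20}$$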
Note that $D_1D_1^+=T_{p-r}T_{p-r}'=I-T_rT_r'$, where $I-k^2(\Lambda_{p-r}+kI)^{-2}$ is a positive definite matrix since $\Lambda_{p-r}$ is invertible. Since $D_1D_1^+b_1\neq b_1$, we have $b_1\notin\mathfrak{M}(D_1)$. Moreover,
$$b_2 - b_1 = -T_{p-r}(I - k^2(\Lambda_{p-r} + kI)^{-2})T_{p-r}'\beta = D_1\eta_1, \tag{21}$$
where $\eta_1=-\sigma^{-2}T_{p-r}(I-k^2(\Lambda_{p-r}+kI)^{-2})^{-1}(T_{p-r}'S^{-1}T_{p-r})^{-1}T_{p-r}'\beta$. This implies that $b_2\in\mathfrak{M}(D_1 \,\vdots\, b_1)$, so the conditions of part (b) in Lemma 1 can be employed. Since $(I-D_1D_1^-)'(I-D_1D_1^-)=T_rT_r'T_rT_r'=T_rT_r'$ and $T_r'b_2=T_r'b_1$, it follows that $s=1$ in our case. Thus, by Lemma 1, $\hat{\beta}_{AU}(r,k)$ is superior to $\hat{\beta}_{AU}(k)$ in the MSEM sense if and only if $(b_2-b_1)'D_1^-(b_2-b_1)=\eta_1'D_1'D_1^-D_1\eta_1=\eta_1'D_1\eta_1\le 0$. Observing that
$$\eta_1'D_1\eta_1 = \sigma^{-2}\beta'T_{p-r}(T_{p-r}'S^{-1}T_{p-r})^{-1}T_{p-r}'\beta = \beta'F'F\beta \ge 0, \tag{22}$$
where $F=\sigma^{-1}(T_{p-r}'S^{-1}T_{p-r})^{-1/2}T_{p-r}'$, the necessary and sufficient condition turns out to be $\beta\in N(F)$.
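A numerical spot check, not a proof, can make the condition $\beta\in N(F)$ more concrete. Since $F$ is an invertible matrix times $T_{p-r}'$, the condition amounts to $T_{p-r}'\beta=0$. The sketch below builds $\Delta_1$ from (15) for one $\beta$ inside $N(F)$ and one outside it; the design matrix, the two $\beta$ vectors, and the values $r=2$, $k=0.5$, $\sigma^2=1$ are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical design and tuning values (illustrative only).
rng = np.random.default_rng(3)
X = rng.standard_normal((60, 4))
S = X.T @ X
p, r, k, sigma2 = 4, 2, 0.5, 1.0

lam, T = np.linalg.eigh(S)
T = T[:, np.argsort(lam)[::-1]]          # columns ordered by decreasing eigenvalue
Tr, Tq = T[:, :r], T[:, r:]              # Tq stands in for T_{p-r}
P = Tr @ Tr.T

I = np.eye(p)
R = np.linalg.inv(S + k * I)
G = I - k**2 * R @ R                     # G_k
Sinv = np.linalg.inv(S)

def delta1(beta):
    """MSEM(AURE) - MSEM(AURPCE) = D1 + b1 b1' - b2 b2', as in (15)."""
    D1 = sigma2 * (G @ Sinv @ G.T - P @ G @ Sinv @ G.T @ P)
    b1 = (G - I) @ beta
    b2 = (P @ G - I) @ beta
    return D1 + np.outer(b1, b1) - np.outer(b2, b2)

beta_in = Tr @ np.array([1.0, -2.0])             # T_{p-r}' beta = 0, so beta lies in N(F)
beta_out = beta_in + Tq @ np.array([0.5, 0.5])   # has a component outside N(F)

min_eig_in = np.linalg.eigvalsh(delta1(beta_in)).min()
min_eig_out = np.linalg.eigvalsh(delta1(beta_out)).min()
print(min_eig_in, min_eig_out)
```

For the first vector the smallest eigenvalue of $\Delta_1$ should be zero up to rounding, so $\Delta_1$ is nonnegative definite; for the second it should be negative, so the AURPCE no longer dominates in the MSEM sense.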
Theorem 3.
Suppose that $T_r'S^{-1}T_{p-r}=0$ and $T_{p-r}'S^{-1}T_{p-r}$ is invertible; then the new estimator AULPCE is superior to the AULE if and only if $\beta\in N(F)$, where $F=\sigma^{-1}(T_{p-r}'S^{-1}T_{p-r})^{-1/2}T_{p-r}'$.
Proof.
In order to apply Lemma 1, we can similarly compute that
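$$\begin{aligned} H_d &= T_r\left[I - (1-d)^2(\Lambda_r + I)^{-2}\right]T_r' + T_{p-r}\left[I - (1-d)^2(\Lambda_{p-r} + I)^{-2}\right]T_{p-r}', \\ b_3 &= T_r\left[(I - (1-d)^2(\Lambda_r + I)^{-2}) - I\right]T_r'\beta + T_{p-r}\left[(I - (1-d)^2(\Lambda_{p-r} + I)^{-2}) - I\right]T_{p-r}'\beta, \\ b_4 &= T_r\left[(I - (1-d)^2(\Lambda_r + I)^{-2}) - I\right]T_r'\beta - T_{p-r}T_{p-r}'\beta, \\ H_dS^{-1}H_d' &= T_r(I - (1-d)^2(\Lambda_r + I)^{-2})T_r'S^{-1}T_r(I - (1-d)^2(\Lambda_r + I)^{-2})T_r' \\ &\quad + T_{p-r}(I - (1-d)^2(\Lambda_{p-r} + I)^{-2})T_{p-r}'S^{-1}T_{p-r}(I - (1-d)^2(\Lambda_{p-r} + I)^{-2})T_{p-r}', \\ T_rT_r'H_dS^{-1}H_d'T_rT_r' &= T_rT_r'T_r(I - (1-d)^2(\Lambda_r + I)^{-2})T_r'S^{-1}T_r(I - (1-d)^2(\Lambda_r + I)^{-2})T_r'T_rT_r' \\ &= T_r(I - (1-d)^2(\Lambda_r + I)^{-2})T_r'S^{-1}T_r(I - (1-d)^2(\Lambda_r + I)^{-2})T_r'. \end{aligned} \tag{23}$$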
Therefore, the Moore-Penrose inverse D2+ of D2 is given by
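$$D_2^+ = \sigma^{-2} T_{p-r}(I - (1-d)^2(\Lambda_{p-r} + I)^{-2})^{-1}(T_{p-r}'S^{-1}T_{p-r})^{-1}(I - (1-d)^2(\Lambda_{p-r} + I)^{-2})^{-1}T_{p-r}'. \tag{24}$$
Since $D_2D_2^+=T_{p-r}T_{p-r}'$, we have $b_3\notin\mathfrak{M}(D_2)$. Moreover,
$$b_4 - b_3 = -T_{p-r}(I - (1-d)^2(\Lambda_{p-r} + I)^{-2})T_{p-r}'\beta = D_2\eta_2, \tag{25}$$
where $\eta_2=-\sigma^{-2}T_{p-r}(I-(1-d)^2(\Lambda_{p-r}+I)^{-2})^{-1}(T_{p-r}'S^{-1}T_{p-r})^{-1}T_{p-r}'\beta$. This implies that $b_4\in\mathfrak{M}(D_2 \,\vdots\, b_3)$, so the conditions of part (b) in Lemma 1 can be employed. Since $(I-D_2D_2^-)'(I-D_2D_2^-)=T_rT_r'T_rT_r'=T_rT_r'$ and $T_r'b_4=T_r'b_3$, it follows that $s=1$ in our case. Thus, by Lemma 1, $\hat{\beta}_{AU}(r,d)$ is superior to $\hat{\beta}_{AU}(d)$ in the MSEM sense if and only if $(b_4-b_3)'D_2^-(b_4-b_3)=\eta_2'D_2'D_2^-D_2\eta_2=\eta_2'D_2'\eta_2\le 0$. Observing that
$$\eta_2'D_2'\eta_2 = \sigma^{-2}\beta'T_{p-r}(T_{p-r}'S^{-1}T_{p-r})^{-1}T_{p-r}'\beta = \beta'F'F\beta \ge 0, \tag{26}$$
where $F=\sigma^{-1}(T_{p-r}'S^{-1}T_{p-r})^{-1/2}T_{p-r}'$, the necessary and sufficient condition turns out to be $\beta\in N(F)$.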
4. Monte Carlo Simulation
In order to illustrate the behaviour of the AURPCE and AULPCE, we perform a Monte Carlo simulation study. Following the way of Li and Yang [15], the explanatory variables and the observations on the dependent variable are generated by
$$x_{ij} = (1-\gamma^2)^{1/2}\omega_{ij} + \gamma\omega_{i5}, \qquad y_i = (1-\gamma^2)^{1/2}\omega_{ij} + \gamma\omega_{i5}, \qquad i=1,2,\ldots,100,\ j=1,2,3,4, \tag{27}$$
where $\omega_{ij}$ are independent standard normal pseudorandom numbers and $\gamma$ is specified so that the correlation between any two explanatory variables is given by $\gamma^2$. In this experiment, we choose $r=2$ and $\sigma^2=1$. We consider the AURPCE, AULPCE, AURE, AULE, PCRE, and OLSE and compute their estimated MSE values at different levels of multicollinearity, namely, $\gamma=0.7, 0.85, 0.99, 0.999$, which represent weak, strong, and severe collinearity among the explanatory variables (see Tables 1 and 2). Furthermore, for convenience of comparison, the estimated MSE values of the estimators when $\gamma=0.999$ are plotted in Figure 1.
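A simulation of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the true coefficient vector `beta`, the response model $y=X\beta+\varepsilon$ with unit error variance, the number of replications, the seed, and the function name `simulate` are all hypothetical choices, so the output will not reproduce the values in Tables 1 and 2.

```python
import numpy as np

def simulate(gamma, k, r=2, n=100, p=4, reps=500, seed=0):
    """Estimated MSE of the OLSE, PCRE, AURE, and AURPCE for one (gamma, k) pair."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n, p + 1))
    # Collinear regressors x_ij = sqrt(1 - gamma^2) w_ij + gamma w_i5, as in (27).
    X = np.sqrt(1 - gamma**2) * W[:, :p] + gamma * W[:, [p]]
    beta = np.full(p, 0.5)        # ASSUMPTION: true beta, not taken from the paper
    S = X.T @ X
    lam, T = np.linalg.eigh(S)
    Tr = T[:, np.argsort(lam)[::-1][:r]]
    P = Tr @ Tr.T                 # projector T_r T_r'
    I = np.eye(p)
    R = np.linalg.inv(S + k * I)
    G = I - k**2 * R @ R          # G_k
    sq_err = {"OLSE": 0.0, "PCRE": 0.0, "AURE": 0.0, "AURPCE": 0.0}
    for _ in range(reps):
        # ASSUMPTION: responses follow y = X beta + eps with sigma^2 = 1.
        y = X @ beta + rng.standard_normal(n)
        b = np.linalg.solve(S, X.T @ y)
        ests = {"OLSE": b, "PCRE": P @ b, "AURE": G @ b, "AURPCE": P @ G @ b}
        for name, est in ests.items():
            sq_err[name] += np.sum((est - beta) ** 2)
    # Average squared estimation error over the replications.
    return {name: total / reps for name, total in sq_err.items()}

mse = simulate(gamma=0.99, k=0.5)
print(mse)
```

The design matrix is held fixed across replications and only the disturbances are redrawn; replacing `G` with $H_d$ gives the corresponding AULE and AULPCE comparison.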
Table 1: MSE values of the OLSE, PCRE, AURE, and AURPCE.

| γ | Estimator | k=0.00 | k=0.10 | k=0.30 | k=0.40 | k=0.50 | k=0.80 | k=0.90 | k=1.00 |
|---|---|---|---|---|---|---|---|---|---|
| 0.7 | OLSE | 0.0619 | 0.0619 | 0.0619 | 0.0619 | 0.0619 | 0.0619 | 0.0619 | 0.0619 |
| 0.7 | PCRE | 0.0285 | 0.0285 | 0.0285 | 0.0285 | 0.0285 | 0.0285 | 0.0285 | 0.0285 |
| 0.7 | AURE | 0.0619 | 0.0619 | 0.0619 | 0.0619 | 0.0619 | 0.0619 | 0.0619 | 0.0618 |
| 0.7 | AURPCE | 0.0285 | 0.0285 | 0.0285 | 0.0285 | 0.0285 | 0.0285 | 0.0285 | 0.0285 |
| 0.85 | OLSE | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 |
| 0.85 | PCRE | 0.0384 | 0.0384 | 0.0384 | 0.0384 | 0.0384 | 0.0384 | 0.0384 | 0.0384 |
| 0.85 | AURE | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1084 | 0.1084 | 0.1083 |
| 0.85 | AURPCE | 0.0384 | 0.0384 | 0.0384 | 0.0383 | 0.0383 | 0.0383 | 0.0383 | 0.0383 |
| 0.99 | OLSE | 1.4636 | 1.4636 | 1.4636 | 1.4636 | 1.4636 | 1.4636 | 1.4636 | 1.4636 |
| 0.99 | PCRE | 0.3522 | 0.3522 | 0.3522 | 0.3522 | 0.3522 | 0.3522 | 0.3522 | 0.3522 |
| 0.99 | AURE | 1.4636 | 1.4565 | 1.4116 | 1.3797 | 1.3441 | 1.2281 | 1.1889 | 1.1502 |
| 0.99 | AURPCE | 0.3522 | 0.3515 | 0.3464 | 0.3426 | 0.3381 | 0.3220 | 0.3161 | 0.3101 |
| 0.999 | OLSE | 14.5437 | 14.5437 | 14.5437 | 14.5437 | 14.5437 | 14.5437 | 14.5437 | 14.5437 |
| 0.999 | PCRE | 3.3903 | 3.3903 | 3.3903 | 3.3903 | 3.3903 | 3.3903 | 3.3903 | 3.3903 |
| 0.999 | AURE | 14.5437 | 1.4399 | 6.0117 | 4.5727 | 3.5858 | 1.9800 | 1.6797 | 1.4430 |
| 0.999 | AURPCE | 3.3903 | 2.9735 | 1.8963 | 1.5285 | 1.2518 | 0.7514 | 0.6496 | 0.5673 |
Table 2: MSE values of the OLSE, PCRE, AULE, and AULPCE.

| γ | Estimator | d=0.00 | d=0.10 | d=0.20 | d=0.40 | d=0.50 | d=0.70 | d=0.90 | d=1.00 |
|---|---|---|---|---|---|---|---|---|---|
| 0.7 | OLSE | 0.0709 | 0.0709 | 0.0709 | 0.0709 | 0.0709 | 0.0709 | 0.0709 | 0.0709 |
| 0.7 | PCRE | 0.0303 | 0.0303 | 0.0303 | 0.0303 | 0.0303 | 0.0303 | 0.0303 | 0.0303 |
| 0.7 | AULE | 0.0709 | 0.0709 | 0.0709 | 0.0709 | 0.0709 | 0.0709 | 0.0709 | 0.0709 |
| 0.7 | AULPCE | 0.0303 | 0.0303 | 0.0303 | 0.0303 | 0.0303 | 0.0303 | 0.0303 | 0.0303 |
| 0.85 | OLSE | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 | 0.1085 |
| 0.85 | PCRE | 0.0384 | 0.0384 | 0.0384 | 0.0384 | 0.0384 | 0.0384 | 0.0384 | 0.0384 |
| 0.85 | AULE | 0.1083 | 0.1083 | 0.1084 | 0.1084 | 0.1085 | 0.1085 | 0.1085 | 0.1085 |
| 0.85 | AULPCE | 0.0383 | 0.0383 | 0.0383 | 0.0383 | 0.0383 | 0.0384 | 0.0384 | 0.0384 |
| 0.99 | OLSE | 1.4636 | 1.4636 | 1.4636 | 1.4636 | 1.4636 | 1.4636 | 1.4636 | 1.4636 |
| 0.99 | PCRE | 0.3522 | 0.3522 | 0.3522 | 0.3522 | 0.3522 | 0.3522 | 0.3522 | 0.3522 |
| 0.99 | AULE | 1.1502 | 1.2066 | 1.2583 | 1.3461 | 1.3814 | 1.4337 | 1.4603 | 1.4636 |
| 0.99 | AULPCE | 0.3101 | 0.3179 | 0.3249 | 0.3367 | 0.3414 | 0.3483 | 0.3518 | 0.3522 |
| 0.999 | OLSE | 14.5437 | 14.5437 | 14.5437 | 14.5437 | 14.5437 | 14.5437 | 14.5437 | 14.5437 |
| 0.999 | PCRE | 3.3903 | 3.3903 | 3.3903 | 3.3903 | 3.3903 | 3.3903 | 3.3903 | 3.3903 |
| 0.999 | AULE | 1.4430 | 2.8578 | 4.5509 | 8.2191 | 9.9597 | 12.7929 | 14.3436 | 14.5437 |
| 0.999 | AULPCE | 0.5673 | 0.9193 | 1.3076 | 2.0980 | 2.4599 | 3.0381 | 3.3502 | 3.3903 |
Figure 1: Estimated MSE values of the AULE, AULPCE, AURE, AURPCE, OLSE, and PCRE.
From the simulation results shown in Tables 1 and 2, we can see that in most cases the AURPCE and AULPCE have smaller estimated MSE values than the AURE, AULE, PCRE, and OLSE, which agrees with our theoretical findings. From Figure 1, the AURPCE and AULPCE also have smaller and more stable estimated MSE values. The proposed estimators are therefore meaningful in practice.
5. Conclusion
In this paper, we introduce two classes of new biased estimators to provide an alternative method of dealing with multicollinearity in the linear model. We also show that our new estimators are superior to the competitors in the MSEM criterion under some conditions. Finally, a Monte Carlo simulation study is given to illustrate the better performance of the proposed estimators.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 11201505) and the Fundamental Research Funds for the Central Universities (no. 0208005205012).
References
[1] W. F. Massy, "Principal components regression in exploratory statistical research."
[2] A. E. Hoerl and R. W. Kennard, "Ridge regression: biased estimation for nonorthogonal problems."
[3] K. J. Liu, "A new class of biased estimate in linear regression."
[4] B. Singh, Y. P. Chaubey, and T. D. Dwivedi, "An almost unbiased ridge estimator."
[5] F. Akdeniz and S. Kaçıranlar, "On the almost unbiased generalized Liu estimator and unbiased estimation of the bias and MSE."
[6] S. Kaçıranlar, S. Sakallıoğlu, F. Akdeniz, G. P. H. Styan, and H. J. Werner, "A new biased estimator in linear regression and a detailed analysis of the widely-analysed dataset on Portland cement."
[7] F. Akdeniz and H. Erol, "Mean squared error matrix comparisons of some biased estimators in linear regression."
[8] M. H. Hubert and P. Wijekoon, "Improvement of the Liu estimator in linear regression model."
[9] M. R. Baye and D. F. Parker, "Combining ridge and principal component regression: a money demand illustration."
[10] S. Kaçıranlar and S. Sakallıoğlu, "Combining the Liu estimator and the principal component regression estimator."
[11] J. Xu and H. Yang, "On the restricted r-k class estimator and the restricted r-d class estimator in linear regression."
[12] J. B. Wu and H. Yang, "On the stochastic restricted almost unbiased estimators in linear regression model."
[13] J. K. Baksalary and G. Trenkler, "Nonnegative and positive definiteness of matrices modified by two matrices of rank one."
[14] N. Sarkar, "Mean square error matrix comparison of some estimators in linear regressions with multicollinearity."
[15] Y. Li and H. Yang, "A new stochastic mixed ridge estimator in linear regression model."