3.2.11 Principal Component Analysis

The overloaded prcomp and princomp functions perform principal component analysis in parallel in the database.

The prcomp function uses a singular value decomposition of the covariance and correlations between variables. The princomp function uses an eigendecomposition of the covariance and correlations between samples.
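The relationship between the two decompositions can be sketched in base R. This sketch runs in memory on the built-in USArrests data frame and only illustrates the linear algebra. It is not the in-database implementation, and the variable names (S, sdev_svd, sdev_eigen) are chosen here for illustration.

```r
# Base R only: no database connection or ore.frame is involved.
# Sample covariance matrix of the four numeric variables.
S <- cov(USArrests)

# Route 1: singular value decomposition of the covariance matrix.
# For a symmetric positive semidefinite matrix, the singular values
# are the component variances.
sdev_svd <- sqrt(svd(S)$d)

# Route 2: eigendecomposition of the same matrix.
sdev_eigen <- sqrt(eigen(S, symmetric = TRUE)$values)

# Both routes give the same component standard deviations.
print(sdev_svd)
print(sdev_eigen)
print(isTRUE(all.equal(sdev_svd, sdev_eigen)))

# Replacing cov() with cor() gives the scaled (correlation) analysis.
sdev_cor <- sqrt(eigen(cor(USArrests), symmetric = TRUE)$values)
print(sdev_cor)
```

The component directions from the two routes also agree, but only up to sign, so compare them by absolute value if needed.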

The transparency layer methods ore.frame-prcomp and ore.frame-princomp enable you to use the generic functions prcomp and princomp on data in an ore.frame object. This allows the functions to execute in parallel processes in the database.

For both functions, the methods support the function signature that accepts an ore.frame as the x argument and the signature that accepts a formula. The ore.frame must contain only numeric data. The formula must refer only to numeric variables and have no response variable.
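The two signatures and the numeric-only requirement can be illustrated with a minimal base R sketch. It uses in-memory data frames (USArrests and iris) rather than ore.frame objects, on the assumption that the transparency layer methods accept the same call shapes. The names fit_x, fit_f, and iris_numeric are illustrative.

```r
# Base R only: with Oracle R Enterprise, the data would be an ore.frame.

# Signature 1: the data is passed as the x argument.
fit_x <- prcomp(USArrests[, c("Murder", "Assault", "UrbanPop")], scale. = TRUE)

# Signature 2: a one-sided formula (nothing to the left of ~) and a data argument.
fit_f <- prcomp(~ Murder + Assault + UrbanPop, data = USArrests, scale. = TRUE)

# Both signatures describe the same analysis.
same_sdev <- isTRUE(all.equal(fit_x$sdev, fit_f$sdev))
print(same_sdev)

# Keep only the numeric columns before the analysis.
# The iris data frame has one factor column (Species) that must be dropped.
numeric_cols <- vapply(iris, is.numeric, logical(1))
iris_numeric <- iris[, numeric_cols]
print(names(iris_numeric))
fit_iris <- prcomp(iris_numeric)
```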

The prcomp function returns an object of class prcomp, and the princomp function returns an object of class princomp.
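A common use of the returned objects is to compute the proportion of variance explained by each component. The following sketch uses base R objects and assumes that the objects returned by the transparency layer methods expose the same sdev component. The printed "Standard deviations" in the listing suggests this, but it is not confirmed here.

```r
# Base R only: the objects come from stats::prcomp and stats::princomp.
pc <- prcomp(USArrests, scale. = TRUE)

# Component variances are the squared standard deviations.
variance <- pc$sdev^2
prop_var <- variance / sum(variance)   # share of total variance per component
cum_var <- cumsum(prop_var)            # cumulative share

print(round(prop_var, 4))
print(round(cum_var, 4))

# The same calculation on a princomp object; unname() drops the Comp.N labels.
pcm <- princomp(USArrests, cor = TRUE)
prop_var_princomp <- unname(pcm$sdev^2 / sum(pcm$sdev^2))
print(round(prop_var_princomp, 4))
```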

For details about the function arguments, invoke help('ore.frame-prcomp') and help('ore.frame-princomp').

Note:

The biplot function is not supported for the objects returned by these transparency layer methods.

Example 3-63 Using the prcomp and princomp Functions

USARRESTS <- ore.push(USArrests)

# Using prcomp

prcomp(USARRESTS)
prcomp(USARRESTS, scale. = TRUE)

# Formula interface
prcomp(~ Murder + Assault + UrbanPop, data = USARRESTS, scale. = TRUE)

# Using princomp

princomp(USARRESTS)
princomp(USARRESTS, cor = TRUE)

# Formula interface
princomp(~ Murder + Assault + UrbanPop, data = USARRESTS, cor = TRUE)

Listing for Example 3-63

R> USARRESTS <- ore.push(USArrests)
R> 
R> # Using prcomp
R>
R> prcomp(USARRESTS)
Standard deviations:
[1] 83.732400 14.212402  6.489426  2.482790

Rotation:
                PC1         PC2         PC3         PC4
Murder   0.04170432 -0.04482166  0.07989066 -0.99492173
Assault  0.99522128 -0.05876003 -0.06756974  0.03893830
UrbanPop 0.04633575  0.97685748 -0.20054629 -0.05816914
Rape     0.07515550  0.20071807  0.97408059  0.07232502

R> prcomp(USARRESTS, scale. = TRUE)
Standard deviations:
[1] 1.5748783 0.9948694 0.5971291 0.4164494

Rotation:
               PC1        PC2        PC3         PC4
Murder   0.5358995 -0.4181809  0.3412327  0.64922780
Assault  0.5831836 -0.1879856  0.2681484 -0.74340748
UrbanPop 0.2781909  0.8728062  0.3780158  0.13387773
Rape     0.5434321  0.1673186 -0.8177779  0.08902432
R> 
R> # Formula interface
R> prcomp(~ Murder + Assault + UrbanPop, data = USARRESTS, scale. = TRUE)
Standard deviations:
[1] 1.3656547 0.9795415 0.4189100

Rotation:
               PC1         PC2        PC3
Murder   0.6672955 -0.30345520  0.6801703
Assault  0.6970818 -0.06713997 -0.7138411
UrbanPop 0.2622854  0.95047734  0.1667309
R>
R> # Using princomp
R>
R> princomp(USARRESTS)
Call:
princomp(USARRESTS)

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4 
82.890847 14.069560  6.424204  2.457837 

 4  variables and  50 observations.
R> princomp(USARRESTS, cor = TRUE)
Call:
princomp(USARRESTS, cor = TRUE)

Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4 
1.5748783 0.9948694 0.5971291 0.4164494 

 4  variables and  50 observations.
R> 
R> # Formula interface
R> princomp(~ Murder + Assault + UrbanPop, data = USARRESTS, cor = TRUE)
Call:
princomp(~Murder + Assault + UrbanPop, data = USARRESTS, cor = TRUE)

Standard deviations:
   Comp.1    Comp.2    Comp.3 
1.3656547 0.9795415 0.4189100 

 3  variables and  50 observations.
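
In the listing, the unscaled standard deviations from prcomp (83.732400, ...) and princomp (82.890847, ...) differ slightly, while the scaled results agree. In base R this difference comes from the variance divisor: prcomp uses n - 1 and princomp uses n. The sketch below checks this in memory with base R. It assumes that the in-database methods follow the same convention, which the numbers in the listing are consistent with.

```r
# Base R only: compare the two functions on the in-memory USArrests data.
n <- nrow(USArrests)                                  # number of observations

sdev_prcomp <- prcomp(USArrests)$sdev                 # variance divisor n - 1
sdev_princomp <- unname(princomp(USArrests)$sdev)     # variance divisor n

# Rescaling the prcomp values by sqrt((n - 1) / n) reproduces the princomp values.
rescaled <- sdev_prcomp * sqrt((n - 1) / n)
print(rescaled)
print(sdev_princomp)

# With scaling (scale. = TRUE, cor = TRUE) the divisor cancels out,
# so the two functions report the same standard deviations.
sdev_prcomp_scaled <- prcomp(USArrests, scale. = TRUE)$sdev
sdev_princomp_cor <- unname(princomp(USArrests, cor = TRUE)$sdev)
print(isTRUE(all.equal(sdev_prcomp_scaled, sdev_princomp_cor)))
```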