Principal Component Analysis (PCA)

Description

Principal Component Analysis (PCA) is a multivariate analysis technique that transforms a set of variables into a smaller set of uncorrelated components that account for the maximum variance in the data set. The calculation in R is done by a singular value decomposition (SVD) of the (centered and possibly scaled) data matrix, rather than by an eigendecomposition (R's eigen function) of the covariance matrix. SVD is generally the preferred method for numerical accuracy.
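As a minimal sketch of the SVD route described above, the following R code compares prcomp with a hand-computed SVD of the centered data matrix and with eigen applied to the covariance matrix. The built-in USArrests dataset is an assumption chosen purely for illustration; it is not part of the tool.

```r
# Illustrative data: built-in USArrests (an assumption, not from the tool)
X <- as.matrix(USArrests)

# PCA as computed by prcomp (SVD of the centered data matrix)
pca <- prcomp(X, center = TRUE, scale. = FALSE)

# The same result by hand: center the columns, then take the SVD
Xc <- scale(X, center = TRUE, scale = FALSE)
sv <- svd(Xc)
sdev_svd <- sv$d / sqrt(nrow(X) - 1)   # singular values -> standard deviations

# The eigen route: eigenvalues of the covariance matrix are the variances
ev <- eigen(cov(X), symmetric = TRUE)
sdev_eig <- sqrt(ev$values)

# Up to rounding error all three agree; loadings agree up to sign
print(all.equal(pca$sdev, sdev_svd))
print(all.equal(pca$sdev, sdev_eig))
print(all.equal(abs(unname(pca$rotation)), abs(sv$v)))
```

The two routes give the same answer on well-conditioned data like this; the SVD route avoids forming the covariance matrix explicitly, which is where the accuracy advantage comes from.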

An explanation of the fundamentals of Principal Component Analysis can be found here

Inputs

Name – the name of your PCA analysis

PCA Dataset Input – select a dataset that contains the variables of interest.

Principal Component Analysis Variables – the set of numeric variables that you would like to include in the analysis.

PCA Scale – a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is FALSE, but in general scaling is advisable.

PCA RetX – a logical value indicating whether the rotated variables (the input data X expressed in principal component coordinates) should be returned. X is the numeric or complex matrix (or data frame) that provides the data for the principal components analysis.
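To illustrate what the PCA Scale and PCA RetX options control, here is a minimal sketch. It assumes that the tool passes these options to the scale. and retx arguments of R's prcomp function; that mapping is an assumption, and the built-in USArrests data is used only for illustration.

```r
X <- USArrests  # illustrative data frame of four numeric variables

# Without scaling, variables with large variances dominate the first component
pca_raw <- prcomp(X, scale. = FALSE)
# With scaling, every variable is standardized to unit variance first
pca_scaled <- prcomp(X, scale. = TRUE)

# Share of total variance captured by the first component in each case
share_raw    <- pca_raw$sdev[1]^2 / sum(pca_raw$sdev^2)
share_scaled <- pca_scaled$sdev[1]^2 / sum(pca_scaled$sdev^2)
print(c(raw = share_raw, scaled = share_scaled))

# retx = TRUE (the prcomp default) keeps the rotated data in the x element;
# retx = FALSE omits it
with_x    <- prcomp(X, scale. = TRUE, retx = TRUE)
without_x <- prcomp(X, scale. = TRUE, retx = FALSE)
print(dim(with_x$x))          # one row per observation, one column per component
print(is.null(without_x$x))   # the x element is absent when retx = FALSE
```

When the variables are measured in different units, the unscaled analysis mostly reflects whichever variable has the largest variance, which is why scaling is usually advisable.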

[ Tools > Statistical Analysis > PCA > enter parameters > Add Tools > Show/Hide > Execute]

Outputs

The PCA tool produces a text box containing the outputs of the analysis. These are described below.

 

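Sdev : the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the covariance or correlation matrix).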

Rotation Matrix : the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors).

Center : the means that were subtracted.

Scale : the scalings applied to each variable. Scale is not shown if PCA Scale is not enabled.

Predict PCA X : the rotated data, i.e., the centered (and scaled, if requested) input data multiplied by the rotation matrix. This is shown only if PCA RetX is enabled.

Standard deviation : the standard deviations of the principal components

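Proportion of Variance : the proportion of the total variance explained by each principal component individually

Cumulative Proportion : the proportion of the total variance explained by each principal component together with all preceding components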
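The outputs listed above can be reproduced with a short R sketch. It assumes that the text box mirrors the elements of a prcomp result and of its summary; that correspondence is an assumption about the tool, and the built-in USArrests data is used only for illustration.

```r
pca <- prcomp(USArrests, scale. = TRUE)   # illustrative data

print(pca$sdev)       # Sdev: standard deviations of the principal components
print(pca$rotation)   # Rotation Matrix: loadings, one column per component
print(pca$center)     # Center: the variable means that were subtracted
print(pca$scale)      # Scale: the scalings applied (FALSE if scaling was not used)

# Rotated data: centered and scaled data multiplied by the rotation matrix
scores <- scale(USArrests, center = pca$center, scale = pca$scale) %*% pca$rotation
print(all.equal(unname(scores), unname(pca$x)))
# predict() with no new data returns the same rotated data
print(all.equal(unname(predict(pca)), unname(pca$x)))

# Proportion of Variance and Cumulative Proportion, computed by hand
var_explained <- pca$sdev^2
prop_var <- var_explained / sum(var_explained)
cum_prop <- cumsum(prop_var)
print(rbind("Standard deviation"     = pca$sdev,
            "Proportion of Variance" = prop_var,
            "Cumulative Proportion"  = cum_prop))

# summary() reports the same three rows, with the proportions rounded
print(summary(pca)$importance)
```

The cumulative proportion is a common guide for deciding how many components to keep, for example by retaining the first few components that together explain most of the variance.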

Advanced

Because of the complexity of the PCA calculation, no formula is provided in this section. Please refer to the following references for details.

References

  1. An Introduction to R 2.14.1 (2011-12-22).
  2. Dunteman, G. H. (1989) Principal Components Analysis, SAGE Publications, Inc, Newbury Park.
  3. Jolliffe, I. T. (2002) Principal Components Analysis (Second Edition), Springer, New York.
  4. Mardia, K. V., J. T. Kent and J. M. Bibby (1979) Multivariate Analysis, Academic Press, London.
  5. Wikipedia, Singular value decomposition (http://en.wikipedia.org/wiki/Singular_value_decomposition).