princomp {stats} | R Documentation |

`princomp` performs a principal components analysis on the given
numeric data matrix and returns the results as an object of class
`princomp`.

    princomp(x, ...)

    ## S3 method for class 'formula':
    princomp(formula, data = NULL, subset, na.action, ...)

    ## Default S3 method:
    princomp(x, cor = FALSE, scores = TRUE, covmat = NULL,
             subset = rep(TRUE, nrow(as.matrix(x))), ...)

    ## S3 method for class 'princomp':
    predict(object, newdata, ...)

`formula` |
a formula with no response variable. |

`data` |
an optional data frame containing the variables in the
formula `formula`. By default the variables are taken from
`environment(formula)`. |

`x` |
a numeric matrix or data frame which provides the data for the principal components analysis. |

`subset` |
an optional vector used to select rows (observations) of the
data matrix `x`. |

`na.action` |
a function which indicates what should happen
when the data contain `NA`s. The default is set by
the `na.action` setting of `options`, and is
`na.fail` if that is unset. The “factory-fresh”
default is `na.omit`. |

`cor` |
a logical value indicating whether the calculation should use the correlation matrix or the covariance matrix. |

`scores` |
a logical value indicating whether the score on each principal component should be calculated. |

`covmat` |
a covariance matrix, or a covariance list as returned by
`cov.wt`, `cov.mve` or
`cov.mcd`.
If supplied, this is used rather than the covariance matrix of
`x`. |

`...` |
arguments passed to or from other methods. If `x` is
a formula one might specify `cor` or `scores`. |

`object` |
Object of class inheriting from `"princomp"` |

`newdata` |
Data frame in which to predict |
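
As a minimal sketch of how `object` and `newdata` fit together in `predict`, the following projects a few rows of the built-in `USArrests` data onto components fitted from the full data set. The variable names (`pc`, `new_scores`) are illustrative only. Because the rows passed as `newdata` were also part of the fit, the predicted scores should match the stored ones.

```r
# Fit on the full data, using the correlation matrix
pc <- princomp(USArrests, cor = TRUE)

# Project the first five rows onto the fitted components
new_scores <- predict(pc, newdata = USArrests[1:5, ])

# The rows were part of the fit, so this should agree with pc$scores
isTRUE(all.equal(new_scores, pc$scores[1:5, ], check.attributes = FALSE))
```

`newdata` must contain the variables used in the fit; they are centred and scaled with the `center` and `scale` stored in `object`.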

`princomp` is a generic function with `"formula"` and `"default"`
methods.

The calculation is done using `eigen` on the correlation or
covariance matrix, as determined by `cor`. This is done for
compatibility with the S-PLUS result. A preferred method of
calculation is to use `svd` on `x`, as is done in `prcomp`.
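
The two routes can be compared by hand. The sketch below is an illustration rather than the internal code of either function: it uses the correlation matrix so that the choice of covariance divisor does not matter, and the names `e`, `s` and `sdev_svd` are made up for the example.

```r
x <- as.matrix(USArrests)

# Route 1: eigendecomposition of the correlation matrix (as princomp does)
e  <- eigen(cor(x), symmetric = TRUE)
pc <- princomp(x, cor = TRUE)
isTRUE(all.equal(unname(pc$sdev), sqrt(e$values)))

# Route 2: singular value decomposition of the centred, scaled data
# (the approach taken by prcomp)
s <- svd(scale(x))
sdev_svd <- s$d / sqrt(nrow(x) - 1)
isTRUE(all.equal(sdev_svd, sqrt(e$values)))
```

Both routes give the same component standard deviations here; the SVD route avoids forming the cross-product matrix explicitly, which is generally better for numerical accuracy.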

Note that the default calculation uses divisor `N` for the
covariance matrix.
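
The effect of the divisor can be checked against `prcomp`, which uses the usual divisor `N - 1`. This is a small sketch on the built-in `USArrests` data with illustrative variable names; it applies to the covariance-based analysis (`cor = FALSE`).

```r
n <- nrow(USArrests)

sd_princomp <- unname(princomp(USArrests)$sdev)  # divisor N
sd_prcomp   <- unname(prcomp(USArrests)$sdev)    # divisor N - 1

# The two sets of standard deviations differ by a factor of sqrt((N - 1) / N)
isTRUE(all.equal(sd_princomp, sd_prcomp * sqrt((n - 1) / n)))
```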

The `print` method for these objects prints the
results in a nice format, and the `plot` method produces
a scree plot (`screeplot`). There is also a
`biplot` method.

If `x` is a formula then the standard NA-handling is applied to
the scores (if requested): see `napredict`.

`princomp` only handles so-called R-mode PCA, that is feature
extraction of variables. If a data matrix is supplied (possibly via a
formula) it is required that there are at least as many units as
variables. For Q-mode PCA use `prcomp`.
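
To illustrate the restriction, the sketch below builds a hypothetical data matrix with fewer units (rows) than variables (columns). `princomp` is expected to signal an error, which is caught here with `tryCatch`, while `prcomp` accepts the same input.

```r
set.seed(1)
wide <- matrix(rnorm(12), nrow = 3, ncol = 4)  # 3 units, 4 variables

# princomp refuses data with fewer units than variables
res <- tryCatch(princomp(wide), error = function(e) e)
inherits(res, "error")

# prcomp handles the same matrix
pr <- prcomp(wide)
length(pr$sdev)
```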

`princomp` returns a list with class `"princomp"`
containing the following components:

`sdev` |
the standard deviations of the principal components. |

`loadings` |
the matrix of variable loadings (i.e., a matrix
whose columns contain the eigenvectors). This is of class
`"loadings"`: see `loadings` for its `print`
method. |

`center` |
the means that were subtracted. |

`scale` |
the scalings applied to each variable. |

`n.obs` |
the number of observations. |

`scores` |
if `scores = TRUE`, the scores of the supplied
data on the principal components. These are non-null only if
`x` was supplied and, if `covmat` was also supplied, only if it
was a covariance list. |

`call` |
the matched call. |

`na.action` |
If relevant. |
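
A short sketch of inspecting these components on the built-in `USArrests` data follows; the object names are illustrative. It also shows the case described under `scores`: when only a plain covariance matrix is passed via `covmat` and no `x` is given, no scores can be computed.

```r
pc <- princomp(USArrests, cor = TRUE)

pc$sdev          # standard deviations of the components
dim(pc$scores)   # one row per observation, one column per component
pc$n.obs         # number of observations used

# Fit from a covariance matrix alone: there is no data to score
pc_cov <- princomp(covmat = cov(USArrests))
is.null(pc_cov$scores)
```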

The signs of the columns of the loadings and scores are arbitrary, and
so may differ between different programs for PCA, and even between
different builds of **R**.
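
The sign ambiguity is harmless because a loading column and its score column flip together. The following sketch, with illustrative names, negates the first component by hand and checks that the product of scores and loadings, which recovers the centred and scaled data, is unchanged.

```r
pc <- princomp(USArrests, cor = TRUE)
L <- unclass(pc$loadings)  # plain matrix of loadings
S <- pc$scores

# Flip the sign of the first component in both loadings and scores
L_flip <- L; L_flip[, 1] <- -L_flip[, 1]
S_flip <- S; S_flip[, 1] <- -S_flip[, 1]

# Both versions reconstruct the same standardised data
z <- scale(as.matrix(USArrests), center = pc$center, scale = pc$scale)
isTRUE(all.equal(S %*% t(L), z, check.attributes = FALSE))
isTRUE(all.equal(S_flip %*% t(L_flip), z, check.attributes = FALSE))
```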

Mardia, K. V., J. T. Kent and J. M. Bibby (1979).
*Multivariate Analysis*, London: Academic Press.

Venables, W. N. and B. D. Ripley (2002).
*Modern Applied Statistics with S*, Springer-Verlag.

`summary.princomp`, `screeplot`,
`biplot.princomp`,
`prcomp`, `cor`, `cov`,
`eigen`.

    ## The variances of the variables in the
    ## USArrests data vary by orders of magnitude, so scaling is appropriate
    data(USArrests)
    (pc.cr <- princomp(USArrests))  # inappropriate
    princomp(USArrests, cor = TRUE) # =^= prcomp(USArrests, scale=TRUE)
    ## Similar, but different:
    ## The standard deviations differ by a factor of sqrt(49/50)

    summary(pc.cr <- princomp(USArrests, cor = TRUE))
    loadings(pc.cr)  ## note that blank entries are small but not zero
    plot(pc.cr) # shows a screeplot.
    biplot(pc.cr)

    ## Formula interface
    princomp(~ ., data = USArrests, cor = TRUE)

    # NA-handling
    USArrests[1, 2] <- NA
    pc.cr <- princomp(~ Murder + Assault + UrbanPop,
                      data = USArrests, na.action=na.exclude, cor = TRUE)
    pc.cr$scores