covMcd {rrcov} | R Documentation |
Compute a multivariate location and scale estimate with a high breakdown point using the Fast MCD (Minimum Covariance Determinant) Estimator.
covMcd(x, cor=FALSE, alpha=1/2, nsamp=500, seed=0, print.it=FALSE)
x |
a matrix or data frame. |
cor |
should the returned result include a correlation matrix? Default is cor = FALSE |
alpha |
Controls the size of the subsets over which the determinant is minimized. Provide a fraction between 0.5 and 1, indicating the fraction of the data over which the determinant is minimized. The resulting subset size lies between (n+p+1)/2, which corresponds to the default alpha = 1/2, and n. |
nsamp |
number of subsets used for initial estimates. Default is nsamp = 500 |
seed |
starting value for random generator. Default is seed = 0 |
print.it |
whether to print intermediate results. Default is print.it = FALSE |
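To make the relationship between alpha and the subset size concrete, the sketch below maps a fraction to a number of observations. The helper name subset_size is hypothetical and not part of rrcov. The two endpoints follow the description of alpha above (about (n+p+1)/2 at alpha = 0.5, and n at alpha = 1); the linear interpolation in between is an assumption and may differ from what covMcd() does internally.

```r
# Hypothetical helper (not an rrcov function): map alpha in [0.5, 1] to a subset size h.
# Endpoints follow the argument description; the interpolation is an assumption.
subset_size <- function(alpha, n, p) {
  stopifnot(alpha >= 0.5, alpha <= 1, n > p)
  h_min <- (n + p + 1) %/% 2                      # smallest subset size, integer division
  floor(2 * h_min - n + 2 * (n - h_min) * alpha)  # grows linearly from h_min to n
}

subset_size(0.5, n = 75, p = 3)   # 39
subset_size(1,   n = 75, p = 3)   # 75
```

Larger values of alpha use more of the data in the raw estimate, at the cost of tolerating fewer outliers.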
The minimum covariance determinant estimator of location and scatter implemented in covMcd() is similar to the existing R function cov.mcd() in MASS. The MCD method looks for the h (> n/2) observations (out of n) whose classical covariance matrix has the lowest possible determinant. The raw MCD estimate of location is then the average of these h points, whereas the raw MCD estimate of scatter is their covariance matrix, multiplied by a consistency factor. Based on these raw MCD estimates, a reweighting step is performed, which increases the finite-sample efficiency considerably; see Pison et al. (2002). The implementation in rrcov uses the Fast MCD algorithm of Rousseeuw and Van Driessen (1999) to approximate the minimum covariance determinant estimator.
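The search for a low-determinant subset can be illustrated with the concentration step (C-step) on which Fast MCD is built. The following is a minimal base-R sketch, not the rrcov implementation: it uses a single random start instead of nsamp starts, omits the consistency factor and the reweighting step, and the function name c_step and the simulated data are assumptions made for illustration.

```r
# One concentration step: from a subset of size h, move to the h points
# closest (in Mahalanobis distance) to that subset's mean and covariance.
c_step <- function(X, idx) {
  h  <- length(idx)
  mu <- colMeans(X[idx, , drop = FALSE])
  S  <- cov(X[idx, , drop = FALSE])
  d2 <- mahalanobis(X, mu, S)          # squared distances for all n rows
  sort(order(d2)[seq_len(h)])          # indices of the h smallest distances
}

set.seed(1)
X <- rbind(matrix(rnorm(80 * 2), ncol = 2),            # 80 regular points (simulated)
           matrix(rnorm(20 * 2, mean = 6), ncol = 2))  # 20 outliers (simulated)
h <- (nrow(X) + ncol(X) + 1) %/% 2

idx  <- sort(sample(nrow(X), h))       # single random starting subset
dets <- det(cov(X[idx, ]))             # track the criterion across steps
for (iter in 1:100) {                  # iteration cap as a safety guard
  new_idx <- c_step(X, idx)
  if (identical(new_idx, idx)) break   # converged: the subset no longer changes
  idx  <- new_idx
  dets <- c(dets, det(cov(X[idx, ])))
}

raw_center <- colMeans(X[idx, ])       # raw location: mean of the h points
raw_cov    <- cov(X[idx, ])            # raw scatter, before any consistency factor
```

The values in dets never increase from one step to the next, which is why repeating the step from many starting subsets gives a practical approximation to the exact minimum.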
A list with components
center |
the final estimate of location. |
cov |
the final estimate of scatter. |
cor |
the (final) estimate of the correlation matrix (only if cor = TRUE). |
crit |
the value of the criterion, i.e. the determinant. |
best |
the best subset found and used for computing the raw estimates. The size of best is equal to quan. |
mah |
Mahalanobis distances of the observations using the final estimate of the location and scatter. |
mcd.wt |
weights of the observations using the final estimate of the location and scatter. |
raw.center |
the raw (not reweighted) estimate of location. |
raw.cov |
the raw (not reweighted) estimate of scatter. |
raw.mah |
Mahalanobis distances of the observations based on the raw estimate of the location and scatter. |
raw.weights |
weights of the observations based on the raw estimate of the location and scatter. |
X |
the input data as a matrix. |
n.obs |
total number of observations. |
alpha |
the size of the subsets over which the determinant is minimized (the default is (n+p+1)/2). |
quan |
the number of observations on which the MCD is based. If quan equals n.obs, the MCD is the classical covariance matrix. |
method |
character string naming the method (Minimum Covariance Determinant). |
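The relationship between the raw components (raw.center, raw.cov, raw.mah, raw.weights) and the final ones (center, cov) can be sketched in base R. This is an illustration under stated assumptions, not the rrcov code: the raw estimates here are a simple median/MAD stand-in rather than the MCD, the 0.975 chi-square cutoff is an assumed (though common) choice, and any small-sample or consistency corrections are left out. Note that base R's mahalanobis() returns squared distances, which is why they are compared with a chi-square quantile.

```r
set.seed(2)
X <- rbind(matrix(rnorm(90 * 2), ncol = 2),            # 90 regular points (simulated)
           matrix(rnorm(10 * 2, mean = 8), ncol = 2))  # 10 outliers (simulated)
p <- ncol(X)

# Stand-in raw estimates (NOT the MCD): coordinatewise median and MAD-based scatter.
raw_center <- apply(X, 2, median)
raw_cov    <- diag(apply(X, 2, mad)^2)

# Squared distances from the raw fit and 0/1 weights from an assumed cutoff.
raw_mah     <- mahalanobis(X, raw_center, raw_cov)
cutoff      <- qchisq(0.975, df = p)
raw_weights <- as.numeric(raw_mah <= cutoff)

# Reweighted estimates: classical mean and covariance of the points with weight 1.
keep   <- raw_weights == 1
center <- colMeans(X[keep, , drop = FALSE])
cov_rw <- cov(X[keep, , drop = FALSE])

which(raw_weights == 0)   # observations flagged as outlying by the raw fit
```

With a result returned by covMcd(), the same kind of inspection would use the mcd.wt and mah components directly.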
P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection. Wiley.
P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.
G. Pison, S. Van Aelst and G. Willems (2002) Small sample corrections for LTS and MCD. Metrika 55, 111–123.
library(rrcov)
data(hbk)
covMcd(hbk.x)