mclustDA {mclust} | R Documentation |
MclustDA training and testing.
mclustDA(trainingData, labels, testData, G=1:6, verbose = FALSE)
trainingData |
A numeric vector, matrix, or data frame of training observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables. |
labels |
A numeric or character vector assigning a class label to each training observation. |
testData |
A numeric vector, matrix, or data frame of test observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables. |
G |
An integer vector specifying the numbers of mixture components (clusters) to be considered for each class. Default: 1:6. |
verbose |
A logical value indicating whether to print a message while the function is in the training phase, which may take some time to complete. Default: FALSE. |
A list with the following components:
testClassification |
mclustDA classification of the test data. |
trainingClassification |
mclustDA classification of the training data. |
VofIindex |
Meila's Variation of Information index, to compare classification of the training data to the known labels. |
summary |
The best model and number of clusters selected for each training class. |
models |
The mixture models used to fit the known classes. |
postProb |
A matrix whose [i,k]th entry is the probability that observation i in the test data belongs to the kth class. |
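As an illustration of how a matrix like postProb can be read, the following base R sketch turns posterior probabilities into class assignments by picking the most probable class in each row. The matrix values are made up for the example, and treating testClassification as this maximum-probability rule is an assumption, not something stated on this page.

```r
## Hypothetical posterior matrix: 4 test observations (rows), 2 classes (columns).
## The values are invented for illustration and are not mclustDA output.
postProb <- matrix(c(0.90, 0.10,
                     0.20, 0.80,
                     0.55, 0.45,
                     0.05, 0.95),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(NULL, c("1", "2")))

## Each row should be a probability distribution over the classes.
stopifnot(all(abs(rowSums(postProb) - 1) < 1e-12))

## Assign each observation to the class with the largest posterior probability
## (assumed rule; ties go to the first class).
mapClass <- colnames(postProb)[max.col(postProb, ties.method = "first")]

## Uncertainty of each assignment: one minus the largest posterior probability.
uncertainty <- 1 - apply(postProb, 1, max)
```

Rows with a large uncertainty value (such as the third row here) are the observations whose class assignment is least clear-cut.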
The following models are compared in Mclust:
"E": spherical, equal variance (one-dimensional)
"V": spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"VVV": ellipsoidal, varying volume, shape, and orientation
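The multivariate model names above describe constraints on the component covariance matrices, which can be written as lambda * D %*% A %*% t(D), with lambda controlling volume, the diagonal matrix A controlling shape, and the orthogonal matrix D controlling orientation. The base R sketch below builds one covariance matrix for each geometric family in two dimensions. All numeric values are illustrative assumptions rather than fitted parameters, and the variable names are hypothetical.

```r
## Illustrative values only (not estimated from any data).
lambda <- 2                        # volume (scale) parameter
A <- diag(c(4, 0.25))              # shape: diagonal, determinant 1
theta <- pi / 6                    # rotation angle used to build the orientation
D <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), 2, 2)   # orthogonal orientation matrix

sigmaEII <- lambda * diag(2)            # spherical: same variance in every direction
sigmaEEI <- lambda * A                  # diagonal: axis-aligned ellipse
sigmaEEE <- lambda * D %*% A %*% t(D)   # ellipsoidal: rotated ellipse
```

In the "E" variants every mixture component shares the same value of the corresponding term, while in the "V" variants each component has its own.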
mclustDA is a simplified function combining mclustDAtrain and mclustDAtest and their summaries.
C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.
C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.
M. Meila (2002). Comparing clusterings. Technical Report 418, Department of Statistics, University of Washington. See http://www.stat.washington.edu/www/research/reports.
plot.mclustDA, mclustDAtrain, mclustDAtest, compareClass, classError
library(mclust)

## create artificial data
n <- 250
set.seed(0)
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n), rep(2,n))

## Not run:
par(pty = "s")
mclust2Dplot(x, classification = xclass, type = "classification", ask = FALSE)
## End(Not run)

## train on the odd-numbered rows, test on the even-numbered rows
odd <- seq(from = 1, to = 2*n, by = 2)
even <- odd + 1
testMclustDA <- mclustDA(trainingData = x[odd, ], labels = xclass[odd],
                         testData = x[even, ])

## classification of the test set, compared with the true labels
clEven <- testMclustDA$testClassification
compareClass(clEven, xclass[even])

## Not run:
plot(testMclustDA, trainingData = x[odd, ], labels = xclass[odd],
     testData = x[even, ])
## End(Not run)
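To make the VofIindex component more concrete, here is a minimal base R sketch of the Variation of Information between two labellings, defined as H(a) + H(b) - 2 I(a, b). The function name variationOfInformation is hypothetical and the natural logarithm is an assumption. The sketch is meant to show the definition and is not claimed to reproduce the value returned by the package.

```r
## Variation of Information between two labellings of the same observations.
## Hypothetical helper for illustration; uses natural logarithms (an assumption).
variationOfInformation <- function(a, b) {
  stopifnot(length(a) == length(b))
  joint <- table(a, b) / length(a)       # joint distribution of label pairs
  pa <- rowSums(joint)                   # marginal of the first labelling
  pb <- colSums(joint)                   # marginal of the second labelling
  entropy <- function(p) {
    p <- p[p > 0]                        # treat 0 * log(0) as 0
    -sum(p * log(p))
  }
  mutualInfo <- entropy(pa) + entropy(pb) - entropy(as.vector(joint))
  entropy(pa) + entropy(pb) - 2 * mutualInfo
}

## Identical partitions (up to relabelling) give 0.
viSame <- variationOfInformation(c(1, 1, 2, 2), c("x", "x", "y", "y"))

## Statistically independent partitions give H(a) + H(b), here 2 * log(2).
viIndep <- variationOfInformation(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

Because the index depends only on which observations are grouped together, it is unaffected by how the classes happen to be named; smaller values indicate closer agreement.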