mclustDA {mclust}    R Documentation

MclustDA discriminant analysis.

Description

Trains an MclustDA (model-based discriminant analysis) classifier on labeled training data and uses it to classify test data.

Usage

mclustDA(trainingData, labels, testData, G=1:6, verbose = FALSE)

Arguments

trainingData A numeric vector, matrix, or data frame of training observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.
labels A numeric or character vector assigning a class label to each training observation.
testData A numeric vector, matrix, or data frame of test observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.
G An integer vector specifying the numbers of mixture components (clusters) to be considered for each class. Default: 1:6.
verbose A logical value indicating whether to print a message that the function is in the training phase, which may take some time to complete. Default: FALSE.

Value

A list with the following components:

testClassification mclustDA classification of the test data.
trainingClassification mclustDA classification of the training data.
VofIindex Meila's Variation of Information index, to compare classification of the training data to the known labels.
summary Gives the best model and number of clusters for each training class.
models The mixture models used to fit the known classes.
postProb A matrix whose [i,k]th entry is the probability that observation i in the test data belongs to the kth class.
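
As an illustration of how a posterior matrix such as postProb relates to a hard classification, the sketch below assigns each observation to the class with the largest posterior probability. The matrix values and the names hardClass and uncertainty are hypothetical, and the assumption that testClassification corresponds to this row-wise maximum is not stated in this page.

```r
## Hypothetical posterior matrix: 4 test observations (rows), 2 classes (columns).
## The values are made up for illustration; they are not mclustDA output.
postProb <- matrix(c(0.90, 0.10,
                     0.20, 0.80,
                     0.55, 0.45,
                     0.05, 0.95),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(NULL, c("1", "2")))

## Each row should be a probability distribution over the classes.
stopifnot(all(abs(rowSums(postProb) - 1) < 1e-8))

## Assign each observation to the class with the largest posterior probability.
hardClass <- colnames(postProb)[max.col(postProb, ties.method = "first")]

## Uncertainty of each assignment: one minus the winning probability.
uncertainty <- 1 - apply(postProb, 1, max)
```

Rows whose uncertainty is close to 0.5 (in the two-class case) mark observations that lie near the boundary between classes.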

Details

The following models are compared in Mclust:

"E": spherical, equal variance (one-dimensional)
"V": spherical, variable variance (one-dimensional)

"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"VVV": ellipsoidal, varying volume, shape, and orientation

mclustDA is a convenience function that combines mclustDAtrain and mclustDAtest, together with their summaries, in a single call.

References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

M. Meila (2002). Comparing clusterings. Technical Report 418, Department of Statistics, University of Washington. See http://www.stat.washington.edu/www/research/reports.

See Also

plot.mclustDA, mclustDAtrain, mclustDAtest, compareClass, classError

Examples

n <- 250 ## create artificial data
set.seed(0)
x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
           matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
xclass <- c(rep(1,n),rep(2,n))

## Not run: 
par(pty = "s")
mclust2Dplot(x, classification = xclass, type="classification", ask=FALSE)
## End(Not run)

odd <- seq(from = 1, to = 2*n, by = 2)
even <- odd + 1
testMclustDA <- mclustDA(trainingData = x[odd, ], labels = xclass[odd], 
                         testData = x[even,])

clEven <- testMclustDA$testClassification ## classification of the test set
compareClass(clEven,xclass[even])
## Not run: 
plot(testMclustDA, trainingData = x[odd, ], labels = xclass[odd], 
              testData = x[even,])
## End(Not run)
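
To make the idea behind comparing a classification with known labels more concrete, the following base R sketch computes a variation of information between two labelings as H(a | b) + H(b | a). The helper varInfo and the small label vectors are hypothetical; the use of natural logarithms and the absence of any normalization are assumptions, so the values need not match VofIindex or compareClass exactly.

```r
## Variation of information between two labelings of the same observations:
## VI = H(a | b) + H(b | a) = 2 * H(a, b) - H(a) - H(b), natural logarithms.
## varInfo is a hypothetical helper, not a function exported by mclust.
varInfo <- function(a, b) {
  stopifnot(length(a) == length(b))
  joint <- table(a, b) / length(a)   # joint distribution of label pairs
  entropy <- function(p) {
    p <- p[p > 0]                    # treat 0 * log(0) as 0
    -sum(p * log(p))
  }
  2 * entropy(joint) - entropy(rowSums(joint)) - entropy(colSums(joint))
}

known  <- c(1, 1, 1, 2, 2, 2)
same   <- c("b", "b", "b", "a", "a", "a")  # same partition, different label names
differ <- c(1, 1, 2, 2, 2, 2)              # one observation moved to the other class

viSame   <- varInfo(known, same)    # zero: the two partitions coincide
viDiffer <- varInfo(known, differ)  # positive: the partitions disagree
```

Because the index depends only on the partitions, renaming the classes leaves it unchanged, which is why it can be used even when the fitted class labels do not match the names of the known labels.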

[Package mclust version 2.1-11 Index]