EMclust {mclust} | R Documentation |
BIC for EM initialized by hierarchical clustering for parameterized Gaussian mixture models.
EMclust(data, G, emModelNames, hcPairs, subset, eps, tol, itmax, equalPro, warnSingular, ...)
data |
A numeric vector, matrix, or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables. |
G |
An integer vector specifying the numbers of mixture components
(clusters) for which the BIC is to be calculated. The default is
1:9 .
|
emModelNames |
A vector of character strings indicating the models to be fitted
in the EM phase of clustering. Possible models:
"E": spherical, equal variance (one-dimensional)
"V": spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation
The default is .Mclust$emModelNames .
|
hcPairs |
A matrix of merge pairs for hierarchical clustering such as produced
by function hc . The default is to compute a hierarchical
clustering tree by applying function hc with
modelName = .Mclust$hcModelName[1] to univariate data and
modelName = .Mclust$hcModelName[2] to multivariate data or a
subset as indicated by the subset argument. The hierarchical
clustering results are used as starting values for EM.
|
subset |
A logical or numeric vector specifying the indices of a subset of the data to be used in the initial hierarchical clustering phase. |
eps |
A scalar tolerance for deciding when to terminate computations due
to computational singularity in covariances. Smaller values of
eps allow computations to proceed nearer to singularity. The
default is .Mclust$eps .
|
tol |
A scalar tolerance for relative convergence of the loglikelihood.
The default is .Mclust$tol .
|
itmax |
An integer limit on the number of EM iterations.
The default is .Mclust$itmax .
|
equalPro |
Logical variable indicating whether or not the mixing proportions are
equal in the model. The default is .Mclust$equalPro .
|
warnSingular |
A logical value indicating whether or not a warning should be issued
whenever a singularity is encountered.
The default is warnSingular=FALSE .
|
... |
Provided so that lists containing elements other than the arguments
listed above can be passed in indirect or list calls with do.call .
|
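The role of ... in list calls can be illustrated with a small base R sketch. The functions fitStub and strictStub below are hypothetical stand-ins, not part of mclust; they only mimic a signature with and without a ... argument.

```r
# Hypothetical stand-in that accepts `...`, as EMclust does.
fitStub <- function(data, G = 1:9, ...) {
  list(n = length(data), G = G)
}

# Hypothetical stand-in without `...`.
strictStub <- function(data, G = 1:9) {
  list(n = length(data), G = G)
}

x <- iris$Sepal.Length

# A settings list carrying an element that neither function uses.
settings <- list(G = 1:3, note = "extra element")

# do.call() passes every list element as a named argument;
# the `...` in fitStub silently absorbs `note`.
res <- do.call(fitStub, c(list(data = x), settings))

# The same list call fails when the function has no `...`.
err <- tryCatch(
  do.call(strictStub, c(list(data = x), settings)),
  error = function(e) conditionMessage(e)
)
```

Here res holds the result of the successful call, while err holds the error message produced by the function that lacks a ... argument.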
Bayesian Information Criterion for the specified mixture models and numbers of clusters. Auxiliary information is returned as attributes.
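As a rough guide to reading these values, the following base R sketch computes a BIC by hand for the simplest case, a single Gaussian component fitted to univariate data. It assumes the convention of the Fraley and Raftery references, BIC = 2 * loglikelihood - (number of parameters) * log(n), under which larger values indicate stronger evidence for a model. The variable names are illustrative only, and the sketch does not call EMclust.

```r
x <- iris$Sepal.Length            # univariate data
n <- length(x)

# Maximum likelihood estimates for one Gaussian component.
mu     <- mean(x)
sigma2 <- mean((x - mu)^2)        # the MLE divides by n, not n - 1

# Maximized loglikelihood of the one-component model.
loglik <- sum(dnorm(x, mean = mu, sd = sqrt(sigma2), log = TRUE))

# Free parameters: one mean and one variance.
npar <- 2

# BIC under the assumed convention (larger is better).
bic <- 2 * loglik - npar * log(n)
bic
```

Under this convention the penalty term grows with both the number of parameters and the sample size, so more flexible models must improve the loglikelihood enough to offset it.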
C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.
C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.
summary.EMclust, EMclustN, hc, me, mclustOptions
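data(iris)
irisMatrix <- as.matrix(iris[,1:4])

# BIC for the default models and numbers of clusters
irisBic <- EMclust(irisMatrix)
irisBic
plot(irisBic)

# Same, with the initial hierarchical clustering restricted to a random subset of 100 observations
irisBic <- EMclust(irisMatrix, subset = sample(1:nrow(irisMatrix), 100))
irisBic
plot(irisBic)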