EMclustN {mclust} | R Documentation |
BIC for EM initialized by hierarchical clustering for parameterized Gaussian mixture models with Poisson noise.
EMclustN(data, G, emModelNames, noise, hcPairs, eps, tol, itmax, equalPro, warnSingular=FALSE, Vinv, ...)
data |
A numeric vector, matrix, or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables. |
G |
An integer vector specifying the numbers of MVN (Gaussian) mixture
components (clusters) for which the BIC is to be calculated. The
default is 0:9 where 0 indicates only a noise
component.
|
emModelNames |
A vector of character strings indicating the models to be fitted
in the EM phase of clustering. Possible models:
"E": spherical, equal variance (one-dimensional)
"V": spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation
The default is .Mclust$emModelNames.
|
noise |
A logical or numeric vector indicating which observations
are initially estimated to be noise in the data. If there is no noise,
EMclust should be used rather than EMclustN.
|
hcPairs |
A matrix of merge pairs for hierarchical clustering such as produced
by function hc. The default is to compute a hierarchical
clustering tree by applying function hc with
modelName = .Mclust$hcModelName[1] to univariate data and
modelName = .Mclust$hcModelName[2] to multivariate data or a
subset as indicated by the subset argument. The hierarchical
clustering results are used as starting values for EM.
|
eps |
A scalar tolerance for deciding when to terminate computations due to
computational singularity in covariances. Smaller values of
eps allow computations to proceed nearer to singularity. The
default is .Mclust$eps.
|
tol |
A scalar tolerance for relative convergence of the loglikelihood.
The default is .Mclust$tol.
|
itmax |
An integer limit on the number of EM iterations.
The default is .Mclust$itmax.
|
equalPro |
Logical variable indicating whether or not the mixing proportions are
equal in the model. The default is .Mclust$equalPro.
|
Vinv |
An estimate of the reciprocal hypervolume of the data region.
The default is determined by applying function
hypvol to the data.
|
warnSingular |
A logical value indicating whether or not a warning should be issued
whenever a singularity is encountered.
The default is warnSingular=FALSE.
|
... |
Provided so that lists containing elements other than the arguments above
can be passed in indirect or list calls with do.call.
|
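The role of the ... argument in list calls can be sketched in base R. The function fitSketch and the list opts below are hypothetical stand-ins introduced only for illustration; they are not part of mclust. The point is only that a function accepting ... tolerates list elements it does not name when called through do.call.

```r
# Hypothetical stand-in for a fitting function that accepts `...`;
# it is NOT an mclust function.
fitSketch <- function(data, G = 1:3, tol = 1e-5, ...) {
  list(n = NROW(data), G = G, tol = tol)
}

# A settings list with one element ("extra") that fitSketch does not name.
opts <- list(G = 1:2, tol = 1e-4, extra = "ignored")

# The unnamed-by-the-function element is absorbed by `...` instead of
# raising an "unused argument" error.
res <- do.call(fitSketch, c(list(data = 1:10), opts))
```

Without ... in the signature, the same call would stop with an unused-argument error.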
Bayesian Information Criterion for the specified mixture models and numbers of clusters. Auxiliary information is returned as attributes.
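As a rough illustration of the returned quantity, the sketch below computes a BIC value under the convention 2 * loglik - npar * log(n), in which larger values indicate a better model. This page does not state the formula, so the convention is an assumption here, and the helper bicSketch and its argument names are hypothetical.

```r
# Hypothetical helper, assuming the "larger is better" BIC convention:
#   loglik : maximized log-likelihood of the fitted mixture
#   npar   : number of independent parameters estimated
#   n      : number of observations
bicSketch <- function(loglik, npar, n) {
  2 * loglik - npar * log(n)
}

# Illustrative inputs only; these numbers are not results from any fit.
bicValue <- bicSketch(loglik = -100, npar = 5, n = 150)
```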
C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.
C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.
summary.EMclustN, EMclust, hc, me, mclustOptions
data(iris)
irisMatrix <- as.matrix(iris[, 1:4])
irisClass <- iris[, 5]

# Per-variable range of the data, padded slightly when sampling noise
b <- apply(irisMatrix, 2, range)
n <- 450
set.seed(0)
poissonNoise <- apply(b, 2, function(x, n)
  runif(n, min = x[1] - 0.1, max = x[2] + 0.1), n = n)

# Random initial guess of which observations are noise
set.seed(0)
noiseInit <- sample(c(TRUE, FALSE), size = nrow(irisMatrix) + n,
                    replace = TRUE, prob = c(3, 1))

Bic <- EMclustN(data = rbind(irisMatrix, poissonNoise), noise = noiseInit)
Bic
plot(Bic)
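
The Vinv argument is described above as an estimate of the reciprocal hypervolume of the data region. A minimal base R sketch of one such estimate, the reciprocal volume of the axis-aligned bounding box, is given here. The names sides and VinvSketch are hypothetical, and hypvol may use a different or more refined estimate, so this is not a reproduction of the default.

```r
data(iris)
x <- as.matrix(iris[, 1:4])

# Side length of the axis-aligned bounding box for each variable
sides <- apply(x, 2, function(col) diff(range(col)))

# Reciprocal of the bounding-box hypervolume (an assumption-laden
# stand-in for the default, not the value hypvol would return)
VinvSketch <- 1 / prod(sides)
```

A value computed this way could in principle be supplied as Vinv, for example when the data region is known to be a box.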