EMclustN {mclust}    R Documentation

BIC for Model-Based Clustering with Poisson Noise

Description

BIC for EM initialized by hierarchical clustering for parameterized Gaussian mixture models with Poisson noise.

Usage

EMclustN(data, G, emModelNames, noise, hcPairs, eps, tol, itmax, 
         equalPro, warnSingular=FALSE, Vinv, ...)

Arguments

data A numeric vector, matrix, or data frame of observations. Categorical variables are not allowed. If a matrix or data frame, rows correspond to observations and columns correspond to variables.
G An integer vector specifying the numbers of MVN (Gaussian) mixture components (clusters) for which the BIC is to be calculated. The default is 0:9 where 0 indicates only a noise component.
emModelNames A vector of character strings indicating the models to be fitted in the EM phase of clustering. Possible models:

"E": spherical, equal variance (one-dimensional)
"V": spherical, variable variance (one-dimensional)
"EII": spherical, equal volume
"VII": spherical, unequal volume
"EEI": diagonal, equal volume, equal shape
"VEI": diagonal, varying volume, equal shape
"EVI": diagonal, equal volume, varying shape
"VVI": diagonal, varying volume, varying shape
"EEE": ellipsoidal, equal volume, shape, and orientation
"EEV": ellipsoidal, equal volume and equal shape
"VEV": ellipsoidal, equal shape
"VVV": ellipsoidal, varying volume, shape, and orientation

The default is .Mclust$emModelNames.
noise A logical or numeric vector indicating which observations are initially estimated to be noise in the data. If there is no noise, EMclust should be used rather than EMclustN.
hcPairs A matrix of merge pairs for hierarchical clustering such as produced by function hc. The default is to compute a hierarchical clustering tree by applying function hc with modelName = .Mclust$hcModelName[1] to univariate data and modelName = .Mclust$hcModelName[2] to multivariate data or a subset as indicated by the subset argument. The hierarchical clustering results are used as starting values for EM.
eps A scalar tolerance for deciding when to terminate computations due to computational singularity in covariances. Smaller values of eps allow computations to proceed nearer to singularity. The default is .Mclust$eps.
tol A scalar tolerance for relative convergence of the loglikelihood. The default is .Mclust$tol.
itmax An integer limit on the number of EM iterations. The default is .Mclust$itmax.
equalPro Logical variable indicating whether or not the mixing proportions are equal in the model. The default is .Mclust$equalPro.
Vinv An estimate of the reciprocal hypervolume of the data region. The default is determined by applying function hypvol to the data.
warnSingular A logical value indicating whether or not a warning should be issued whenever a singularity is encountered. The default is warnSingular=FALSE.
... Provided so that lists with elements other than the arguments can be passed in indirect or list calls with do.call.
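
The Vinv argument above is described as the reciprocal hypervolume of the data region. As a rough illustration only, the following base R sketch computes the reciprocal volume of the axis-aligned bounding box of the data. The helper name boundingBoxVinv is hypothetical and is not part of mclust; function hypvol may use a different estimate of the data region.

```r
# Hypothetical helper (not part of mclust): reciprocal volume of the
# axis-aligned bounding box of the data.
boundingBoxVinv <- function(data) {
  data <- as.matrix(data)   # a vector becomes a one-column matrix
  # side length of the bounding box along each variable
  sides <- apply(data, 2, function(x) diff(range(x)))
  1 / prod(sides)
}

x <- cbind(c(0, 2, 1), c(0, 1, 4))   # side lengths 2 and 4
boundingBoxVinv(x)                    # 1 / (2 * 4) = 0.125
```

A value computed this way could be supplied as Vinv when the default estimate is not wanted; a tighter data region gives a larger Vinv and hence a higher noise density.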

Value

Bayesian Information Criterion for the specified mixture models and numbers of clusters. Auxiliary information is returned as attributes.
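
As a minimal sketch of how a single BIC value is formed, assuming the sign convention used in mclust (twice the maximized loglikelihood minus a penalty of log(n) per estimated parameter, so that larger values are preferred): the function name bicSketch and the input numbers below are illustrative only and are not taken from the package.

```r
# Hypothetical helper (not part of mclust): BIC in the "larger is better"
# convention, 2 * loglik - npar * log(n).
bicSketch <- function(loglik, npar, n) {
  2 * loglik - npar * log(n)
}

# Illustrative inputs: loglikelihood -100, 5 parameters, 150 observations
bicSketch(loglik = -100, npar = 5, n = 150)
```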

References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

See Also

summary.EMclustN, EMclust, hc, me, mclustOptions

Examples

data(iris)
irisMatrix <- as.matrix(iris[,1:4])
irisClass <- iris[,5]

b <- apply( irisMatrix, 2, range)
n <- 450
set.seed(0)
poissonNoise <- apply(b, 2, function(x, n) 
                      runif(n, min = x[1]-0.1, max = x[2]+0.1), n = n)
set.seed(0)
noiseInit <- sample(c(TRUE,FALSE),size=150+450,replace=TRUE,prob=c(3,1))
Bic <-  EMclustN(data=rbind(irisMatrix, poissonNoise), noise = noiseInit)
Bic
plot(Bic)

[Package mclust version 2.1-11 Index]