quality {evaluomeR}    R Documentation

Goodness of classifications.

Description

The goodness of the classifications is assessed by validating the generated clusters. For this purpose, we use the Silhouette width as the validity index. This index computes and compares the quality of the clustering outputs obtained for the different metrics, which makes it possible to measure the goodness of the classification for both instances and metrics. More precisely, the Silhouette width of an instance measures how similar that instance is to the other instances in its own cluster and how dissimilar it is to the instances in all other clusters. The average over all instances quantifies how appropriately the instances are clustered. Kaufman and Rousseeuw suggested interpreting the global Silhouette width score as the effectiveness of the clustering structure. Silhouette widths lie in the range [-1,1]. Following their guideline, a score of at most 0.25 indicates no substantial clustering structure; a score between 0.26 and 0.50 indicates a weak structure that could be artificial; a score between 0.51 and 0.70 indicates a reasonable structure; and a score between 0.71 and 1 indicates a strong structure.
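As an illustration of the index described above, the following is a minimal base R sketch of the Silhouette width computation. It is not the package's internal code: the helper name silhouette_widths, the toy points and the fixed cluster labels are assumptions made for this example, and it assumes at least two clusters.

```r
# Hypothetical helper (not part of evaluomeR): Silhouette width of every
# instance, given pairwise distances and a cluster label per instance.
# Assumes at least two distinct clusters.
silhouette_widths <- function(d, labels) {
  d <- as.matrix(d)
  clusters <- unique(labels)
  vapply(seq_len(nrow(d)), function(i) {
    own <- labels == labels[i]
    own[i] <- FALSE                       # exclude the instance itself
    if (!any(own)) return(0)              # singleton cluster: width set to 0
    a <- mean(d[i, own])                  # mean distance within own cluster
    b <- min(vapply(setdiff(clusters, labels[i]), function(cl) {
      mean(d[i, labels == cl])            # mean distance to another cluster
    }, numeric(1)))                       # keep the nearest other cluster
    (b - a) / max(a, b)
  }, numeric(1))
}

# Toy data: two well-separated groups of three points (illustrative only).
points <- rbind(c(0, 0), c(0, 1), c(1, 0),
                c(10, 10), c(10, 11), c(11, 10))
labels <- rep(c(1L, 2L), each = 3)

widths <- silhouette_widths(dist(points), labels)
global_width <- mean(widths)              # average over all instances
```

Here widths holds one value per instance and global_width is the average that the interpretation guideline applies to.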

Usage

quality(data, k = 5, getImages = TRUE)

Arguments

data

A SummarizedExperiment. The SummarizedExperiment must contain an assay with the following structure: a valid header with names. The first column of the header is the ID or name of the instance of the dataset (e.g., ontology, pathway, etc.) on which the metrics are measured. The other columns of the header contain the names of the metrics. The rows contain the measurements of the metrics for each instance in the dataset.
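The tabular layout described above can be pictured with a small base R table. This is only a sketch of the expected shape: the column names Description, metricA and metricB and all values are hypothetical, and wrapping the table in a SummarizedExperiment is not shown here.

```r
# Hypothetical toy table: the first column identifies each instance,
# the remaining columns hold one metric each (names and values are made up).
toy <- data.frame(
  Description = c("ont1", "ont2", "ont3"),
  metricA     = c(0.12, 0.55, 0.91),
  metricB     = c(10, 42, 7),
  stringsAsFactors = FALSE
)

ids          <- toy[[1]]                  # instance IDs (first column)
measurements <- as.matrix(toy[, -1])      # one row per instance, one column per metric
rownames(measurements) <- ids
```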

k

Positive integer. Number of clusters, in the range [2,15].

getImages

Boolean. If TRUE, a plot is displayed.

Value

A SummarizedExperiment containing the silhouette width measurements and the cluster sizes for a clustering with k clusters.

References

Kaufman L, Rousseeuw PJ (2009). Finding groups in data: an introduction to cluster analysis, volume 344. John Wiley & Sons.

Examples

# Using example data from our package
data("ontMetrics")
result = quality(ontMetrics, k=4)


[Package evaluomeR version 1.0.0 Index]