guess_best_k-ConsensusPartition-method {cola}    R Documentation

Guess the best number of partitions

Description

Guess the best number of partitions

Usage

## S4 method for signature 'ConsensusPartition'
guess_best_k(object, rand_index_cutoff = 0.95)

Arguments

object

a ConsensusPartition-class object.

rand_index_cutoff

the cutoff for the Rand index. If the Rand index between the partition for a given k and the partition for the previous k (k - 1) is larger than this value, that k is filtered out.

Details

The best k is chosen by vote among four candidates: 1) the k with the maximal cophcor value, 2) the k with the minimal PAC value, 3) the k with the maximal mean silhouette value and 4) the k with the maximal concordance value.
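The voting idea can be sketched in base R as follows. This is only an illustration, not the code used by guess_best_k: the stats table and its numbers are made up, and breaking ties in favour of the smallest k is an assumption of the sketch.

```r
# Hypothetical per-k statistics; the numbers are made up for illustration
# and are not output from cola.
stats <- data.frame(
  k               = 2:5,
  cophcor         = c(0.99, 0.98, 0.95, 0.93),
  PAC             = c(0.10, 0.05, 0.20, 0.25),
  mean_silhouette = c(0.90, 0.92, 0.80, 0.75),
  concordance     = c(0.96, 0.97, 0.90, 0.88)
)

# Each statistic casts one vote for the k it prefers.
votes <- c(
  cophcor         = stats$k[which.max(stats$cophcor)],
  PAC             = stats$k[which.min(stats$PAC)],
  mean_silhouette = stats$k[which.max(stats$mean_silhouette)],
  concordance     = stats$k[which.max(stats$concordance)]
)

# Tally the votes. table() orders the candidate k values increasingly and
# which.max() returns the first maximum, so ties go to the smallest k
# (an assumption of this sketch).
tally <- table(votes)
best_k <- as.integer(names(tally)[which.max(tally)])
best_k
```

Here cophcor votes for k = 2 and the other three statistics vote for k = 3, so k = 3 wins the vote.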

In some scenarios, a partition with k groups looks better than one with k - 1 groups (e.g. in the sense of a better silhouette score) only because one tiny group of samples has been split off. In that case it is better to put those samples back into their original group, which improves the robustness of the subgrouping. Users can set a cutoff on the Rand index with rand_index_cutoff to remove or reduce the effect of such circumstances.
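To illustrate why a high Rand index flags this situation, the sketch below computes the Rand index (the fraction of sample pairs on which two partitions agree) in base R. The helper rand_index and the toy labels are hypothetical, and cola may compute the value differently internally.

```r
# Rand index: fraction of sample pairs that are treated the same way
# (together in both partitions, or apart in both) by labelings a and b.
# rand_index is a hypothetical helper, not a function from cola.
rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  pairs <- upper.tri(same_a)  # count each unordered pair once
  mean(same_a[pairs] == same_b[pairs])
}

# Toy labels for 20 samples: moving from k = 2 to k = 3 only splits
# a single sample off the second group.
part_k2 <- rep(1:2, each = 10)
part_k3 <- part_k2
part_k3[20] <- 3

ri <- rand_index(part_k2, part_k3)
ri > 0.95  # compare with the default rand_index_cutoff
```

In this toy case only the 9 pairs that link the split-off sample to its former group change status, so 181 of the 190 pairs agree. The Rand index is therefore above the default cutoff of 0.95, and k = 3 would be treated as adding little beyond k = 2.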

It is hard, and perhaps impossible, to say which k is the best one. The guess_best_k function only suggests a reasonable k. Users still need to look at the plots (e.g. from the select_partition_number or consensus_heatmap functions), or even check whether the subgrouping gives reasonable signatures with get_signatures, to pick the k that best explains their study.

Value

The best k.

Author(s)

Zuguang Gu <z.gu@dkfz.de>

Examples

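# load the example result object shipped with the package
data(cola_rl)
# take the result for the "sd" and "kmeans" method combination
obj = cola_rl["sd", "kmeans"]
# suggest a reasonable k for this result
guess_best_k(obj)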

[Package cola version 1.0.0 Index]