guess_best_k-ConsensusPartition-method {cola} | R Documentation |
Guess the best number of partitions
## S4 method for signature 'ConsensusPartition'
guess_best_k(object, rand_index_cutoff = 0.95)
object | a ConsensusPartition object. |
rand_index_cutoff | if the Rand index of a partition compared to the previous k is larger than this value, that k is filtered out. |
The best k is chosen by vote among four criteria: 1) the k with the maximal cophcor (cophenetic correlation coefficient) value, 2) the k with the minimal PAC value, 3) the k with the maximal mean silhouette value and 4) the k with the maximal concordance value.
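The voting idea can be pictured with a small base-R sketch. The statistics table is made up for illustration, and vote_best_k is a hypothetical helper, not part of cola. How ties are resolved is not stated here, so the sketch assumes the smallest tied k wins.

```r
# Made-up statistics for k = 2..5 (illustration only, not cola output)
stats <- data.frame(
  k           = 2:5,
  cophcor     = c(0.98, 0.95, 0.90, 0.88),
  PAC         = c(0.05, 0.10, 0.20, 0.25),
  silhouette  = c(0.90, 0.92, 0.70, 0.65),
  concordance = c(0.97, 0.93, 0.85, 0.80)
)

# Hypothetical helper: each criterion nominates one k,
# and the most frequently nominated k is returned
vote_best_k <- function(stats) {
  votes <- c(
    stats$k[which.max(stats$cophcor)],
    stats$k[which.min(stats$PAC)],
    stats$k[which.max(stats$silhouette)],
    stats$k[which.max(stats$concordance)]
  )
  tally <- table(votes)
  # Assumption: ties go to the smallest k (which.max returns the first maximum)
  as.integer(names(tally)[which.max(tally)])
}

best <- vote_best_k(stats)
```

With these made-up numbers, three of the four criteria nominate k = 2 and the silhouette criterion nominates k = 3, so best is 2.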
In some scenarios, a partition with k groups scores better than one with k - 1 groups (e.g. by silhouette score) only because one tiny group of samples has been split off. In that case it is better to put those samples back into their original group to improve the robustness of the subgrouping. Users can set a cutoff on the Rand index with rand_index_cutoff to remove or reduce the effect of such circumstances.
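To see why the Rand index flags this situation, consider the following base-R sketch. The class labels are made up, and rand_index is a hypothetical helper written for this illustration. It is not necessarily how cola computes the statistic internally.

```r
# Rand index: fraction of sample pairs on which two partitions agree
# (both put the pair in the same group, or both put it in different groups)
rand_index <- function(a, b) {
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  pairs <- upper.tri(same_a)   # count each unordered pair once
  mean(same_a[pairs] == same_b[pairs])
}

# Made-up class labels for 20 samples
cl_k2 <- rep(1:2, each = 10)
cl_k3 <- cl_k2
cl_k3[20] <- 3   # k = 3 only splits a single sample off group 2

ri <- rand_index(cl_k2, cl_k3)

# With the default cutoff, k = 3 is kept only if the partition changed enough
keep_k3 <- ri <= 0.95
```

Here only 9 of the 190 sample pairs change status, so the Rand index is 181/190 (about 0.953), which is above the default cutoff of 0.95, and k = 3 would be filtered out.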
Honestly, it is hard, maybe impossible, to say which k is the best one. The guess_best_k function only suggests a reasonable k. Users still need to look at the plots (e.g. from the select_partition_number or consensus_heatmap functions), or even check whether the subgrouping gives reasonable signatures with get_signatures, to pick the k that best explains their study.
The best k.
Zuguang Gu <z.gu@dkfz.de>
data(cola_rl)
obj = cola_rl["sd", "kmeans"]
guess_best_k(obj)