bpFitGrid {scPCA} | R Documentation |
Identify the Optimal Contrastive and Penalty Parameters in Parallel
Description
This function is used to automatically select the optimal
contrastive parameter and L1 penalty term for scPCA based on a clustering
algorithm and average silhouette width. Analogous to fitGrid, but replaces
all lapply calls with bplapply.
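The swap from lapply to bplapply can be illustrated with a minimal sketch. The function score_contrast below is a hypothetical stand-in for the per-contrast work, not scPCA's internal code, and the sketch falls back to lapply when BiocParallel is not installed. SerialParam() is used only to keep the example deterministic; by default bplapply uses the currently registered BiocParallel backend.

```r
# Hypothetical stand-in for the work done for each contrastive parameter;
# this is NOT scPCA's internal code.
score_contrast <- function(contrast) {
  contrast^2
}

contrasts <- c(0.1, 1, 10)

# Serial version, as in an lapply-based grid search.
serial_res <- lapply(contrasts, score_contrast)

# Parallel version: bplapply() has the same calling shape as lapply(),
# plus a BPPARAM argument selecting the backend.
if (requireNamespace("BiocParallel", quietly = TRUE)) {
  parallel_res <- BiocParallel::bplapply(
    contrasts,
    score_contrast,
    BPPARAM = BiocParallel::SerialParam()
  )
} else {
  # Fallback so the sketch still runs without BiocParallel installed.
  parallel_res <- serial_res
}

# Both versions return a list with one element per contrast.
length(parallel_res)
```

Because the two calls share the same shape, the choice of backend (serial, multicore, or cluster) can be changed without touching the function being applied.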
Usage
bpFitGrid(
  target,
  target_valid = NULL,
  center,
  scale,
  c_contrasts,
  contrasts,
  penalties,
  n_eigen,
  alg,
  clust_method = c("kmeans", "pam", "hclust"),
  n_centers,
  max_iter = 10,
  linkage_method = "complete",
  clusters = NULL,
  eigdecomp_tol = 1e-10,
  eigdecomp_iter = 1000
)
Arguments
target |
The target (experimental) data set, in a standard format such
as a data.frame or matrix.
|
target_valid |
A holdout set of the target (experimental) data set, in a
standard format such as a data.frame or matrix. NULL by
default, but used by cvSelectParams for cross-validated
selection of the contrastive and penalization parameters.
|
center |
A logical indicating whether the target and background
data sets should be centered to mean zero.
|
scale |
A logical indicating whether the target and background
data sets should be scaled to unit variance.
|
c_contrasts |
A list of contrastive covariances.
|
contrasts |
A numeric vector of the contrastive parameters used
to compute the contrastive covariances.
|
penalties |
A numeric vector of the penalty terms.
|
n_eigen |
A numeric indicating the number of eigenvectors to be
computed.
|
alg |
A character indicating the SPCA algorithm used to sparsify
the contrastive loadings. Currently supports iterative for the
Zou et al. (2006) implementation, var_proj
for the non-randomized Erichson et al. (2018)
solution, and rand_var_proj for the randomized
Erichson et al. (2018) result.
|
clust_method |
A character specifying the clustering method to
use for choosing the optimal contrastive parameter. Currently, this is
limited to k-means, partitioning around medoids (PAM), or
hierarchical clustering. The default is k-means clustering.
|
n_centers |
A numeric giving the number of centers to use in the
clustering algorithm.
|
max_iter |
A numeric giving the maximum number of iterations to
be used in k-means clustering, defaulting to 10.
|
linkage_method |
A character specifying the agglomerative linkage
method to be used if clust_method = "hclust". The options are
ward.D2, single, complete,
average, mcquitty, median, and centroid. The
default is complete.
|
clusters |
A numeric vector of cluster labels for observations in
the target data. Defaults to NULL, but is otherwise used to
identify the optimal set of hyperparameters when fitting scPCA and the
automated version of cPCA.
|
eigdecomp_tol |
A numeric providing the level of precision used by
eigendecomposition calculations. Defaults to 1e-10.
|
eigdecomp_iter |
A numeric indicating the maximum number of
iterations performed by eigendecomposition calculations. Defaults to
1000.
|
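How several of these arguments fit together can be sketched in base R plus the cluster package. This is a simplified illustration under stated assumptions, not the package's implementation: the toy data and the background set are hypothetical, each contrastive covariance is taken to be the target covariance minus the contrastive parameter times the background covariance, and the sparsification step (penalties, alg) is left out entirely.

```r
set.seed(1)

# Hypothetical toy data: a target set with two groups and a background set.
target <- rbind(
  matrix(rnorm(50 * 5, mean = 0), ncol = 5),
  matrix(rnorm(50 * 5, mean = 3), ncol = 5)
)
background <- matrix(rnorm(100 * 5), ncol = 5)

# center / scale: column-center both data sets, no scaling to unit variance.
target_c <- scale(target, center = TRUE, scale = FALSE)
background_c <- scale(background, center = TRUE, scale = FALSE)

# contrasts and c_contrasts: one contrastive covariance per parameter.
contrasts <- c(0.1, 1, 10)
c_contrasts <- lapply(contrasts, function(gamma) {
  cov(target_c) - gamma * cov(background_c)
})

n_eigen <- 2    # number of leading eigenvectors to keep
n_centers <- 2  # number of clusters
max_iter <- 10  # k-means iteration cap

# Average silhouette width of a k-means clustering for each contrast.
avg_sil <- vapply(c_contrasts, function(cc) {
  loadings <- eigen(cc, symmetric = TRUE)$vectors[, seq_len(n_eigen), drop = FALSE]
  proj <- target_c %*% loadings
  km <- kmeans(proj, centers = n_centers, iter.max = max_iter)
  sil <- cluster::silhouette(km$cluster, dist(proj))
  mean(sil[, "sil_width"])
}, numeric(1))

# The contrast with the largest average silhouette width is selected.
best_contrast <- contrasts[which.max(avg_sil)]
best_contrast
```

In the full procedure, the same criterion is evaluated over pairs of contrastive parameters and penalty terms rather than over contrasts alone.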
Value
A list similar to that output by prcomp:
rotation - the matrix of variable loadings
x - the rotated data: the data, centered and scaled if requested,
multiplied by the rotation matrix
contrast - the optimal contrastive parameter
penalty - the optimal L1 penalty term
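The relationship between rotation and x mirrors that of prcomp, which the following sketch checks on toy data. The list fit_like is a hypothetical mock of the return shape only; its contrast and penalty values are placeholders, not results produced by bpFitGrid.

```r
set.seed(42)

# Toy data and an ordinary PCA fit for comparison.
dat <- matrix(rnorm(20 * 4), ncol = 4)
pca <- prcomp(dat, center = TRUE, scale. = FALSE)

# x is the centered data multiplied by the rotation (loadings) matrix.
x_manual <- scale(dat, center = TRUE, scale = FALSE) %*% pca$rotation
same_x <- isTRUE(all.equal(unname(pca$x), unname(x_manual)))

# Hypothetical mock of the returned list's shape (placeholder values only).
fit_like <- list(
  rotation = pca$rotation[, 1:2],  # variables x n_eigen loadings
  x = pca$x[, 1:2],                # observations x n_eigen rotated data
  contrast = 1,                    # placeholder contrastive parameter
  penalty = 0.1                    # placeholder L1 penalty term
)

dim(fit_like$rotation)
dim(fit_like$x)
```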
References
Erichson NB, Zeng P, Manohar K, Brunton SL, Kutz JN, Aravkin AY (2018).
“Sparse Principal Component Analysis via Variable Projection.”
ArXiv, abs/1804.00341.
Zou H, Hastie T, Tibshirani R (2006).
“Sparse principal component analysis.”
Journal of computational and graphical statistics, 15(2), 265–286.
[Package scPCA version 1.6.2 Index]