BASiCS_DetectHVG {BASiCS}    R Documentation
Functions to detect highly and lowly variable genes.
If the BASiCS_Chain object was generated using the regression approach, BASiCS identifies the most highly (or lowly) variable genes from the posterior distributions of the residual over-dispersion parameters (epsilon). Otherwise, the older approach is used, which first performs a variance decomposition.
BASiCS_DetectHVG(Chain, PercentileThreshold = 0.9, VarThreshold = NULL,
  ProbThreshold = NULL, EFDR = 0.1, OrderVariable = "Prob",
  Plot = FALSE, ...)

BASiCS_DetectLVG(Chain, PercentileThreshold = 0.1, VarThreshold = NULL,
  ProbThreshold = NULL, EFDR = 0.1, OrderVariable = "Prob",
  Plot = FALSE, ...)
Chain
An object of class BASiCS_Chain.
PercentileThreshold
Threshold to detect a percentile of variable genes (must be a positive value, between 0 and 1). Defaults: 0.9 for HVG (top 10 percent), 0.1 for LVG (bottom 10 percent).
VarThreshold
Variance contribution threshold (must be a positive value, between 0 and 1). This is only used when the BASiCS non-regression model was used to generate the Chain object.
ProbThreshold
Optional parameter. Posterior probability threshold (must be a positive value, between 0 and 1).
EFDR
Target for the expected false discovery rate related to HVG/LVG detection (default = 0.10).
OrderVariable
Ordering variable for output.
Possible values:
Plot
If
...
Graphical parameters (see
See vignette
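The percentile-based rule used with regression chains can be illustrated on simulated draws. The following is a minimal sketch, not BASiCS's internal code: it assumes that the cut-off is the chosen percentile of the per-gene posterior medians of epsilon, and that the probability for each gene is the share of MCMC draws above that cut-off. The matrix eps_draws and all other names are hypothetical stand-ins, and the draws are simulated rather than taken from a BASiCS_Chain object.

```r
# Toy stand-in for MCMC draws of the residual over-dispersion parameters:
# rows are iterations, columns are genes (simulated values, not BASiCS output).
set.seed(1)
n_iter <- 500
n_genes <- 20
gene_shift <- seq(-1, 1, length.out = n_genes)
eps_draws <- matrix(rnorm(n_iter * n_genes), nrow = n_iter, ncol = n_genes)
eps_draws <- sweep(eps_draws, 2, gene_shift, "+")  # shift column j by gene_shift[j]

# Posterior median of epsilon for each gene
eps_median <- apply(eps_draws, 2, median)

# Assumed rule: the cut-off is the chosen percentile of the posterior medians
percentile_threshold <- 0.9
cutoff <- quantile(eps_median, probs = percentile_threshold, names = FALSE)

# Tail posterior probability: share of draws above the cut-off, per gene
prob_hvg <- colMeans(eps_draws > cutoff)
```

For the lowly variable case the same idea would apply with percentile_threshold = 0.1 and the comparison reversed (eps_draws < cutoff).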
BASiCS_DetectHVG returns a list of 4 elements:
Table
Matrix whose columns can contain:
  GeneIndex
  Vector of length q.bio. Gene index as in the order present in the analysed SingleCellExperiment.
  GeneName
  Vector of length q.bio. Gene name as in the order present in the analysed SingleCellExperiment.
  Mu
  Vector of length q.bio. For each biological gene, posterior median of gene-specific mean expression parameters μ_i.
  Delta
  Vector of length q.bio. For each biological gene, posterior median of gene-specific biological over-dispersion parameter δ_i.
  Sigma
  Vector of length q.bio. For each biological gene, proportion of the total variability that is due to a biological heterogeneity component.
  Epsilon
  Vector of length q.bio. For each biological gene, posterior median of gene-specific residual over-dispersion parameter ε_i.
  Prob
  Vector of length q.bio. For each biological gene, probability of being highly variable according to the given thresholds.
  HVG
  Vector of length q.bio. For each biological gene, indicator of being detected as highly variable according to the given thresholds.
  LVG
  Vector of length q.bio. For each biological gene, indicator of being detected as lowly variable according to the given thresholds.
ProbThreshold
Posterior probability threshold.
EFDR
Expected false discovery rate for the given thresholds.
EFNR
Expected false negative rate for the given thresholds.
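To make the relationship between Prob, ProbThreshold, EFDR and EFNR concrete, here is a minimal sketch using the standard posterior-probability definitions of the expected false discovery and false negative rates. It is an illustration under stated assumptions, not the package's own code: the helper names efdr, efnr and pick_threshold, the grid of candidate thresholds, and the rule of picking the threshold whose EFDR is closest to the target are all hypothetical choices made for this example.

```r
# Expected false discovery rate: average of (1 - prob) over genes called
# (prob strictly above the threshold).
efdr <- function(prob, threshold) {
  called <- prob > threshold
  if (!any(called)) return(NA_real_)
  sum((1 - prob)[called]) / sum(called)
}

# Expected false negative rate: average of prob over genes not called.
efnr <- function(prob, threshold) {
  not_called <- prob <= threshold
  if (!any(not_called)) return(NA_real_)
  sum(prob[not_called]) / sum(not_called)
}

# Assumed calibration rule: scan a grid and keep the threshold whose EFDR is
# closest to the target (ties resolved in favour of the smallest threshold).
pick_threshold <- function(prob, target = 0.1,
                           grid = seq(0.5, 0.9995, by = 0.0005)) {
  rates <- vapply(grid, function(thr_i) efdr(prob, thr_i), numeric(1))
  usable <- !is.na(rates)
  if (!any(usable)) return(NA_real_)
  grid[usable][which.min(abs(rates[usable] - target))]
}

# Toy posterior probabilities for seven genes (made-up values)
prob <- c(0.99, 0.95, 0.90, 0.60, 0.30, 0.10, 0.05)
thr <- pick_threshold(prob, target = 0.10)
achieved_efdr <- efdr(prob, thr)
achieved_efnr <- efnr(prob, thr)
```

With only a handful of genes the achievable EFDR values are coarse, so the calibrated rate can sit noticeably above or below the target; with thousands of genes the grid search tracks the target much more closely.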
Catalina A. Vallejos cnvallej@uc.cl
Nils Eling eling@ebi.ac.uk
Vallejos, Marioni and Richardson (2015). PLoS Computational Biology.
# Loads short example chain (non-regression implementation)
data(ChainSC)

# Highly and lowly variable genes detection (within a single group of cells)
DetectHVG <- BASiCS_DetectHVG(ChainSC, VarThreshold = 0.60,
                              EFDR = 0.10, Plot = TRUE)
DetectLVG <- BASiCS_DetectLVG(ChainSC, VarThreshold = 0.40,
                              EFDR = 0.10, Plot = TRUE)

# Loads short example chain (regression implementation)
data(ChainSCReg)

# Highly and lowly variable genes detection (within a single group of cells)
DetectHVG <- BASiCS_DetectHVG(ChainSCReg, PercentileThreshold = 0.90,
                              EFDR = 0.10, Plot = TRUE)
DetectLVG <- BASiCS_DetectLVG(ChainSCReg, PercentileThreshold = 0.10,
                              EFDR = 0.10, Plot = TRUE)
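Once a result is available, the detected genes can be pulled out of the Table element. The sketch below runs on a hand-made stand-in that mirrors the documented list layout (Table, ProbThreshold, EFDR, EFNR) so that it works without the package installed. The object res and its values are hypothetical, and the code assumes Table behaves like a data frame; if it is a plain matrix in your version, wrapping it in as.data.frame() first may be needed.

```r
# Hypothetical stand-in mirroring the documented return value (made-up values)
res <- list(
  Table = data.frame(
    GeneIndex = 1:4,
    GeneName = c("GeneA", "GeneB", "GeneC", "GeneD"),
    Prob = c(0.97, 0.40, 0.88, 0.12),
    HVG = c(TRUE, FALSE, TRUE, FALSE),
    stringsAsFactors = FALSE
  ),
  ProbThreshold = 0.8,
  EFDR = 0.075,
  EFNR = 0.26
)

# Keep only genes flagged as highly variable
hvg_rows <- res$Table[res$Table$HVG, ]

# Order them by decreasing posterior probability
hvg_rows <- hvg_rows[order(hvg_rows$Prob, decreasing = TRUE), ]
hvg_names <- hvg_rows$GeneName
```

The same pattern would apply to the output of BASiCS_DetectLVG, filtering on the LVG column instead.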