| SingleR {SingleR} | R Documentation |
Returns the best annotation for each cell in a test dataset, given a labelled reference dataset in the same feature space.
SingleR(
test,
ref,
labels,
method = c("single", "cluster"),
clusters = NULL,
genes = "de",
sd.thresh = 1,
de.method = "classic",
de.n = NULL,
de.args = list(),
aggr.ref = FALSE,
aggr.args = list(),
recompute = TRUE,
quantile = 0.8,
fine.tune = TRUE,
tune.thresh = 0.05,
prune = TRUE,
assay.type.test = "logcounts",
assay.type.ref = "logcounts",
check.missing = TRUE,
BNPARAM = KmknnParam(),
BPPARAM = SerialParam()
)
test |
A numeric matrix of single-cell expression values where rows are genes and columns are cells. Alternatively, a SummarizedExperiment object containing such a matrix. |
ref |
A numeric matrix of (usually log-transformed) expression values from a reference dataset,
or a SummarizedExperiment object containing such a matrix;
see Alternatively, a list or List of SummarizedExperiment objects or numeric matrices containing multiple references. Row names may be different across entries but only the intersection will be used, see Details. |
labels |
A character vector or factor of known labels for all samples in Alternatively, if |
method |
String specifying whether annotation should be performed on single cells in |
clusters |
A character vector or factor of cluster identities for each cell in |
genes, sd.thresh, de.method, de.n, de.args |
Arguments controlling the choice of marker genes used for annotation, see |
aggr.ref, aggr.args |
Arguments controlling the aggregation of the references prior to annotation, see |
recompute |
Logical scalar indicating whether to set up indices for later recomputation of scores,
when |
quantile, fine.tune, tune.thresh, prune |
Further arguments to pass to |
assay.type.test |
An integer scalar or string specifying the assay of |
assay.type.ref |
An integer scalar or string specifying the assay of |
check.missing |
Logical scalar indicating whether rows should be checked for missing values (and if found, removed). |
BNPARAM |
A BiocNeighborParam object specifying the algorithm to use for building nearest neighbor indices. |
BPPARAM |
A BiocParallelParam object specifying how parallelization should be performed, if any. |
If method="single", this function is effectively just a convenient wrapper around trainSingleR and classifySingleR.
If method="cluster", per-cell profiles are summed to obtain per-cluster profiles and annotation is performed on these clusters.
The function will automatically restrict the analysis to the intersection of the genes available in both ref and test.
If this intersection is empty (e.g., because the two datasets use different annotation in their row names), an error will be raised.
ref can contain both single-cell or bulk data, but in the case of the former, read the Note in ?trainSingleR.
A DataFrame is returned containing the annotation statistics for each cell or cluster (row).
This is identical to the output of classifySingleR.
Aaron Lun, based on code by Dvir Aran.
Aran D, Looney AP, Liu L et al. (2019). Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunology 20, 163–172.
##############################
## Mocking up training data ##
##############################
Ngroups <- 5
Ngenes <- 1000
means <- matrix(rnorm(Ngenes*Ngroups), nrow=Ngenes)
means[1:900,] <- 0
colnames(means) <- LETTERS[1:5]
g <- rep(LETTERS[1:5], each=4)
ref <- SummarizedExperiment(
list(counts=matrix(rpois(1000*length(g),
lambda=10*2^means[,g]), ncol=length(g))),
colData=DataFrame(label=g)
)
rownames(ref) <- sprintf("GENE_%s", seq_len(nrow(ref)))
ref <- scater::logNormCounts(ref)
trained <- trainSingleR(ref, ref$label)
###############################
## Mocking up some test data ##
###############################
N <- 100
g <- sample(LETTERS[1:5], N, replace=TRUE)
test <- SummarizedExperiment(
list(counts=matrix(rpois(1000*N, lambda=2^means[,g]), ncol=N)),
colData=DataFrame(cluster=g)
)
rownames(test) <- sprintf("GENE_%s", seq_len(nrow(test)))
test <- scater::logNormCounts(test)
###############################
## Performing classification ##
###############################
pred <- SingleR(test, ref, labels=ref$label)
table(predicted=pred$labels, truth=g)
pred2 <- SingleR(test, ref, labels=ref$label,
method="cluster", clusters=test$cluster)
table(predicted=pred2$labels, truth=rownames(pred2))