combineVar {scran}    R Documentation

Combine variance decompositions

Description

Combine the results of multiple variance decompositions, usually generated for the same genes across separate batches of cells.

Usage

combineVar(..., method="fisher", weighted=TRUE)

Arguments

...

Two or more DataFrames produced by decomposeVar.

method

String specifying how p-values are to be combined, see combinePValues for options.

weighted

Logical scalar indicating whether weights should be used for combining statistics.

Details

This function is designed to merge results from multiple calls to decomposeVar, usually computed for different batches of cells. Separate variance decompositions are necessary in cases where different concentrations of spike-in have been added to the cells in each batch. This affects the technical mean-variance relationship and precludes the use of a common trend fit.

The output mean is computed as a weighted average of the means in each input DataFrame, where each weight is the number of cells in the corresponding batch. This is equivalent to the sample mean across all cells in all batches. Similarly, a weighted average is computed for each variance component, where each weight is the residual degrees of freedom (d.f.) used for variance estimation in that batch. This yields a variance equivalent to the residual variance obtained while blocking on the batch of origin.
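The two equivalences described above can be checked with a minimal base R sketch. The simulated values and the names x1, x2, combined.mean and combined.var are hypothetical, and the residual d.f. is assumed to be the number of cells minus one (i.e., an intercept-only design within each batch); this illustrates the arithmetic only and is not the internal code of combineVar.

```r
set.seed(100)

# Hypothetical expression values for one gene in two batches of cells.
x1 <- rnorm(100, mean = 2)
x2 <- rnorm(300, mean = 3)

n.cells     <- c(length(x1), length(x2))
batch.means <- c(mean(x1), mean(x2))
batch.vars  <- c(var(x1), var(x2))
resid.df    <- n.cells - 1  # assumption: intercept-only design per batch

# Means are weighted by the number of cells, variances by residual d.f.
combined.mean <- weighted.mean(batch.means, w = n.cells)
combined.var  <- weighted.mean(batch.vars, w = resid.df)

# The combined mean matches the sample mean across all cells.
stopifnot(isTRUE(all.equal(combined.mean, mean(c(x1, x2)))))

# The combined variance matches the residual variance of a linear model
# that blocks on the batch of origin.
batch <- factor(rep(c("a", "b"), n.cells))
fit <- lm(c(x1, x2) ~ batch)
stopifnot(isTRUE(all.equal(combined.var, summary(fit)$sigma^2)))
```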

Weighting can be turned off with weighted=FALSE. This may be useful to ensure that all batches contribute equally to the calculation of the combined statistics, avoiding cases where batches with many cells dominate the output. However, this comes at the cost of precision: large batches contain more information and should contribute more to the weighted average.
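As a rough illustration of this trade-off, the following base R sketch compares a weighted and an unweighted average of two made-up variance estimates. The numbers and names are hypothetical and chosen only to show how a large batch can dominate the weighted result.

```r
# Hypothetical variance estimates for one gene in a small and a large batch.
batch.vars <- c(small = 4.0, large = 1.0)
resid.df   <- c(small = 19, large = 999)

# Analogous to weighted=TRUE: the large batch dominates.
weighted <- weighted.mean(batch.vars, w = resid.df)  # about 1.056

# Analogous to weighted=FALSE: both batches contribute equally.
unweighted <- mean(batch.vars)  # 2.5
```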

For the p-value calculations, the options for method are those accepted by combinePValues. Only method="z" will perform any weighting of batches, and only if weighted=TRUE; in that case, each batch is weighted according to the residual d.f. used for testing in that batch. In all other cases, all batches are assigned equal weight for p-value calculations.
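To make the effect of weighting on p-values concrete, here is a minimal sketch of the textbook weighted Stouffer Z-score method, with the residual d.f. as weights. The function stouffer and the input values are hypothetical; the sketch assumes one-sided p-values and is not guaranteed to match the exact implementation behind method="z" in combinePValues.

```r
# Textbook weighted Stouffer's method (hypothetical helper, not a scran function).
stouffer <- function(p, w = rep(1, length(p))) {
    z <- qnorm(p, lower.tail = FALSE)           # convert p-values to Z-scores
    combined.z <- sum(w * z) / sqrt(sum(w^2))   # weighted sum, rescaled to unit variance
    pnorm(combined.z, lower.tail = FALSE)       # convert back to a p-value
}

# Made-up p-values for one gene in two batches, with residual d.f. as weights.
p  <- c(0.01, 0.20)
df <- c(99, 299)

p.weighted   <- stouffer(p, w = df)  # the larger batch has more influence
p.unweighted <- stouffer(p)          # both batches count equally
```

In this made-up case the larger batch has the weaker evidence, so the weighted p-value is larger than the unweighted one.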

Value

A DataFrame with the same numeric fields as that produced by decomposeVar. Each field contains the (weighted) average across all batches, except for p.value, which contains the p-value combined according to method, and FDR, which contains the p-value adjusted using the Benjamini-Hochberg (BH) method.

Author(s)

Aaron Lun

See Also

decomposeVar, combinePValues

Examples

example(computeSpikeFactors) # Using the mocked-up data 'y' from this example.
y <- computeSumFactors(y) # Size factors for the endogenous genes.
y <- computeSpikeFactors(y, general.use=FALSE) # Size factors for spike-ins.

y1 <- y[,1:100] # Treat cells 1 to 100 as the first batch.
y1 <- normalize(y1) # Normalize separately after subsetting.
fit1 <- trendVar(y1)
results1 <- decomposeVar(y1, fit1)

y2 <- y[,1:100 + 100] # Treat cells 101 to 200 as the second batch.
y2 <- normalize(y2) # Normalize separately after subsetting.
fit2 <- trendVar(y2)
results2 <- decomposeVar(y2, fit2)

head(combineVar(results1, results2))
head(combineVar(results1, results2, method="simes"))
head(combineVar(results1, results2, method="berger"))

[Package scran version 1.12.0 Index]