dupcor.series {limma}    R Documentation

Correlation Between Duplicates

Description

Estimate the correlation between duplicate spots (replicate spots on the same array) from a series of arrays.

Usage

duplicateCorrelation(object,design=rep(1,ncol(M)),ndups=2,spacing=1,initial=0.8,trim=0.15,weights=NULL)
dupcor.series(M,design=rep(1,ncol(M)),ndups=2,spacing=1,initial=0.7,trim=0.15,weights=NULL)

Arguments

object a numeric matrix of log-ratios or an MAList object from which the log-ratios can be extracted. If object is an MAList then the arguments design, ndups, spacing and weights will be extracted from it if available and do not have to be specified as arguments.
M a numeric matrix. Usually the log-ratios of expression for a series of cDNA microarrays, with rows corresponding to genes and columns to arrays.
design the design matrix of the microarray experiment, with rows corresponding to arrays and columns to comparisons to be estimated. The number of rows must match the number of columns of M. Defaults to the unit vector, meaning that the arrays are treated as replicates.
ndups a positive integer giving the number of times each gene is printed on an array. nrow(M) must be divisible by ndups.
spacing the spacing between the rows of M corresponding to duplicate spots; spacing=1 for consecutive spots.
initial a numeric value between -1 and 1 giving an initial estimate for the correlation.
trim the fraction of observations to be trimmed from each end of atanh(cor.genes) when computing the trimmed mean.
weights an optional numeric matrix of the same dimension as M containing weights for each spot. If weights is smaller than M, it will be filled out to the same size.
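The row layout implied by ndups and spacing can be illustrated with a small base R sketch. The helper name unwrap_duplicates is hypothetical and the grouping rule is an assumption drawn from the argument descriptions above (duplicates of a gene sit spacing rows apart within a block of ndups*spacing rows); it is not taken from the package source.

```r
# Hypothetical helper: rearrange M so that each row holds one gene and the
# columns hold its duplicate spots, array by array.
# Assumption: duplicates of a gene are `spacing` rows apart within a block
# of ndups * spacing rows.
unwrap_duplicates <- function(M, ndups = 2, spacing = 1) {
  M <- as.matrix(M)
  narrays <- ncol(M)
  stopifnot(nrow(M) %% (ndups * spacing) == 0)
  ngroups <- nrow(M) / (ndups * spacing)
  # View the rows as (position within block, duplicate, block), then move
  # the duplicate index next to the array index.
  dim(M) <- c(spacing, ndups, ngroups, narrays)
  M <- aperm(M, c(1, 3, 2, 4))
  dim(M) <- c(spacing * ngroups, ndups * narrays)
  M
}

M <- matrix(1:8, nrow = 4, ncol = 2)   # 4 spots, 2 arrays

# spacing=1: rows 1,2 are one gene and rows 3,4 are another
consecutive <- unwrap_duplicates(M, ndups = 2, spacing = 1)

# spacing=2: rows 1,3 are one gene and rows 2,4 are another
interleaved <- unwrap_duplicates(M, ndups = 2, spacing = 2)
```

In both cases the result has nrow(M)/ndups rows, which matches the stated length of cor.genes.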

Details

This function estimates the between-duplicate correlation using REML individually for each gene. It also returns a robust average of the individual correlations which can be used as input for functions such as gls.series.

duplicateCorrelation is a more object-oriented version of dupcor.series but returns the same value.
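The robust average of the per-gene correlations can be sketched in base R as follows. The function name robust_average_cor and the example correlations are made up for illustration, and the sketch assumes that the transformed scale is Fisher's z (atanh), with tanh used to map the trimmed mean back to a correlation.

```r
# Hypothetical helper: trimmed mean of correlations on the atanh scale,
# back-transformed to the correlation scale with tanh.
robust_average_cor <- function(cor.genes, trim = 0.15) {
  z <- atanh(cor.genes)                      # Fisher's z-transform
  tanh(mean(z, trim = trim, na.rm = TRUE))   # trim each end, then back-transform
}

# Made-up per-gene correlations, including two outlying values
cor.genes <- c(0.62, 0.71, 0.80, 0.85, 0.90, -0.40, 0.99)
avg <- robust_average_cor(cor.genes, trim = 0.15)
```

With seven values and trim=0.15, one value is dropped from each end, so the two outlying correlations do not influence the average.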

Value

A list with components

cor the average estimated inter-duplicate correlation. The average is the trimmed mean, with trimming fraction trim, of the correlations for individual genes on the atanh-transformed scale.
cor.genes a numeric vector of length nrow(M)/ndups giving the individual gene correlations.

Note

This function may take a long time to execute because it makes a call to gls for each gene. Execution could be sped up greatly if it could be assumed that M contains no NAs.
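To give a sense of the per-gene work involved, the following is a minimal sketch of a single REML fit with gls on simulated data for one gene. It assumes a compound-symmetry correlation structure (corCompSymm) for spots on the same array and an intercept-only model, corresponding to the default design; neither assumption is stated on this page, and the simulated values are purely illustrative.

```r
library(nlme)

set.seed(1)
narrays <- 8
ndups <- 2

# Simulated log-ratios for one gene: a shared array effect makes the
# duplicate spots on the same array correlated.
array_effect <- rnorm(narrays, sd = 1)
gene <- data.frame(
  y = rep(array_effect, each = ndups) + rnorm(narrays * ndups, sd = 0.5),
  array = factor(rep(seq_len(narrays), each = ndups))
)

# REML fit with a common correlation between duplicates on the same array;
# value = 0.8 plays the role of an initial estimate.
fit <- gls(y ~ 1, data = gene,
           correlation = corCompSymm(value = 0.8, form = ~ 1 | array),
           method = "REML")

# Estimated between-duplicate correlation for this gene
rho <- coef(fit$modelStruct$corStruct, unconstrained = FALSE)
```

Repeating a fit like this for every gene is what makes the run time grow with the number of rows of M.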

Author(s)

Gordon Smyth

References

Smyth, G. K., Michaud, J., and Scott, H. (2003). The use of within-array duplicate spots for assessing differential expression in microarray experiments. http://www.statsci.org/smyth/pubs/dupcor.pdf

See Also

These functions use gls in the nlme package.

An overview of linear model functions in limma is given by 5.LinearModels.

Examples

#  See gls.series for an example
