plgem.fit {plgem}        R Documentation

PLGEM Fitting and Evaluation

Description

Fits the Power Law Global Error Model (PLGEM) to an exprSet ‘data’ and optionally evaluates the goodness of fit. The fit uses the replicates of the condition ‘fit.condition’, partitions the range of expression values into ‘p’ intervals, and uses the ‘q’-th quantile of the expression value standard deviations.

Usage

plgem.fit(data, fit.condition, p = 10, q = 0.5, fittingEval = FALSE,
plot.file = FALSE, verbose = FALSE)

Arguments

data            an object of class ‘exprSet’ with a ‘conditionName’ covariate; see Details.
fit.condition   number; the condition used for PLGEM fitting, given as its position among the unique values of the ‘conditionName’ covariate.
p               number of intervals used to partition the expression value range.
q               number in [0,1]; the quantile of the standard deviations used for PLGEM fitting.
fittingEval     logical; if TRUE, the fit is evaluated by generating a diagnostic plot.
plot.file       logical; if TRUE, a PNG file is written to the current working directory.
verbose         logical; if TRUE, progress messages are printed while running.

Details

‘plgem.fit’ fits PLGEM on an expression set and optionally evaluates the goodness of fit. The Power Law Global Error Model describes the mathematical relationship between the standard deviation and the mean of expression values in a set of replicated microarray samples, according to a power law:

ln(modeledSpread) = PLGEMslope * ln(mean) + PLGEMintercept
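
On the original (non-logarithmic) scale, the same relationship reads modeledSpread = exp(PLGEMintercept) * mean^PLGEMslope. The short sketch below illustrates this equivalence. The helper ‘plgem_spread’ and the parameter values are hypothetical and serve only as an illustration; they are not part of the plgem package.

```r
# Hypothetical helper (not part of plgem): modeled spread on the original scale
plgem_spread <- function(mean_expr, slope, intercept) {
  exp(intercept) * mean_expr^slope
}

# Illustrative parameter values, chosen arbitrarily
slope <- 0.8
intercept <- 0.5
m <- c(10, 100, 1000)

s <- plgem_spread(m, slope, intercept)

# The log-log form of the model holds for these values
all.equal(log(s), slope * log(m) + intercept)
```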

The exprSet ‘data’ must have a phenoData slot with a covariate called ‘conditionName’. The values of this covariate must be sample labels, which must be identical for samples that are to be treated as replicates. The function returns the ‘SLOPE’ and ‘INTERCEPT’ of this power law. It also returns the Pearson correlation coefficient ‘DATA.PEARSON’ of the linear model fitted on the original data, and the adjusted R squared ‘ADJ.R2.MP’ of the linear model fitted on the modelling points.
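
The fitting idea can be sketched in base R without the plgem package. This is a minimal illustration under stated assumptions, not the package's implementation: the data are simulated, the intervals are assumed to hold equal numbers of genes ranked by mean, each modelling point is assumed to be the mean of the means and the ‘q’-th quantile of the standard deviations in its interval, and ‘DATA.PEARSON’ is assumed to be the correlation of the log-transformed values. The variable names only mimic the elements of the returned list.

```r
set.seed(1)

# Simulated replicates of one condition (assumption): 2000 genes, 4 replicates
n_genes <- 2000
n_reps <- 4
true_slope <- 0.8
true_intercept <- -1
gene_mean <- exp(runif(n_genes, 3, 9))
gene_sd <- exp(true_intercept) * gene_mean^true_slope
expr <- matrix(rnorm(n_genes * n_reps, mean = gene_mean, sd = gene_sd),
               nrow = n_genes)

# Per-gene mean and standard deviation across replicates
row_mean <- rowMeans(expr)
row_sd <- apply(expr, 1, sd)

p <- 10   # number of intervals
q <- 0.5  # quantile of the standard deviations

# Assumption: intervals contain equal numbers of genes, ranked by mean
bin <- cut(rank(row_mean, ties.method = "first"),
           breaks = seq(0, n_genes, length.out = p + 1), labels = FALSE)

# One modelling point per interval, on the log scale
mp_x <- as.numeric(log(tapply(row_mean, bin, mean)))
mp_y <- as.numeric(log(tapply(row_sd, bin, quantile, probs = q)))

# Linear model on the modelling points gives slope and intercept
fit <- lm(mp_y ~ mp_x)
SLOPE <- unname(coef(fit)[2])
INTERCEPT <- unname(coef(fit)[1])
ADJ.R2.MP <- summary(fit)$adj.r.squared

# Assumption: correlation of log mean and log SD over all genes
DATA.PEARSON <- cor(log(row_mean), log(row_sd))

print(c(SLOPE = SLOPE, INTERCEPT = INTERCEPT,
        DATA.PEARSON = DATA.PEARSON, ADJ.R2.MP = ADJ.R2.MP))
```

Because each interval is reduced to a single modelling point, ‘ADJ.R2.MP’ is computed on ‘p’ points only, whereas ‘DATA.PEARSON’ uses all genes.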

If ‘fittingEval’ is TRUE, a four-panel diagnostic plot is produced as a graphical check of the goodness of the PLGEM fit. The top-left panel shows the power law, characterized by ‘SLOPE’ and ‘INTERCEPT’. The top-right panel shows the distribution of model residuals. The bottom-left panel shows the contour plot of ranked residuals. The bottom-right panel shows the relationship between the distribution of observed residuals and the normal distribution. The fit is judged to be good mainly when the rank plot is horizontal and symmetric and the residuals are nearly normally distributed.
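
A comparable set of diagnostics can be sketched in base R. The example below is only an approximation of the plot described above: the data are simulated, the slope and intercept are hypothetical values rather than fitted ones, the residuals are assumed to be the differences between observed and modeled ln(spread), and a plain scatter plot stands in for the contour plot of ranked residuals.

```r
set.seed(2)

# Hypothetical fitted parameters and simulated per-gene summaries
slope <- 0.8
intercept <- -1
n_genes <- 2000
row_mean <- exp(runif(n_genes, 3, 9))
row_sd <- exp(intercept + slope * log(row_mean) + rnorm(n_genes, sd = 0.3))

# Assumption: residual = observed ln(spread) minus modeled ln(spread)
res <- log(row_sd) - (slope * log(row_mean) + intercept)

op <- par(mfrow = c(2, 2))

# Top-left: the power law on the log-log scale
plot(log(row_mean), log(row_sd), pch = ".", main = "Power law")
abline(a = intercept, b = slope, col = "red")

# Top-right: distribution of residuals
hist(res, breaks = 50, main = "Residuals")

# Bottom-left: residuals against the rank of the mean (scatter, not contour)
plot(rank(row_mean), res, pch = ".", main = "Ranked residuals")
abline(h = 0, col = "red")

# Bottom-right: residual quantiles against normal quantiles
qqnorm(res, pch = ".")
qqline(res, col = "red")

par(op)
```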

Value

‘plgem.fit’ returns a list of five numbers (see Details):

SLOPE           the slope of the fitted PLGEM.
INTERCEPT       the intercept of the fitted PLGEM.
DATA.PEARSON    the Pearson correlation coefficient of the linear model fitted on the original data.
ADJ.R2.MP       the adjusted R squared of PLGEM fitted on the modelling points.
FIT.CONDITION   the condition used for fitting PLGEM.

Author(s)

Mattia Pelizzola <mattia.pelizzola@unimib.it> and Norman Pavelka <norman.pavelka@unimib.it>

References

N. Pavelka et al., BMC Bioinformatics, 2004 Dec 17;5(1):203; http://www.genopolis.it

See Also

plgem.obsStn, plgem.resampledStn, plgem.deg, run.plgem

Examples

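library(plgem)
data(LPSeset)
# fit PLGEM on the first condition (assumed here) and draw the diagnostic plot
LPSfit <- plgem.fit(data = LPSeset, fit.condition = 1, fittingEval = TRUE)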

[Package plgem version 1.8.0]