ggm.estimate.pcor {GeneTS} | R Documentation |
ggm.estimate.pcor
implements several point estimators of partial correlation
that remain usable for small-sample data sets. Their statistical
properties are investigated in detail in Schaefer and Strimmer (2003).
ggm.estimate.pcor(x, method = c("observed.pcor", "partial.bagged.cor", "bagged.pcor"), R = 1000, ...)
x |
data matrix (each row corresponds to one multivariate observation) |
method |
method used to estimate the partial correlation matrix. Available options are "observed.pcor" (default), "partial.bagged.cor", and "bagged.pcor". |
R |
number of bootstrap replicates (bagged estimators only) |
... |
options passed to partial.cor, bagged.cor,
and bagged.pcor. |
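To make the roles of `method` and `R` more concrete, here is a minimal base-R sketch of what the three estimators might compute. It is not the GeneTS implementation. It assumes that the observed partial correlation is obtained from the pseudoinverse of the sample correlation matrix, that the partial bagged correlation applies the same transformation to a bootstrap-averaged correlation matrix, and that the bagged partial correlation averages the partial correlations of the bootstrap resamples. The names `pinv`, `cor2pcor.sketch`, `bag`, and `estimate.pcor.sketch` are hypothetical.

```r
# Moore-Penrose pseudoinverse via SVD (base R only), so that an "inverse"
# exists even when there are fewer observations than variables.
pinv <- function(m, tol = sqrt(.Machine$double.eps)) {
  s <- svd(m)
  keep <- s$d > tol * max(s$d)
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Partial correlations from a correlation matrix r:
# pcor_ij = -P_ij / sqrt(P_ii * P_jj), where P is the (pseudo)inverse of r.
cor2pcor.sketch <- function(r) {
  p <- -cov2cor(pinv(r))
  p <- (p + t(p)) / 2      # remove numerical asymmetry
  diag(p) <- 1
  p
}

# Bagging: average a statistic over R bootstrap resamples of the rows of x.
bag <- function(x, stat, R) {
  n <- nrow(x)
  total <- 0
  for (b in seq_len(R)) {
    total <- total + stat(x[sample.int(n, n, replace = TRUE), , drop = FALSE])
  }
  total / R
}

# Assumed definitions of the three estimators (illustration only).
estimate.pcor.sketch <- function(x,
    method = c("observed.pcor", "partial.bagged.cor", "bagged.pcor"),
    R = 1000) {
  method <- match.arg(method)
  switch(method,
    observed.pcor      = cor2pcor.sketch(cor(x)),
    partial.bagged.cor = cor2pcor.sketch(bag(x, cor, R)),
    bagged.pcor        = bag(x, function(z) cor2pcor.sketch(cor(z)), R))
}

# Usage on random data: 10 observations of 5 variables.
set.seed(1)
x <- matrix(rnorm(10 * 5), nrow = 10)
p.obs <- estimate.pcor.sketch(x)
p.pbc <- estimate.pcor.sketch(x, "partial.bagged.cor", R = 50)
p.bpc <- estimate.pcor.sketch(x, "bagged.pcor", R = 50)
```

In this sketch `R` only affects the two bagged variants, which is why they cost roughly `R` times as much as the observed estimator.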
The result can be summarized as follows (with N being the sample size, and G being the number of variables):
observed.pcor: Observed partial correlation (Pi-1). Should be used preferentially for N >> G. In this region the other two estimators perform equally well but are slower due to bagging.
partial.bagged.cor: Partial bagged correlation (Pi-2). Best used for small sample applications with N < G. Here the advantages of Pi-2 are its small variance, its high accuracy as a point estimate, and its overall best power and positive predictive value (PPV). In addition it is computationally less expensive than Pi-3.
bagged.pcor: Bagged partial correlation (Pi-3). May be used in the critical zone (N = G) and for sample sizes N slightly larger than the number of variables G.
Overall, these results favor the partial bagged correlation Pi-2 as the estimator of choice for inferring GGM networks from small-sample (gene expression) data.
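The guidance above can be turned into a small helper that picks a `method` string from N and G. This is only a sketch: the function name `choose.pcor.method` is hypothetical, and the `ratio` cut-off standing in for "N >> G" is an assumption, since no numeric threshold is given here.

```r
# Hypothetical helper: choose an estimator name from the sample size N and
# the number of variables G, following the regions described above.
# 'ratio' is an assumed cut-off for "N >> G"; adjust it to your own needs.
choose.pcor.method <- function(N, G, ratio = 2) {
  if (N < G) {
    "partial.bagged.cor"   # small-sample region, N < G
  } else if (N >= ratio * G) {
    "observed.pcor"        # N much larger than G, no bagging needed
  } else {
    "bagged.pcor"          # critical zone N = G and N slightly above G
  }
}

# Usage: 20 observations of 40 variables falls in the small-sample region.
chosen <- choose.pcor.method(20, 40)
```

The returned string can be passed as the `method` argument of ggm.estimate.pcor.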
An estimated partial correlation matrix.
Juliane Schaefer (http://www.stat.uni-muenchen.de/~schaefer/) and Korbinian Strimmer (http://www.stat.uni-muenchen.de/~strimmer/).
Schaefer, J., and Strimmer, K. (2003). A practical approach to inferring large graphical models from sparse microarray data. Submitted to Bioinformatics [preprint available online].
ggm.simulate.data, ggm.estimate.pcor.
# load GeneTS library
library(GeneTS)

# generate random network with 40 nodes
# it contains 780=40*39/2 edges of which 5 percent (=39) are non-zero
true.pcor <- ggm.simulate.pcor(40)

# simulate data set with 40 observations
m.sim <- ggm.simulate.data(40, true.pcor)

# simple estimate of partial correlations
estimated.pcor <- partial.cor(m.sim)

# comparison of estimated and true model
sum((true.pcor-estimated.pcor)^2)

# a slightly better estimate ...
estimated.pcor.2 <- ggm.estimate.pcor(m.sim, method = "bagged.pcor")
sum((true.pcor-estimated.pcor.2)^2)