ggm.test.edges {GeneTS} | R Documentation |
ggm.list.edges returns a table with all correlations sorted according to magnitude, as well as the two node numbers associated with each edge.
ggm.test.edges in addition assigns statistical significance to the edges in a GGM network by computing p-values, q-values and posterior probabilities for each potential edge.
ggm.list.edges(r.mat)
ggm.test.edges(r.mat, df=7, plot.locfdr=1)
r.mat |
matrix of partial correlations |
df |
degrees of freedom for the spline fitting the density (only if fA.type="nonparametric") |
plot.locfdr |
controls plot option in locfdr |
A mixture model is fitted to the partial correlations using cor.fit.mixture. Subsequently, two-sided p-values to test non-zero correlation are computed for each edge using cor0.test. In addition, corresponding posterior probabilities are computed (also using cor.fit.mixture). Finally, to simplify multiple testing, q-values are computed via fdr.control, with the specified value of eta0 taken into account.
Theoretical details are explained in Schaefer and Strimmer (2005), along with a simulation study and an application to gene expression data.
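The p-value and q-value steps can be illustrated with a minimal base-R sketch. It does not call cor0.test or fdr.control, whose internals may differ. It assumes that under the null hypothesis t = r * sqrt((kappa - 1) / (1 - r^2)) follows a t distribution with kappa - 1 degrees of freedom, which is the standard result for an ordinary correlation from N samples when kappa = N - 1. Benjamini-Hochberg adjusted p-values from p.adjust stand in for q-values and, unlike the procedure described above, ignore eta0. The function name pval_sketch and all numeric values are made up for illustration.

```r
# Hypothetical helper, not part of GeneTS: two-sided p-value for H0: rho = 0.
# Assumption: t = r * sqrt((kappa - 1) / (1 - r^2)) is t-distributed
# with kappa - 1 degrees of freedom under the null.
pval_sketch <- function(r, kappa) {
  t.stat <- r * sqrt((kappa - 1) / (1 - r^2))
  2 * pt(-abs(t.stat), df = kappa - 1)
}

r     <- c(-0.7, 0.4, 0.2, 0.05)  # made-up partial correlations
kappa <- 20                        # made-up degree of freedom

pval <- pval_sketch(r, kappa)

# Benjamini-Hochberg adjusted p-values as a simple stand-in for q-values
# (this adjustment does not use an estimate of eta0)
qval <- p.adjust(pval, method = "BH")

print(data.frame(r, pval, qval))
```

Larger absolute correlations give smaller p-values, so sorting edges by absolute correlation also sorts them by significance under this null model.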
A sorted data frame with the following columns:
pcor |
correlation (from r.mat) |
node1 |
first node connected to edge |
node2 |
second node connected to edge |
pval |
p-value |
qval |
q-value |
prob |
probability that edge is nonzero (= 1 - local fdr) |
Each row in the data frame corresponds to one edge, and the rows are sorted according to the absolute strength of the correlation (from strongest to weakest).
ggm.list.edges returns only the first three columns.
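The layout of the first three columns can be reproduced with a short base-R sketch. This is not the GeneTS implementation: the function name list_edges_sketch is hypothetical, the toy matrix is made up, and the convention that node1 is the smaller index of each pair is an assumption.

```r
# Hypothetical helper, not part of GeneTS: list the edges of a symmetric
# (partial) correlation matrix, strongest absolute correlation first.
list_edges_sketch <- function(r.mat) {
  stopifnot(is.matrix(r.mat), nrow(r.mat) == ncol(r.mat))
  # one (row, col) index pair per potential edge, taken from the upper triangle
  idx <- which(upper.tri(r.mat), arr.ind = TRUE)
  edges <- data.frame(
    pcor  = r.mat[idx],
    node1 = idx[, "row"],
    node2 = idx[, "col"]
  )
  # sort by absolute correlation, strongest first
  edges <- edges[order(abs(edges$pcor), decreasing = TRUE), ]
  rownames(edges) <- NULL
  edges
}

# Toy 3-node matrix (made-up values, for illustration only)
toy <- matrix(c( 1.0,  0.2, -0.7,
                 0.2,  1.0,  0.4,
                -0.7,  0.4,  1.0), nrow = 3, byrow = TRUE)

edges <- list_edges_sketch(toy)
print(edges)
```

A matrix with p nodes yields p * (p - 1) / 2 rows, one per potential edge.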
Juliane Schaefer (http://www.stat.math.ethz.ch/~schaefer/) and Korbinian Strimmer (http://www.statistik.lmu.de/~strimmer/).
Schaefer, J., and Strimmer, K. (2005). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21:754-764.
cor.fit.mixture, cor0.test, fdr.control, ggm.estimate.pcor.
# load GeneTS library
library("GeneTS")

# ecoli data
data(ecoli)

# estimate partial correlation matrix
inferred.pcor <- ggm.estimate.pcor(ecoli)

# p-values, q-values and posterior probabilities for each edge
test.results <- ggm.test.edges(inferred.pcor)

# show best 20 edges (strongest correlation)
test.results[1:20,]

# how many are significant based on FDR cutoff Q=0.05 ?
# (seq_len avoids selecting row 1 when the count is zero)
num.significant.1 <- sum(test.results$qval <= 0.05)
test.results[seq_len(num.significant.1),]

# how many are significant based on "local fdr" cutoff (prob > 0.95) ?
num.significant.2 <- sum(test.results$prob > 0.95)
test.results[seq_len(num.significant.2),]

# parameters of the mixture distribution used to compute p-values etc.
# (named fit rather than c to avoid masking base::c)
fit <- cor.fit.mixture(sm2vec(inferred.pcor))
fit$eta0
fit$kappa