cor.fit.mixture {GeneTS} | R Documentation |
cor.fit.mixture
fits a mixture model
f(r) = eta0 dcor0(r, kappa) + etaA fA(r), with etaA = 1 - eta0,
to a vector of empirical partial correlation coefficients using likelihood maximization.
This makes it possible to estimate both the degree of freedom kappa of the null distribution and the proportion eta0 of null r-values. The alternative distribution is assumed to be either the uniform distribution dunif(r, -1, 1) or an arbitrary nonparametric distribution that vanishes for values of r near the center r=0.
cor.fit.mixture also computes etaA fA(r)/f(r), i.e. the (empirical) posterior probability that the true correlation is non-zero given the empirical correlation r, the degree of freedom kappa of the null distribution, and the prior eta0 for the null distribution.
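The mixture model and the posterior probability described above can be sketched in base R for the uniform alternative. This is a minimal illustration, not the package's implementation: the names dcor0_sketch, mixture_density and neg_loglik, the box constraints, and the use of optim with method "L-BFGS-B" are assumptions made for this example. The null density is written out from the standard distribution of a correlation coefficient with kappa degrees of freedom so that the sketch does not need GeneTS.

```r
set.seed(1)

# Hypothetical stand-in for dcor0(): density of a correlation coefficient
# under the null (true correlation 0) with kappa degrees of freedom.
dcor0_sketch <- function(r, kappa) {
  exp(lgamma(kappa / 2) - lgamma((kappa - 1) / 2) - 0.5 * log(pi) +
        (kappa - 3) / 2 * log1p(-r^2))
}

# Mixture density f(r) = eta0 * f0(r) + etaA * fA(r) with a uniform
# alternative and etaA = 1 - eta0.
mixture_density <- function(r, kappa, eta0) {
  eta0 * dcor0_sketch(r, kappa) + (1 - eta0) * dunif(r, -1, 1)
}

# Negative log-likelihood as a function of par = c(kappa, eta0).
neg_loglik <- function(par, r) {
  -sum(log(mixture_density(r, par[1], par[2])))
}

# Simulated data: 700 null correlations (kappa = 10) plus 200 uniform ones.
# Under the null, (r + 1) / 2 follows a Beta((kappa - 1) / 2, (kappa - 1) / 2)
# distribution.
kappa_true <- 10
r_null <- 2 * rbeta(700, (kappa_true - 1) / 2, (kappa_true - 1) / 2) - 1
rc <- c(r_null, runif(200, min = -1, max = 1))

# Maximize the likelihood within box constraints (assumed bounds).
start <- c(5, 0.5)
fit <- optim(start, neg_loglik, r = rc, method = "L-BFGS-B",
             lower = c(2, 0.001), upper = c(5000, 0.999))
kappa_hat <- fit$par[1]
eta0_hat <- fit$par[2]

# Posterior probability that the true correlation is non-zero:
# etaA * fA(r) / f(r).
prob_nonzero <- (1 - eta0_hat) * dunif(rc, -1, 1) /
  mixture_density(rc, kappa_hat, eta0_hat)
```

Because 700 of the 900 simulated values come from the null, eta0_hat is expected to land near 7/9, although the exact value depends on the random draw. Values of r far from zero receive a posterior probability close to 1, since the null density is negligible there.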
cor.fit.mixture(r, MAXKAPPA=5000, fA.type=c("nonparametric", "uniform"), df=7, plot.locfdr=0)
r |
vector of sample correlations |
fA.type |
assumed type of alternative distribution |
MAXKAPPA |
upper bound for the estimated kappa (default: MAXKAPPA=5000) |
df |
degrees of freedom for the spline fitting the density (only if fA.type="nonparametric") |
plot.locfdr |
controls plot option in locfdr |
This function is useful for determining the null distribution of edges in a sparse graphical Gaussian model. See Schaefer and Strimmer (2005) for more details and an application to inferring genetic networks from microarray data.
For details on how to fit the empirical null distribution while at the same time non-parametrically estimating the alternative distribution, see Efron (2004) and the associated R package locfdr.
A list object with the following components:
kappa |
the degree of freedom of the null distribution (see dcor0) |
eta0 |
the prior for the null distribution, i.e. the proportion of null r-values |
logL |
the maximized log-likelihood (only if fA.type="uniform") |
prob.nonzero |
empirical posterior probability that the observed correlations are non-zero. |
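As an illustration of how the returned components might be used, the sketch below thresholds prob.nonzero to pick out correlations that are likely non-zero. The list fit is a hypothetical stand-in for a cor.fit.mixture result so that the example runs without GeneTS; all numbers in it are made up, and the cutoff of 0.8 is an arbitrary choice rather than a package default.

```r
# Hypothetical stand-in for a cor.fit.mixture() result; the component names
# follow the list above, the values are invented for illustration only.
fit <- list(
  kappa = 10,
  eta0 = 0.78,
  prob.nonzero = c(0.11, 0.12, 0.35, 0.91, 0.98)
)

# Posterior probability that a correlation belongs to the null component
# (the local false discovery rate) is the complement of prob.nonzero.
local_fdr <- 1 - fit$prob.nonzero

# Indices of correlations whose posterior probability of being non-zero
# exceeds the (arbitrary) cutoff.
cutoff <- 0.8
selected <- which(fit$prob.nonzero > cutoff)
selected
```

A stricter cutoff keeps fewer correlations but lowers the expected share of null values among them.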
Juliane Schaefer (http://www.statistik.lmu.de/~schaefer/) and Korbinian Strimmer (http://www.statistik.lmu.de/~strimmer/).
Efron, B. (2004). Large-scale simultaneous hypothesis testing: the choice of a null hypothesis. JASA 99:96-104.
Schaefer, J., and Strimmer, K. (2005). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21:754-764.
dcor0, cor0.estimate.kappa, kappa2n, fdr.estimate.eta0.
# load GeneTS library
library("GeneTS")

# simulate mixture distribution
r <- rcor0(700, kappa=10)
u <- runif(200, min=-1, max=1)
rc <- c(r,u)

# estimate kappa and eta0 (=7/9)
c1 <- cor.fit.mixture(r, fA.type="uniform")
c1$eta0
c1$kappa
c2 <- cor.fit.mixture(rc, fA.type="uniform")
c2$eta0
c2$kappa

# for comparison
cor0.estimate.kappa(r)
cor0.estimate.kappa(rc)