assocTestAggregate {GENESIS} | R Documentation |
assocTestAggregate
performs aggregate association tests using the null model fit with fitNullModel
.
## S4 method for signature 'SeqVarIterator' assocTestAggregate(gdsobj, null.model, AF.max=1, weight.beta=c(1,1), weight.user=NULL, test=c("Burden", "SKAT", "SMMAT"), burden.test=c("Score", "Wald"), rho=0, pval.method=c("davies", "kuonen", "liu"), sparse=TRUE, imputed=FALSE, verbose=TRUE) ## S4 method for signature 'GenotypeIterator' assocTestAggregate(gdsobj, null.model, AF.max=1, weight.beta=c(1,1), weight.user=NULL, test=c("Burden", "SKAT", "SMMAT"), burden.test=c("Score", "Wald"), rho=0, pval.method=c("davies", "kuonen", "liu"), verbose=TRUE)
gdsobj |
An object of class |
null.model |
A null model object returned by |
AF.max |
A numeric value specifying the upper bound on the alternate allele frequency for variants to be included in the analysis. |
weight.beta |
A numeric vector of length two specifying the two parameters of the Beta distribution used to determine variant weights; weights are given by |
weight.user |
A character string specifying the name of a variable to be used as variant weights. This variable can be in either 1) the variantData slot of |
test |
A character string specifying the type of test to be performed. The possibilities are |
burden.test |
A character string specifying the type of Burden test to perform when |
rho |
A numeric value (or vector of numeric values) in |
pval.method |
A character string specifying which method to use
to calculate SKAT p-values. |
sparse |
Logical indicator of whether to read genotypes as sparse Matrix objects; the default is |
imputed |
Logical indicator of whether to read dosages from the "DS" field containing imputed dosages instead of counting the number of alternate alleles. |
verbose |
Logical indicator of whether updates from the function should be printed to the console; the default is |
The type of aggregate unit tested depends on the class of iterator
used for gdsobj
. Options include sliding windows, specific
ranges of variants or selection of individual variants (ranges with
width 1). See SeqVarIterator
for more details.
Monomorphic variants (including variants where every sample is a heterozygote) are always omitted from the aggregate unit prior to testing.
The effect size estimate is for each copy of the alternate allele. For multiallelic variants, each alternate allele is tested separately.
Somewhat similarly to SKAT-O, the variant Set Mixed Model Association Test (SMMAT, Chen et al., manuscript in preparation) combines the burden test p-value with an adjusted SKAT (which is asymptotically independent of the burden test) p-value using a chi-square distribution with 4df from Fisher's method.
A list with the following items:
results |
A data.frame containing the results from the main analysis. Each row is a separate aggregate test: |
If gdsobj
is a SeqVarWindowIterator
:
chr |
The chromosome value |
start |
The start position of the window |
end |
The end position of the window |
Always:
n.site |
The number of variant sites included in the test. |
n.alt |
The number of alternate alleles included in the test. |
n.sample.alt |
The number of samples with an observed alternate allele at any variant in the aggregate set. |
If test
is "Burden"
:
If burden.test
is "Score":
Score |
The value of the score function |
Score.SE |
The estimated standard error of the Score |
Score.Stat |
The score Z test statistic |
Score.pval |
The score p-value |
If burden.test
is "Wald"
:
Est |
The effect size estimate for a one unit increase in the burden value |
Est.SE |
The estimated standard error of the effect size estimate |
Wald.Stat |
The Wald Z test statistic |
Wald.pval |
The Wald p-value |
If test
is "SKAT"
:
Q_rho |
The SKAT test statistic for the value of rho specified. There will be as many of these variables as there are rho values chosen. |
pval_rho |
The SKAT p-value for the value of rho specified. There will be as many of these variables as there are rho values chosen. |
err_rho |
Takes value 1 if there was an error in calculating the p-value for the value of rho specified when using the "kuonen" or "davies" methods; 0 otherwise. When there is an error, the p-value returned is from the "liu" method. There will be as many of these variables as there are rho values chosen. |
When length(rho) > 1
and SKAT-O is performed:
min.pval |
The minimum p-value among the p-values calculated for each choice of rho. |
opt.rho |
The optimal rho value; i.e. the rho value that gave the minimum p-value. |
pval_SKATO |
The SKAT-O p-value after adjustment for searching across multiple rho values. |
If test
is "SMMAT"
:
pval_burden |
The burden test p-value |
pval_SMMAT |
The SMMAT p-value |
err |
Takes value 1 if there was an error calculating the SMMAT p-value; 0 otherwise. If |
variantInfo |
A list with as many elements as aggregate tests performed. Each element of the list is a data.frame providing information on the variants used in the aggregate test with results presented in the corresponding row of |
variant.id |
The variant ID |
chr |
The chromosome value |
pos |
The base pair position |
n.obs |
The number of samples with non-missing genotypes |
freq |
The estimated alternate allele frequency |
weight |
The weight assigned to the variant in the analysis. |
Matthew P. Conomos, Stephanie M. Gogarten, Tamar Sofer, Ken Rice, Chaoyu Yu, Han Chen
Leal, S.M. & Li, B. (2008). Methods for Detecting Associations with Rare Variants for Common Diseases: Application to Analysis of Sequence Data. American Journal of Human Genetics, 83(3): 311-321.
Browning, S.R. & Madsen, B.E. (2009). A Groupwise Association Test for Rare Mutations Using a Weighted Sum Statistic. PLoS Genetics, 5(2): e1000384.
Wu, M.C, Lee, S., Cai, T., Li, Y., Boehnke, M., & Lin, X. (2011). Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test. American Journal of Human Genetics, 89(1): 82-93.
Lee, S. et al. (2012). Optimal Unified Approach for Rare-Variant Association Testing with Application to Small-Sample Case-Control Whole-Exome Sequencing Studies. American Journal of Human Genetics, 91(2): 224-237.
library(SeqVarTools) library(Biobase) library(GenomicRanges) # open a sequencing GDS file gdsfile <- seqExampleFileName("gds") gds <- seqOpen(gdsfile) # simulate some phenotype data data(pedigree) pedigree <- pedigree[match(seqGetData(gds, "sample.id"), pedigree$sample.id),] pedigree$outcome <- rnorm(nrow(pedigree)) # construct a SeqVarData object seqData <- SeqVarData(gds, sampleData=AnnotatedDataFrame(pedigree)) # fit the null model nullmod <- fitNullModel(seqData, outcome="outcome", covars="sex") # burden test - Range Iterator gr <- GRanges(seqnames=rep(1,3), ranges=IRanges(start=c(1e6, 2e6, 3e6), width=1e6)) iterator <- SeqVarRangeIterator(seqData, variantRanges=gr) assoc <- assocTestAggregate(iterator, nullmod, test="Burden") assoc$results lapply(assoc$variantInfo, head) # SKAT test - Window Iterator seqSetFilterChrom(seqData, include="22") iterator <- SeqVarWindowIterator(seqData) assoc <- assocTestAggregate(iterator, nullmod, test="SKAT") head(assoc$results) head(assoc$variantInfo) # SKAT-O test - List Iterator seqResetFilter(iterator) gr <- GRangesList( GRanges(seqnames=rep(22,2), ranges=IRanges(start=c(16e6, 17e6), width=1e6)), GRanges(seqnames=rep(22,2), ranges=IRanges(start=c(18e6, 20e6), width=1e6))) iterator <- SeqVarListIterator(seqData, variantRanges=gr) assoc <- assocTestAggregate(iterator, nullmod, test="SKAT", rho=seq(0, 1, 0.25)) assoc$results assoc$variantInfo # user-specified weights - option 1 seqResetFilter(iterator) variant.id <- seqGetData(gds, "variant.id") weights <- data.frame(variant.id, weight=runif(length(variant.id))) variantData(seqData) <- AnnotatedDataFrame(weights) iterator <- SeqVarListIterator(seqData, variantRanges=gr) assoc <- assocTestAggregate(iterator, nullmod, test="Burden", weight.user="weight") assoc$results assoc$variantInfo # user-specified weights - option 2 seqResetFilter(iterator) variantData(seqData)$weight <- NULL gr <- GRangesList( GRanges(seqnames=rep(22,2), ranges=IRanges(start=c(16e6, 17e6), width=1e6), weight=runif(2)), GRanges(seqnames=rep(22,2), ranges=IRanges(start=c(18e6, 20e6), width=1e6), weight=runif(2))) iterator <- SeqVarListIterator(seqData, variantRanges=gr) assoc <- assocTestAggregate(iterator, nullmod, test="Burden", weight.user="weight") assoc$results assoc$variantInfo seqClose(seqData)