affiXcanTrain {AffiXcan}    R Documentation

Train the model needed to impute a GReX for each gene

Description

Train the models needed to impute the genetically regulated expression (GReX) of each gene from total binding affinity (TBA) values

Usage

affiXcanTrain(exprMatrix, assay, tbaPaths, regionAssoc, cov, varExplained,
  scale, BPPARAM = bpparam())

Arguments

exprMatrix

A SummarizedExperiment object containing expression data

assay

A string with the name of the object in SummarizedExperiment::assays(exprMatrix) that contains expression values

tbaPaths

A vector of strings giving the paths to the MultiAssayExperiment RDS files that contain the TBA values

regionAssoc

A data.frame with the association between regulatory regions and expressed genes and with colnames = c("REGULATORY_REGION", "EXPRESSED_REGION")

cov

A data.frame with covariate values for the population structure, where the columns are the PCs and the rows are the individuals' IIDs

varExplained

An integer between 0 and 100; varExplained=80 means that the principal components selected to fit the models must explain at least 80 percent of the variance of the TBA values

scale

A logical; if scale=FALSE the TBA values will only be centered, not scaled, before performing PCA

BPPARAM

A BiocParallelParam object. Default is bpparam(). For details on the BiocParallelParam virtual base class see browseVignettes("BiocParallel")
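
The shapes expected for regionAssoc and cov, and the effect of varExplained and scale, can be sketched in base R. This is only an illustration under stated assumptions: the region, gene and individual names are hypothetical, the TBA matrix is random toy data, selectPCs() is a helper invented for this sketch, and prcomp() stands in for whatever PCA routine affiXcanTrain uses internally.

```r
set.seed(1)

# Hypothetical region-to-gene associations, using the column names regionAssoc requires
regionAssoc <- data.frame(
  REGULATORY_REGION = c("region1", "region2", "region3"),
  EXPRESSED_REGION  = c("geneA", "geneA", "geneB"),
  stringsAsFactors  = FALSE
)

# Hypothetical covariates: rows are individual IIDs, columns are population-structure PCs
iids <- paste0("IID", 1:20)
cov <- data.frame(PC1 = rnorm(20), PC2 = rnorm(20), row.names = iids)

# Toy TBA matrix (random values, for illustration only): 20 individuals x 6 features
tba <- matrix(rnorm(20 * 6), nrow = 20,
              dimnames = list(iids, paste0("tf", 1:6)))

# Illustrative helper (not part of AffiXcan): keep the smallest number of
# components whose cumulative explained variance reaches varExplained percent
selectPCs <- function(x, varExplained, scale) {
  pca <- prcomp(x, center = TRUE, scale. = scale)
  cumVar <- 100 * cumsum(pca$sdev^2) / sum(pca$sdev^2)
  k <- which(cumVar >= varExplained)[1]
  list(eigenvectors = pca$rotation[, seq_len(k), drop = FALSE],
       pcs          = pca$x[, seq_len(k), drop = FALSE],
       cumVar       = cumVar[k])
}

sel <- selectPCs(tba, varExplained = 80, scale = TRUE)
dim(sel$pcs)  # individuals x selected components
```

With scale = FALSE the same helper only centers the columns, so features with larger variance weigh more heavily in the selected components.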

Value

A list containing three objects: pca, bs, regionsCount

pca: A list containing lists named after the experiments (see MultiAssayExperiment::experiments()) found in the MultiAssayExperiment objects listed in the param tbaPaths. Each of these lists contains two objects:

eigenvectors: A matrix containing eigenvectors for those principal components selected according to the param varExplained

pcs: A matrix containing the values of the principal components selected according to the param varExplained

bs: A list containing lists named as the REGULATORY_REGIONS found in the param regionAssoc that have a corresponding column name in the experiments stored in the MultiAssayExperiment objects listed in the param tbaPaths. Each of the lists in bs contains four objects:

coefficients: The coefficients of the principal components used in the model, analogous to the "coefficients" element of the result of lm()

pval: The uncorrected ANOVA p-value of the model, retrieved from anova(model, modelReduced, test="F")$'Pr(>F)'[2]

r.sq: The coefficient of determination between the real total expression values and the imputed GReX, retrieved from summary(model)$r.squared

correctedP: The p-value after the Benjamini-Hochberg correction for multiple testing, retrieved using p.adjust(pvalues, method="BH")

regionsCount: An integer: the number of genomic regions taken into account during the training phase
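
The quantities stored in bs can be reproduced on toy data with base R. The sketch below is an assumption about the general shape of the computation, not the package's own code: the expression values, principal components and covariate are simulated, the variable names (expr, pcs, covs) are hypothetical, and the reduced model is assumed to contain only the covariate, so that the F test measures the contribution of the TBA principal components.

```r
set.seed(42)
n <- 50

# Simulated inputs (hypothetical): two TBA principal components and one covariate
pcs  <- matrix(rnorm(n * 2), nrow = n, dimnames = list(NULL, c("PC1", "PC2")))
covs <- rnorm(n)
expr <- 0.8 * pcs[, "PC1"] + 0.3 * covs + rnorm(n)

# Full model: PCs plus covariate; reduced model (assumed here): covariate only
model        <- lm(expr ~ pcs + covs)
modelReduced <- lm(expr ~ covs)

coefficients <- coef(model)
pval <- anova(model, modelReduced, test = "F")$'Pr(>F)'[2]
r.sq <- summary(model)$r.squared

# One p-value per gene; the Benjamini-Hochberg correction is applied across genes.
# The values for geneB and geneC are made up for illustration.
pvalues <- c(geneA = pval, geneB = 0.04, geneC = 0.60)
correctedP <- p.adjust(pvalues, method = "BH")
```

Because the correction depends on the whole vector of p-values, correctedP can only be computed once every gene's model has been fitted.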

Examples

if (interactive()) {
  trainingTbaPaths <- system.file("extdata", "training.tba.toydata.rds",
                                  package="AffiXcan")

  data(exprMatrix)
  data(regionAssoc)
  data(trainingCovariates)

  assay <- "values"

  training <- affiXcanTrain(exprMatrix=exprMatrix, assay=assay,
                            tbaPaths=trainingTbaPaths, regionAssoc=regionAssoc,
                            cov=trainingCovariates, varExplained=80, scale=TRUE)
}

[Package AffiXcan version 1.2.0 Index]