buildBackgroundModel {dagLogo} | R Documentation |
A method used to build background models for testing differential amino acid usage
buildBackgroundModel(
  dagPeptides,
  background = c("wholeProteome", "inputSet", "nonInputSet"),
  model = c("any", "anchored"),
  targetPosition = c("any", "Nterminus", "Cterminus"),
  uniqueSeq = FALSE,
  numSubsamples = 300L,
  rand.seed = 1,
  replacement = FALSE,
  testType = c("ztest", "fisher"),
  proteome
)
dagPeptides |
An object of dagPeptides class. |
background |
A character vector with options "wholeProteome", "inputSet", and "nonInputSet", indicating which set of peptide sequences should be used to generate the background model. |
model |
A character vector with options "any" and "anchored", indicating whether an anchoring position should be applied when generating the background model. |
targetPosition |
A character vector with options "any", "Nterminus", and "Cterminus", indicating which part of the chosen protein sequences should be used to generate the background model. |
uniqueSeq |
A logical vector indicating whether only unique peptide sequences are included in a background model for sampling. |
numSubsamples |
An integer, the number of random samplings to perform. |
rand.seed |
An integer, the seed used for random sampling. |
replacement |
A logical vector of length 1, indicating whether replacement is allowed for random sampling. |
testType |
A character vector of length 1. Available options are "ztest" and "fisher". |
proteome |
An object of Proteome class, output of prepareProteome. |
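The interplay of uniqueSeq, rand.seed, and replacement can be pictured with plain base R sampling. This is only a sketch: the peptide pool and the draw() helper are hypothetical, and it is an assumption that these arguments behave like unique(), set.seed(), and the replace argument of sample(); it does not show how dagLogo samples internally.

```r
# Illustration only: base R sampling, not dagLogo internals.
# `pool` and `draw()` are hypothetical names invented for this sketch.
pool <- c("MKTAY", "AKQRL", "GSKLV", "PLKMN", "TTKAA", "AKQRL")

draw <- function(pool, n, seed, replacement, uniqueSeq) {
  if (uniqueSeq) pool <- unique(pool)  # drop duplicate peptides before sampling
  set.seed(seed)                       # fixing the seed fixes the draw
  sample(pool, size = n, replace = replacement)
}

a <- draw(pool, n = 3, seed = 1, replacement = FALSE, uniqueSeq = TRUE)
b <- draw(pool, n = 3, seed = 1, replacement = FALSE, uniqueSeq = TRUE)

identical(a, b)        # TRUE: same seed, same draw
anyDuplicated(a) == 0  # TRUE: no repeats without replacement on a unique pool
```

Under these assumptions, changing the seed changes which peptides are drawn, while replacement = TRUE would allow the same peptide to appear more than once in a single draw.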
The background can be generated from the whole proteome ("wholeProteome"), the input set ("inputSet"), or the non-input set ("nonInputSet"), each with or without anchoring:

whole proteome: randomly select subsequences from the whole proteome, each with the same width as the input sequences.
anchored whole proteome: same as whole proteome, but the middle amino acid of each subsequence must be the anchor amino acid, e.g., K, which is specified by the user.
input set: same as whole proteome, but using only protein sequences from the input IDs and excluding the sites specified in the input sequences.
anchored input set: same as anchored whole proteome, but using only protein sequences from the input IDs and excluding the sites specified in the input sequences.
non-input set: the whole proteome minus the input set.
anchored non-input set: the whole proteome minus the input set, where the middle amino acid of each subsequence must be the anchor amino acid.
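The anchored case described above can be sketched in base R on a toy set of sequences. This is a minimal illustration under stated assumptions, not the package's implementation: sample_anchored(), proteins, width, and anchor are hypothetical names, the width is assumed to be odd so that there is a single middle residue, and candidate windows that would run past either end of a protein are simply dropped.

```r
# Illustration only: a toy version of "anchored" background sampling.
# `sample_anchored`, `proteins`, `width`, and `anchor` are hypothetical names.
sample_anchored <- function(proteins, width, anchor, n, seed = 1) {
  stopifnot(width %% 2 == 1)  # odd width, so one middle residue
  half <- (width - 1) %/% 2
  windows <- character(0)
  for (p in proteins) {
    residues <- strsplit(p, "")[[1]]
    centers <- which(residues == anchor)
    # keep only anchor positions with a full window on both sides
    centers <- centers[centers - half >= 1 & centers + half <= length(residues)]
    windows <- c(
      windows,
      vapply(centers, function(i) substr(p, i - half, i + half), character(1))
    )
  }
  set.seed(seed)
  # draw without replacement from all candidate windows
  sample(windows, size = min(n, length(windows)))
}

proteins <- c("MAAKTLLKGG", "KAAAGGKLLA")  # toy sequences, not real proteins
bg <- sample_anchored(proteins, width = 5, anchor = "K", n = 2)
```

Every element of bg is a 5-residue window with K in the third (middle) position. The unanchored case would follow the same pattern with every valid window start as a candidate, rather than only windows centered on the anchor.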
An object of dagBackground-class.
Jianhong Ou, Haibo Liu
dat <- unlist(read.delim(system.file("extdata", "grB.txt", package = "dagLogo"),
                         header = FALSE, as.is = TRUE))

## prepare an object of Proteome Class from a fasta file
proteome <- prepareProteome(fasta = system.file("extdata", "HUMAN.fasta",
                                                package = "dagLogo"),
                            species = "Homo sapiens")

## prepare an object of dagPeptides Class
seq <- formatSequence(seq = dat, proteome = proteome,
                      upstreamOffset = 14, downstreamOffset = 15)

bg_fisher <- buildBackgroundModel(seq, background = "wholeProteome",
                                  proteome = proteome, testType = "fisher")
bg_ztest <- buildBackgroundModel(seq, background = "wholeProteome",
                                 proteome = proteome, testType = "ztest")