buildBackgroundModel {dagLogo}    R Documentation

Build background models for DAU tests

Description

A method for building background models used to test differential amino acid usage (DAU).

Usage

buildBackgroundModel(dagPeptides, background = c("wholeProteome",
  "inputSet", "nonInputSet"), model = c("any", "anchored"),
  targetPosition = c("any", "Nterminus", "Cterminus"),
  uniqueSeq = FALSE, numSubsamples = 300L, rand.seed = 1,
  replacement = FALSE, testType = c("ztest", "fisher"), proteome)

Arguments

dagPeptides

An object of dagPeptides-class containing the peptide sequences that form the input set.

background

A character vector indicating which set of peptide sequences is used to generate the background model. Available options are "wholeProteome", "inputSet" and "nonInputSet".

model

A character vector indicating whether an anchoring position is applied when generating the background model. Available options are "any" and "anchored".

targetPosition

A character vector indicating which part of the chosen protein sequences is used to generate the background model. Available options are "any", "Nterminus" and "Cterminus".

uniqueSeq

A logical vector indicating whether only unique peptide sequences are included in a background model for sampling.

numSubsamples

An integer, the number of random subsamples to draw.

rand.seed

An integer, the seed used for random sampling.

replacement

A logical vector of length 1, indicating whether replacement is allowed for random sampling.

testType

A character vector of length 1. Available options are "ztest" and "fisher".

proteome

An object of class Proteome, the output of prepareProteome.

Details

The background can be generated from the whole proteome ("wholeProteome"), the input set ("inputSet") or the non-input set ("nonInputSet"), each with or without an anchor:

whole proteome: randomly select subsequences from the whole proteome, each subsequence having the same width as the input sequences.

anchored whole proteome: randomly select subsequences from the whole proteome, each subsequence having the same width as the input sequences and carrying the anchor amino acid (e.g., K, as specified by the user) at its middle position.

input set: same as whole proteome, but using only the protein sequences of the input IDs and excluding the sites specified in the input sequences.

anchored input set: same as anchored whole proteome, but using only the protein sequences of the input IDs and excluding the sites specified in the input sequences.

non-input set: the whole proteome minus the input set.

anchored non-input set: the whole proteome minus the input set, with the anchor amino acid required at the middle position of each subsequence.
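The difference between unanchored and anchored sampling can be illustrated with a small base-R sketch. This is not dagLogo's implementation: the toy sequences and the helper sample_windows() are hypothetical and exist only to show what "same width as the input sequences" and "anchor amino acid at the middle position" mean in practice.

```r
# Toy "proteome": hypothetical sequences, not taken from dagLogo.
toy_proteome <- c(P1 = "MKTAYIAKQRQISFVKSHFSRQ",
                  P2 = "MSKGEELFTGVVPILVELDGDVNGHKFSVSG")

# Hypothetical helper: draw n fixed-width windows from the sequences.
# If `anchor` is given, keep only windows whose middle residue equals it.
sample_windows <- function(seqs, width, n, anchor = NULL, seed = 1) {
  set.seed(seed)
  center <- ceiling(width / 2)
  # Drop sequences that are too short to hold a full window.
  seqs <- seqs[nchar(seqs) >= width]
  # Enumerate every window of the requested width in every sequence.
  windows <- unlist(lapply(seqs, function(s) {
    starts <- seq_len(nchar(s) - width + 1)
    substring(s, starts, starts + width - 1)
  }), use.names = FALSE)
  if (!is.null(anchor)) {
    windows <- windows[substr(windows, center, center) == anchor]
  }
  stopifnot(length(windows) > 0)
  # Sampling with replacement keeps the toy example small.
  sample(windows, size = n, replace = TRUE)
}

bg_any      <- sample_windows(toy_proteome, width = 5, n = 20)
bg_anchored <- sample_windows(toy_proteome, width = 5, n = 20, anchor = "K")
```

In this sketch every element of bg_anchored has K as its third residue, while bg_any may contain any window of width 5.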

Value

An object of dagBackground-class.

Author(s)

Jianhong Ou, Haibo Liu

Examples

## load the example input shipped with dagLogo
dat <- unlist(read.delim(system.file("extdata", "grB.txt",
                                     package = "dagLogo"),
                         header = FALSE, as.is = TRUE))

## prepare an object of Proteome class from a FASTA file
proteome <- prepareProteome(fasta = system.file("extdata",
                                                "HUMAN.fasta",
                                                package = "dagLogo"),
                            species = "Homo sapiens")

## prepare an object of dagPeptides class
seq <- formatSequence(seq = dat, proteome = proteome, upstreamOffset = 14,
                      downstreamOffset = 15)

## build a whole-proteome background model for each test type
bg_fisher <- buildBackgroundModel(seq, background = "wholeProteome",
                                  proteome = proteome, testType = "fisher")
bg_ztest <- buildBackgroundModel(seq, background = "wholeProteome",
                                 proteome = proteome, testType = "ztest")

[Package dagLogo version 1.22.0 Index]