calculateMotifEnrichment {transite}    R Documentation

Binding Site Enrichment Value Calculation

Description

Calculates binding site enrichment or depletion scores for a predefined foreground sequence set relative to a background sequence set. Significance levels of the enrichment values are obtained by Monte Carlo tests.

Usage

calculateMotifEnrichment(foreground.scores.df, background.scores.df,
  background.total.sites, background.absolute.hits,
  n.transcripts.foreground, max.fg.permutations = 1e+06,
  min.fg.permutations = 1000, e = 5, p.adjust.method = "BH")

Arguments

foreground.scores.df

result of scoreTranscripts on the foreground sequence set (the foreground sequence set must be a subset of the background sequence set)

background.scores.df

result of scoreTranscripts on background sequence set

background.total.sites

number of potential binding sites per sequence (returned by scoreTranscripts)

background.absolute.hits

number of putative binding sites per sequence (returned by scoreTranscripts)

n.transcripts.foreground

number of sequences in the foreground set

max.fg.permutations

maximum number of foreground permutations performed in Monte Carlo test for enrichment score

min.fg.permutations

minimum number of foreground permutations performed in Monte Carlo test for enrichment score

e

integer-valued stop criterion for the enrichment score Monte Carlo test: the permutation process is aborted after observing e random enrichment values that are more extreme than the actual enrichment value

p.adjust.method

method used to adjust the p-values from the Monte Carlo tests for multiple testing (to avoid alpha error accumulation), see p.adjust
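
The interplay of max.fg.permutations, min.fg.permutations and e can be illustrated with a minimal, self-contained sketch of a Monte Carlo test with early stopping. This is not transite's internal implementation. The function names, the enrichment statistic (ratio of hit densities in foreground and background), the one-sided definition of "more extreme" and the p-value estimator are all assumptions made for illustration, and the hit counts are made-up numbers.

```r
# Illustrative sketch only; NOT transite's internal code.
# Assumed statistic: hit density (hits / sites) in the foreground
# divided by hit density in the whole background.
enrichment_ratio <- function(hits, sites, idx) {
  fg_density <- sum(hits[idx]) / sum(sites[idx])
  bg_density <- sum(hits) / sum(sites)
  fg_density / bg_density
}

# Hypothetical helper: draws random foreground sets of the same size
# from the background and counts how often they are at least as
# extreme as the observed foreground.
monte_carlo_enrichment <- function(hits, sites, fg_idx,
                                   max_permutations = 1e6,
                                   min_permutations = 1000,
                                   e = 5) {
  observed <- enrichment_ratio(hits, sites, fg_idx)
  n_extreme <- 0
  n_done <- 0
  while (n_done < max_permutations) {
    # Random foreground: same size, sampled without replacement
    random_idx <- sample.int(length(hits), length(fg_idx))
    random_value <- enrichment_ratio(hits, sites, random_idx)
    # Assumed direction: test enrichment if observed >= 1, else depletion
    is_extreme <- if (observed >= 1) {
      random_value >= observed
    } else {
      random_value <= observed
    }
    n_extreme <- n_extreme + is_extreme
    n_done <- n_done + 1
    # Early stop: enough permutations and enough extreme values seen
    if (n_done >= min_permutations && n_extreme >= e) break
  }
  list(enrichment = observed,
       # Common Monte Carlo estimator (an assumption, see lead-in)
       p.value = (n_extreme + 1) / (n_done + 1),
       p.value.n = n_done)
}

# Made-up per-sequence counts for one motif; sequences 1 to 3 are
# the foreground.
set.seed(1)
hits <- c(5, 4, 6, 0, 1, 0, 1, 0, 0, 1, 0, 1)
sites <- rep(20, 12)
res <- monte_carlo_enrichment(hits, sites, fg_idx = 1:3,
                              max_permutations = 10000,
                              min_permutations = 1000, e = 5)
```

In this sketch, a motif without a strong signal collects e extreme values quickly and stops at min_permutations, whereas a strongly enriched motif keeps permuting for longer, up to max_permutations, which gives finer p-value resolution where it matters.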

Value

A data frame with the following columns:

motif.id       the motif identifier that is used in the original motif library
motif.rbps     the gene symbol of the RNA-binding protein(s)
enrichment     binding site enrichment between foreground and background sequences
p.value        unadjusted p-value from Monte Carlo test
p.value.n      number of Monte Carlo test permutations
adj.p.value    adjusted p-value from Monte Carlo test (usually FDR)
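
The relationship between p.value and adj.p.value can be reproduced with p.adjust from base R. The sketch below uses made-up p-values for five hypothetical motifs and the default method "BH".

```r
# Hypothetical unadjusted Monte Carlo p-values (made-up numbers)
p.value <- c(0.001, 0.02, 0.03, 0.2, 0.8)

# Benjamini-Hochberg adjustment, as with p.adjust.method = "BH"
adj.p.value <- p.adjust(p.value, method = "BH")
# approximately 0.005 0.05 0.05 0.25 0.8
```

Adjusted values are never smaller than the unadjusted ones, so a motif that passes a threshold on adj.p.value also passes it on p.value, but not the other way around.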

See Also

Other matrix functions: runMatrixSPMA, runMatrixTSMA, scoreTranscriptsSingleMotif, scoreTranscripts

Examples

# Toy foreground set; it must be a subset of the background set
foreground.seqs <- c("CAGUCAAGACUCC", "AAUUGGUGUCUGGAUACUUCCCUGUACAU",
                     "AGAU", "CCAGUAA")
# Background set: the foreground sequences plus additional sequences
background.seqs <- c(foreground.seqs, "CAACAGCCUUAAUU", "CUUUGGGGAAU",
                     "UCAUUUUAUUAAA", "AUCAAAUUA", "GACACUUAAAGAUCCU",
                     "UAGCAUUAACUUAAUG", "AUGGA", "GAAGAGUGCUCA",
                     "AUAGAC", "AGUUC")
# Score both sets (caching disabled)
foreground.scores <- scoreTranscripts(foreground.seqs, cache = FALSE)
background.scores <- scoreTranscripts(background.seqs, cache = FALSE)
# Compute enrichment values with a reduced permutation cap
enrichments.df <- calculateMotifEnrichment(foreground.scores$df,
  background.scores$df,
  background.scores$total.sites, background.scores$absolute.hits,
  length(foreground.seqs),
  max.fg.permutations = 1000
)
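
A possible follow-up step is to filter the returned data frame by adjusted p-value. The sketch below works on a made-up stand-in with a subset of the documented columns; the motif identifiers and numbers are placeholders, the 0.05 threshold is arbitrary, and reading enrichment values above 1 as enrichment and below 1 as depletion assumes that the score is a ratio.

```r
# Made-up stand-in for the returned data frame (placeholder values)
example.df <- data.frame(
  motif.id = c("motif_A", "motif_B", "motif_C"),
  enrichment = c(2.4, 0.6, 1.1),
  adj.p.value = c(0.01, 0.03, 0.70),
  stringsAsFactors = FALSE
)

# Keep motifs below an arbitrary threshold, most significant first
significant <- example.df[example.df$adj.p.value <= 0.05, ]
significant <- significant[order(significant$adj.p.value), ]

# Assumed interpretation: ratio > 1 is enrichment, < 1 is depletion
enriched <- significant[significant$enrichment > 1, ]
depleted <- significant[significant$enrichment < 1, ]
```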

[Package transite version 1.2.1 Index]