gscores {GenomicScores}    R Documentation

Accessing genomic scores

Description

Functions to access genomic scores through GScores objects.

Usage

availableGScores()
getGScores(x)
## S4 method for signature 'GScores,GenomicRanges'
gscores(x, ranges, ...)
## S4 method for signature 'GScores,character'
gscores(x, ranges, ...)
## S4 method for signature 'MafDb,GenomicRanges'
gscores(x, ranges, ...)
## S4 method for signature 'GScores'
score(x, ..., simplify=TRUE)
## S4 method for signature 'MafDb'
score(x, ..., simplify=TRUE)

Arguments

x

For getGScores(), a character vector of length 1 specifying the genomic scores resource to fetch. For gscores() and score(), a GScores object.

ranges

A GenomicRanges object with the positions from which to retrieve genomic scores, or a character string vector with identifiers assigned by the data producer to the genomic scores, e.g., dbSNP 'rs' identifiers.

...

In the call to the gscores() method one can additionally set the following arguments:

  • pop: Character string vector specifying the scores populations to query, when there is more than one. Use populations() to find out the available scores populations.

  • type: Character string specifying the type of genomic position being sought, which can be a single nucleotide range (snr), by default, or a nonsnr spanning multiple nucleotides. The latter is the case of indel variants in minor allele frequency data.

  • summaryFun: Function to summarize genomic scores when more than one position is retrieved. By default, this is set to the arithmetic mean, i.e., the mean() function.

  • quantized: Flag setting whether the genomic scores should be returned quantized (TRUE) or dequantized (FALSE, default).

  • ref: Vector of reference alleles in the form of either a character vector, a DNAStringSet object or a DNAStringSetList object. This argument is used only when there are multiple scores per position.

  • alt: Vector of alternative alleles in the form of either a character vector, a DNAStringSet object or a DNAStringSetList object. This argument is used only when there are multiple scores per position.

  • minoverlap: Integer value passed internally to the function findOverlaps() from the IRanges package, when querying genomic positions associated with multiple-nucleotide ranges (nonSNRs). By default, minoverlap=1L, which assumes that the sought nonSNRs are stored as in VCF files, using the nucleotide composition of the reference sequence. This argument is only relevant for genomic scores associated with nonSNRs.

  • caching: Flag setting whether genomic scores per chromosome should be kept cached in memory (TRUE, default) or not (FALSE). The latter option minimizes the memory footprint but slows down performance when the gscores() method is called multiple times.

simplify

Flag setting whether the result should be simplified to a vector (TRUE, default) if possible. This happens when scores from a single population are queried.

Details

The function availableGScores() shows genomic score sets available as AnnotationHub online resources.

The method gscores() takes as first argument a GScores-class object that can be loaded from an annotation package or from an AnnotationHub resource. These two possibilities are illustrated in the examples below.

Value

The function availableGScores() returns a character vector with the names of the AnnotationHub resources corresponding to different available sets of genomic scores. The function getGScores() returns a GScores object. The method gscores() returns a GRanges object with the genomic scores in a metadata column called score. The method score() returns a numeric vector with the genomic scores.

Author(s)

R. Castelo

See Also

phastCons100way.UCSC.hg19, MafDb.1Kgenomes.phase1.hs37d5

Examples

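## load the packages that provide gscores() and GRanges()
library(GenomicScores)
library(GenomicRanges)

## one genomic range of width 5
gr1 <- GRanges(seqnames="chr7", IRanges(start=117232380, width=5))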
gr1

## five genomic ranges of width 1
gr2 <- GRanges(seqnames="chr7", IRanges(start=117232380:117232384, width=1))
gr2

## accessing genomic scores from an annotation package
if (require(phastCons100way.UCSC.hg19)) {
  library(GenomicRanges)

  gsco <- phastCons100way.UCSC.hg19
  gsco
  gscores(gsco, gr1)
  score(gsco, gr1)
  gscores(gsco, gr2)
  populations(gsco)
  gscores(gsco, gr2, pop="DP2")
}
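
## a minimal sketch of some of the optional arguments listed under '...',
## assuming the phastCons100way.UCSC.hg19 annotation package is installed;
## the object names rng5, rng1, gs_max, gs_q and gs_nc are only illustrative
if (require(phastCons100way.UCSC.hg19)) {
  library(GenomicRanges)

  gsco <- phastCons100way.UCSC.hg19

  ## same ranges as gr1 and gr2, repeated so that this block runs on its own
  rng5 <- GRanges(seqnames="chr7", IRanges(start=117232380, width=5))
  rng1 <- GRanges(seqnames="chr7", IRanges(start=117232380:117232384, width=1))

  ## summarize the five positions with the maximum instead of the mean
  gs_max <- gscores(gsco, rng5, summaryFun=max)
  gs_max

  ## return quantized instead of dequantized scores
  gs_q <- gscores(gsco, rng1, quantized=TRUE)
  gs_q

  ## do not keep the per-chromosome scores cached in memory
  gs_nc <- gscores(gsco, rng5, caching=FALSE)
  gs_nc
}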

if (require(MafDb.1Kgenomes.phase1.hs37d5)) {
  mafdb <- MafDb.1Kgenomes.phase1.hs37d5
  mafdb
  populations(mafdb)

  ## look up allele frequencies for SNP rs1129038, located at 15:28356859, a
  ## SNP associated with blue and brown eye colors as reported by Eiberg et al.,
  ## "Blue eye color in humans may be caused by a perfectly associated founder
  ## mutation in a regulatory element located within the HERC2 gene
  ## inhibiting OCA2 expression", Human Genetics, 123(2):177-87, 2008
  ## [http://www.ncbi.nlm.nih.gov/pubmed/18172690]
  gscores(mafdb, GRanges("15:28356859"), pop=populations(mafdb))
  gscores(mafdb, "rs1129038", pop=populations(mafdb))
}

## accessing genomic scores from AnnotationHub resources
## Not run: 
availableGScores()
gsco <- getGScores("phastCons100way.UCSC.hg19")
gscores(gsco, gr1)
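
## a minimal sketch, assuming a working internet connection: fetch the
## resource only when availableGScores() lists it; 'res' and 'avail' are
## illustrative names that are not part of the package
library(GenomicScores)
res <- "phastCons100way.UCSC.hg19"
avail <- availableGScores()
if (res %in% avail) {
  gsco <- getGScores(res)
  populations(gsco)
}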

## End(Not run)

[Package GenomicScores version 1.8.0 Index]