idMap {EnrichmentBrowser}    R Documentation

Mapping between gene ID types for the rownames of a SummarizedExperiment

Description

Functionality to map the rownames of a SummarizedExperiment between common gene ID types such as ENSEMBL and ENTREZ.

Usage

idMap(se, org = NA, from = "ENSEMBL", to = "ENTREZID",
  multi.to = "first", multi.from = "first")

idTypes(org)

Arguments

se

An object of class SummarizedExperiment. The rownames are expected to be of the gene ID type given in argument from.

org

Character. Organism in KEGG three letter code, e.g. ‘hsa’ for ‘Homo sapiens’. See references.

from

Character. Gene ID type to map from. Corresponds to the gene ID type of the rownames of argument se. Note that from is ignored if to is a rowData column of se. Defaults to ENSEMBL.

to

Character. Gene ID type to map to. Corresponds to the gene ID type with which the rownames of argument se should be updated. Note that this can also be the name of a column in the rowData slot of se, which specifies a user-defined mapping in which conflicts have been manually resolved. Defaults to ENTREZID.

multi.to

How to resolve 1:many mappings, i.e. multiple to.IDs for a single from.ID. This is passed on to the multiVals argument of mapIds and can thus take one of several pre-defined values or a user-defined function. Note, however, that a single to.ID must be returned for each from.ID. Default is "first", which returns the first to.ID mapped onto the respective from.ID.
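
The following base R sketch illustrates what resolving a 1:many mapping means. It is not the implementation used by idMap or mapIds; the mapping table and its IDs are made up for illustration, and the two resolution rules are applied by hand rather than through the multi.to argument.

```r
# Hypothetical 1:many mapping table (IDs are invented for illustration):
# "geneA" maps to two to.IDs, "geneB" to one.
map <- data.frame(
    from = c("geneA", "geneA", "geneB"),
    to   = c("101", "102", "201"),
    stringsAsFactors = FALSE
)

# Collect all to.IDs per from.ID
to.by.from <- split(map$to, map$from)

# Rule analogous to "first": keep the first to.ID listed for each from.ID
first.to <- vapply(to.by.from, function(x) x[1], character(1))

# A user-defined rule must likewise return exactly one to.ID per from.ID,
# e.g. keeping the last to.ID instead of the first
last.to <- vapply(to.by.from, function(x) x[length(x)], character(1))

print(first.to)
print(last.to)
```

Whatever rule is chosen, the result has exactly one to.ID per from.ID, which is the requirement stated above.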

multi.from

How to resolve many:1 mappings, i.e. multiple from.IDs mapping to the same to.ID? Pre-defined options include:

  • 'first' (Default): returns the first from.ID for each to.ID with multiple from.IDs,

  • 'minp': selects the from.ID with minimum p-value (according to the rowData column PVAL of se),

  • 'maxfc': selects the from.ID with maximum absolute log2 fold change (according to the rowData column FC of se).

Note that a user-defined function can also be supplied for custom behavior. The function is applied in each case where there are multiple from.IDs for a single to.ID, and must accept the arguments ids and se. The argument ids corresponds to the multiple from.IDs from which a single ID should be chosen, e.g. via information available in argument se. See examples for a case where ids are selected based on a user-defined rowData column.
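
The selection logic behind the options described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: a plain data.frame stands in for the rowData of se, the helper resolveManyToOne is hypothetical, and the FC values are invented.

```r
# Stand-in for rowData(se): two from.IDs ("id1", "id2") share the to.ID "g1".
# The FC values are made up for illustration.
rd <- data.frame(
    MYID = c("g1", "g1", "g2"),
    PVAL = c(0.001, 0.32, 0.15),
    FC   = c(-0.5, 2.1, 1.3),
    row.names = c("id1", "id2", "id3"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: group from.IDs by to.ID and apply the chooser
# only to groups containing more than one from.ID
resolveManyToOne <- function(rd, to.col, choose)
{
    groups <- split(rownames(rd), rd[[to.col]])
    vapply(groups,
           function(ids) if(length(ids) > 1) choose(ids, rd) else ids,
           character(1))
}

# Choosers mirroring 'minp' and 'maxfc': each takes the candidate ids
# and the annotation table, and returns a single id
minp  <- function(ids, rd) ids[which.min(rd[ids, "PVAL"])]
maxfc <- function(ids, rd) ids[which.max(abs(rd[ids, "FC"]))]

by.p  <- resolveManyToOne(rd, "MYID", minp)
by.fc <- resolveManyToOne(rd, "MYID", maxfc)

print(by.p)
print(by.fc)
```

A function passed to multi.from follows the same contract as the choosers here, except that its second argument is the SummarizedExperiment se rather than a data.frame.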

Details

The function 'idTypes' lists the valid values that the arguments 'from' and 'to' can take. These correspond to the names of the gene ID types available for the mapping.

Value

idTypes: A character vector listing the available gene ID types for the mapping.

idMap: An object of class SummarizedExperiment.

Author(s)

Ludwig Geistlinger <Ludwig.Geistlinger@sph.cuny.edu>

References

KEGG organism codes: http://www.genome.jp/kegg/catalog/org_list.html

See Also

SummarizedExperiment, mapIds, keytypes

Examples


    # create an expression dataset with 3 genes and 3 samples
    se <- makeExampleData("SE", nfeat=3, nsmpl=3)
    names(se) <- paste0("ENSG00000000", c("003","005", "419"))
    mse <- idMap(se, org="hsa")

    # user-defined mapping
    rowData(se)$MYID <- c("g1", "g1", "g2")
    mse <- idMap(se, to="MYID")    

    # data-driven resolving of many:1 mappings
    
    ## e.g. select from.ID with lowest p-value
    pcol <- configEBrowser("PVAL.COL")
    rowData(se)[[pcol]] <- c(0.001, 0.32, 0.15)
    mse <- idMap(se, to="MYID", multi.from="minp") 
   
    ## ... or using a customized function
    maxScore <- function(ids, se)
    {
         scores <- rowData(se, use.names=TRUE)[ids, "SCORE"]
         ind <- which.max(scores)
         return(ids[ind])
    }
    rowData(se)$SCORE <- c(125.7, 33.4, 58.6)
    mse <- idMap(se, to="MYID", multi.from=maxScore) 
           



[Package EnrichmentBrowser version 2.14.3 Index]