geneGoHyperGeoTest {Category}    R Documentation

Hypergeometric Tests for GO

Description

Given a set of unique Entrez Gene identifiers, a microarray annotation data package name, and the GO category of interest, this function computes Hypergeometric p-values for overrepresentation of each GO term in the specified category among the GO annotations for the interesting genes (as indicated by the Entrez Gene ids).

Usage

geneGoHyperGeoTest(entrezGeneIds, lib, ontology, universe=NULL)

Arguments

entrezGeneIds A vector of Entrez Gene identifiers.
lib A string giving the name of the annotation data package to use. This must correspond to the microarray chip type that the data came from.
ontology One of "BP", "CC", or "MF", used to determine which GO ontology to use.
universe A character vector of unique Entrez Gene identifiers. This is the population (the urn) of the Hypergeometric test. When NULL (default), the population is all Entrez Gene ids in the annotation package that have a GO term annotation in the specified GO category (see Details).

Details

The Entrez Gene ids given in entrezGeneIds define the selected set of genes. The universe of Entrez Gene ids is either determined by the chip annotation data package (lib) or specified by the universe argument, which must be a subset of the Entrez Gene ids represented on the chip. Both the selected genes and the universe are reduced by removing Entrez Gene ids that do not have any annotations in the specified GO category.
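The reduction step described above can be pictured with plain set operations. The following is only a sketch on made-up identifiers, not the package's own code; the names chip_ids, go_annotated, and selected_ids are hypothetical.

```r
# Hypothetical Entrez Gene ids (illustrative values, not real annotation data)
chip_ids     <- c("10", "20", "30", "40", "50")  # ids represented on the chip
go_annotated <- c("10", "20", "40", "60")        # ids with an annotation in the chosen GO category
selected_ids <- c("10", "30", "40")              # the "interesting" genes

# Universe: chip ids that carry at least one annotation in the category
universe <- intersect(chip_ids, go_annotated)
# Selected set: interesting genes that survive the same filter
selected <- intersect(selected_ids, universe)

universe  # "10" "20" "40"
selected  # "10" "40"
```

Here id "30" is dropped from the selected set because it has no annotation in the category, and id "60" never enters the universe because it is not on the chip.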

For each GO term in the specified category that has at least one annotation in the selected gene set (entrezGeneIds), we determine how many of its Entrez Gene annotations are in the universe set and how many are in the selected set. With these counts we perform a Hypergeometric test using phyper. This is equivalent to using Fisher's exact test.
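As an illustration of the test for a single GO term, the sketch below computes the upper-tail Hypergeometric p-value with phyper and compares it with a one-sided Fisher's exact test. The counts are invented for the example, and the exact arguments the package passes to phyper internally are an assumption here.

```r
# Hypothetical counts for one GO term (illustrative values only)
universe_size <- 1000  # genes in the universe after GO filtering
selected_size <- 50    # selected genes after GO filtering
term_universe <- 40    # universe genes annotated at the term
term_selected <- 8     # selected genes annotated at the term

# Over-representation p-value: P(X >= term_selected) when drawing
# selected_size genes from an urn with term_universe "white" genes
p_hyper <- phyper(term_selected - 1,
                  term_universe,
                  universe_size - term_universe,
                  selected_size,
                  lower.tail = FALSE)

# The same counts as a 2 x 2 table (rows: selected / not selected,
# columns: annotated at the term / not annotated at the term)
tab <- matrix(c(term_selected,
                term_universe - term_selected,
                selected_size - term_selected,
                universe_size - term_universe - (selected_size - term_selected)),
              nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Both routes should give the same p-value
all.equal(p_hyper, p_fisher)
```

Note the term_selected - 1 in the phyper call: with lower.tail = FALSE, phyper returns P(X > q), so subtracting one yields the probability of observing at least term_selected annotated genes.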

It is important that the correct chip annotation data package be identified, as it determines the GO term to Entrez Gene id mapping as well as the universe of Entrez Gene ids when the 'universe' argument is omitted.

For S. cerevisiae, if the 'lib' argument is set to "YEAST", then comparisons and statistics are computed using common names and are with respect to all genes annotated in the S. cerevisiae genome, not with respect to any microarray chip. This will not be the right thing to do if you are working with a yeast microarray.

Value

A GeneCategoryHyperGeoTestResult-class instance.

Author(s)

S. Falcon

See Also

GeneCategoryHyperGeoTestResult-class, GeneCategoryHyperGeoTestParams-class, geneKeggHyperGeoTest, geneCategoryHyperGeoTest

Examples

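## Load the chip annotation package and the GO annotation data
library("hgu95av2")
library("GO")
## Fix the random seed so the sampled probes are reproducible
set.seed(123)
## Draw 100 probe ids at random from the chip
probes <- ls(hgu95av2LOCUSID)
probes <- sample(probes, 100)
## Map the probes to unique Entrez Gene ids
egIds <- unique(unlist(mget(probes, hgu95av2LOCUSID)))
## Test for overrepresented GO Biological Process terms
ans <- geneGoHyperGeoTest(egIds, "hgu95av2", "BP")
print(ans)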

[Package Category version 1.4.1 Index]