hyperGoutput {affycoretools}    R Documentation

Output Tables Based on Hypergeometric Test

Description

This function outputs tables of the probesets that are annotated to a particular GO, KEGG, or PFAM term. The tables are based on the results from a call to hyperGTest.

Usage

hyperGoutput(hyptObj, eset, pvalue, categorySize, fit = NULL,
             subset = NULL, comp = 1, output = c("selected", "all", "split"),
             statistics = c("tstat", "pval", "FC"), html = TRUE, text = TRUE, ...)

Arguments

hyptObj       A HyperGResult object, usually produced by a call to hyperGTest
eset          An ExpressionSet object
pvalue        The p-value cutoff used for selecting significant terms. If not specified, it will be extracted from the HyperGResult object
categorySize  Minimum number of probesets in the universe that must be annotated to a term for that term to be considered significant. See details for more information
fit           An MArrayLM object, produced from a call to eBayes
subset        Numeric vector used to select particular tables to output. The default is to output tables for all terms. See details for more information
comp          Numeric vector of length one, used to indicate which comparison in the MArrayLM object to use for extracting relevant statistics. See details for more information
output        One of 'selected', 'all', or 'split'. See details for more information
statistics    Which statistics to output in the resulting tables. Choices are 'tstat', 'pval', and 'FC', corresponding to t-statistics, p-values, and fold change, respectively
html          Boolean. Output HTML tables? Defaults to TRUE
text          Boolean. Output text tables? Defaults to TRUE
...           Allows the end user to pass further arguments. The most notable would be an anncols argument, passed to probes2table to control the hyperlinked annotation columns. See aaf.handler for more information

Details

This function outputs the results from a hypergeometric test for over-represented terms. It would be used at the end of an analysis such as:

1.) Compute expression values
2.) Fit a model using limma
3.) Output significant probesets using limma2annaffy
4.) Perform hypergeometric test using hyperGTest

At step 4, one can output a list of the over-represented terms using htmlReport. One might then be interested in knowing which probesets contributed to the significance of a particular term, which is what this function is designed to do.

One argument that can be passed to htmlReport (and also to hyperGoutput) is categorySize, which gives a lower bound for the number of probesets with a particular term in the universe. For example, assume that a particular GO term is annotated to three probesets on a given chip. If, after doing a t-test to detect differentially expressed probesets, one of those probesets was found to be significantly differentially expressed and was then used in a hypergeometric test, that GO term would be significant, with a small p-value. However, this is probably not very strong evidence that the GO term is actually over-represented, since only three probesets carried the term to begin with. Setting categorySize to a sensible value (such as 10) avoids this situation.
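The three-probeset scenario can be checked with a short calculation in base R. This is only an illustrative sketch: the universe size and the number of selected probesets are made-up values, and the variable names are hypothetical rather than anything used by hyperGoutput.

```r
# Illustrative numbers only (assumptions, not taken from any real chip).
universe_size <- 10000  # probesets (unique genes) in the universe
n_selected    <- 100    # differentially expressed probesets that were tested
term_size     <- 3      # probesets annotated to the GO term
n_hits        <- 1      # selected probesets that carry the term

# P(X >= n_hits) under the hypergeometric distribution.
# phyper(q, m, n, k): m = annotated, n = not annotated, k = number drawn.
p_over <- phyper(n_hits - 1, term_size, universe_size - term_size,
                 n_selected, lower.tail = FALSE)
p_over  # roughly 0.03, below a 0.05 cutoff despite a single hit
```

Under these assumed numbers a single hit is enough to pass a 0.05 cutoff, which is the kind of weak evidence that a categorySize threshold is meant to screen out.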

This function will output HTML and/or text tables containing annotation information about each probeset as well as the expression values. In addition, if limma was used to fit the model, the relevant statistics (t-statistic, p-value, fold change) can also be output in the table by passing the MArrayLM object that resulted from a call to eBayes. The statistics argument controls which statistics are output.

By default hyperGoutput will output tables for all significant terms, which may end up being quite a few tables. Usually only a few terms are of interest, so there is a subset argument that can be used to select only those terms. The values of this argument are row positions in the table output by htmlReport or summary. For instance, if the first, third and fifth terms in the HTML table output by htmlReport were of interest, one would use subset = c(1, 3, 5).

One critical step prior to the hypergeometric test is to subset the probesets to unique Entrez Gene IDs. It should be noted, however, that the functions used by hyperGoutput will output all the probesets annotated to a particular term. The output argument is used to control this behavior. If output = "selected" (the default), then only those probesets that correspond to the original subsetting will be output. If output = "all", then all probesets will be output (grouped by Entrez ID), with the 'selected' probeset first. If output = "split", then all the probesets will be output, with all the 'selected' probesets first, followed by the other probesets, grouped by Entrez ID.

Note that this functionality relies on the probesets having been subset in a manner that keeps the probeset ID appended to the Entrez Gene ID. This can be accomplished by using either findLargest or getUniqueLL.
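A call that combines the arguments discussed above might look like the following sketch. The wrapper function and the names hypt, eset and fit2 are hypothetical placeholders for a HyperGResult, an ExpressionSet and an MArrayLM object created earlier in an analysis; the argument values are examples only, not recommended settings.

```r
# Sketch only: 'writeTermTables', 'hypt', 'eset', and 'fit2' are placeholder
# names, not part of affycoretools. Defining the wrapper does not require the
# package; calling it does, along with real objects from an analysis.
writeTermTables <- function(hypt, eset, fit2) {
  affycoretools::hyperGoutput(
    hypt, eset,
    pvalue = 0.05,                    # cutoff for significant terms
    categorySize = 10,                # skip terms with few probesets in the universe
    fit = fit2,                       # MArrayLM object from eBayes
    subset = c(1, 3, 5),              # first, third and fifth terms only
    comp = 1,                         # first comparison in the MArrayLM object
    output = "split",                 # 'selected' probesets first, then the rest
    statistics = c("tstat", "pval"),  # omit fold change
    html = TRUE, text = FALSE         # HTML tables only
  )
}
```

Wrapping the call in a function keeps the choices in one place, so the same settings can be reused for several HyperGResult objects.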

Value

This function returns no value, and is called solely for the side effect of outputting HTML and/or text tables.

Author(s)

James W. MacDonald <jmacdon@med.umich.edu>

See Also

hyperGTest, htmlReport, probeSetSummary


[Package affycoretools version 1.8.1 Index]