mergeSAGE {SAGElyzer} | R Documentation |
These functions merge individual SAGE libraries by their unique SAGE tags and write the merged data to a file and to a database table, with the unique SAGE tags in one column and the counts from each library in the remaining columns.
mergeSAGE(libNames, isDir = TRUE, skip = 1, pattern = ".sage")
getLibInfo <- function(fileNames
libNames |
libNames - a vector of character strings giving the names of
the SAGE libraries to be merged. libNames can instead be
the name of a directory containing the SAGE libraries to be merged |
isDir |
isDir - a boolean that is TRUE if libNames is the
name of the directory that contains the SAGE libraries to be merged |
skip |
skip - an integer for the number of lines to be
skipped when the libraries are merged |
pattern |
pattern - a character string for the pattern
used to select the SAGE data files from the directory when
libNames is a directory. Only files that match the
pattern will be merged |
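As an illustration of how a pattern can select library files from a directory, here is a minimal base R sketch. It assumes the selection behaves like list.files(), which the documentation above does not confirm; the directory and file names are hypothetical.

```r
# Hypothetical directory holding two SAGE libraries and one unrelated file.
libDir <- tempfile("sageLibs")
dir.create(libDir)
invisible(file.create(file.path(libDir, c("lib1.sage", "lib2.sage", "notes.txt"))))

# list.files() interprets `pattern` as a regular expression, so ".sage"
# matches any character followed by "sage". The anchored form "\\.sage$"
# used here only matches names ending in ".sage".
# Assumption: mergeSAGE selects files in a comparable way.
sageFiles <- list.files(libDir, pattern = "\\.sage$", full.names = TRUE)
print(basename(sageFiles))
```

Only the two .sage files are picked up; notes.txt is ignored because it does not match the pattern.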
Each SAGE library typically contains two columns, the first
holding the SAGE tags and the second holding their
counts. mergeSAGE merges library files based on the
tags. Tags that are missing from a given library but exist in others
are assigned a count of 0 for that library.
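The merge described above can be sketched in base R with merge(). This is only an illustration of the idea (outer join on tags, zero-fill for missing counts), not the code SAGElyzer uses; the data frames and column names are made up for the example.

```r
# Two small in-memory libraries: a tag column plus one count column each.
lib1 <- data.frame(tag = paste0("tag", 1:10), lib1 = 1:10,
                   stringsAsFactors = FALSE)
lib2 <- data.frame(tag = paste0("tag", 5:9), lib2 = 15:19,
                   stringsAsFactors = FALSE)

# all = TRUE keeps every tag seen in either library (an outer join);
# tags absent from one library get NA in that library's count column.
merged <- merge(lib1, lib2, by = "tag", all = TRUE)

# Replace the NAs in the count columns with 0.
countCols <- setdiff(names(merged), "tag")
for (col in countCols) {
  merged[[col]][is.na(merged[[col]])] <- 0
}

print(merged)
```

The result has one row per unique tag and one count column per library, which is the layout the merged data file is described as having.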
mergeSAGE will generate two files. One contains the
merged data and the other contains four columns: the first
holds the column names of the database table that stores the SAGE counts,
the second holds the original SAGE library names, the third holds
the normalization factor used to normalize counts based
on the library with the smallest number of tags, and the fourth holds
the factor based on the library with the largest number of tags.
getLibInfo creates the file that contains the
information about the data file.
calNormFact calculates the normalization factor.
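The exact formula behind the normalization factor is not given here. One plausible reading, shown below purely as an assumption, is that each library's factor is the reference library's total tag count (smallest or largest) divided by that library's own total. The helper sketchNormFact is hypothetical and is not part of SAGElyzer.

```r
# Hypothetical helper: scale every library to the smallest ("min") or
# largest ("max") total tag count. Assumed formula, not calNormFact itself.
sketchNormFact <- function(libTotals, base = c("min", "max")) {
  base <- match.arg(base)
  ref <- if (base == "min") min(libTotals) else max(libTotals)
  ref / libTotals
}

# Total counts of two small libraries (sum(1:10) and sum(15:19)).
libTotals <- c(lib1 = 55, lib2 = 85)

minFact <- sketchNormFact(libTotals, "min")  # reference library gets factor 1
maxFact <- sketchNormFact(libTotals, "max")
print(rbind(min = minFact, max = maxFact))
```

Under this assumption, multiplying a library's counts by its factor brings its total to the reference library's total, so counts become comparable across libraries of different sizes.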
mergeSAGE returns a list containing two file names:
data |
a character string for the name of the file containing the merged data |
info |
a character string for the name of the file containing information about the merged data |
getLibInfo returns a matrix with four columns.
The functions are part of the Bioconductor project at Dana-Farber Cancer Institute to provide bioinformatics functionality through R.
Jianhua Zhang
http://www.ncbi.nlm.nih.gov/geo
path <- tempdir()
# Create two libraries
lib1 <- cbind(paste("tag", 1:10, sep = ""), 1:10)
lib2 <- cbind(paste("tag", 5:9, sep = ""), 15:19)
write.table(lib1, file = file.path(path, "lib1.sage"), sep = "\t",
            row.names = FALSE, col.names = FALSE)
write.table(lib2, file = file.path(path, "lib2.sage"), sep = "\t",
            row.names = FALSE, col.names = FALSE)
libNNum <- getLibNNum(c(file.path(path, "lib1.sage"),
                        file.path(path, "lib2.sage")))
normFact <- calNormFact("min", libNNum)
uniqTag <- getUniqTags(c(file.path(path, "lib1.sage"),
                         file.path(path, "lib2.sage")), skip = 0)