getFeatureCounts {hiAnnotator} | R Documentation |
Given a query object and window size(s), the function finds all the rows in
subject that lie within half the window size (<= window size/2) of each query
position. If weights are assigned to each position in the subject, the tallied
counts are multiplied accordingly. For large annotations, use getFeatureCountsBig.
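The counting rule described above can be sketched in base R. This is only an illustration under simplifying assumptions: `countWithinWindow` is a hypothetical helper, not part of hiAnnotator, and it treats sites and features as single positions on one chromosome rather than as GRanges objects.

```r
# Hypothetical helper (not hiAnnotator code): for each site, tally the
# features whose position is at most width/2 away, optionally weighted.
countWithinWindow <- function(sitePos, featurePos, width, weights = NULL) {
  if (is.null(weights)) {
    weights <- rep(1, length(featurePos))  # unweighted: each feature counts once
  }
  vapply(sitePos, function(p) {
    inWindow <- abs(featurePos - p) <= width / 2
    sum(weights[inWindow])
  }, numeric(1))
}

# Made-up positions, for illustration only
sitePos    <- c(1000, 5000)
featurePos <- c(900, 1400, 1600, 5200, 9000)

counts1k   <- countWithinWindow(sitePos, featurePos, width = 1000)
counts10k  <- countWithinWindow(sitePos, featurePos, width = 10000)
weighted1k <- countWithinWindow(sitePos, featurePos, width = 1000,
                                weights = c(2, 1, 1, 0.5, 1))
```

With a 1000 bp window, only features within 500 bp of a site are tallied; widening the window to 10000 bp picks up more features, and supplying weights replaces each unit count with the feature's weight.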
getFeatureCounts(sites.rd, features.rd, colnam = NULL, chromSizes = NULL, widths = c(1000, 10000, 1e+06), weightsColname = NULL, doInChunks = FALSE, chunkSize = 10000, parallel = FALSE)
sites.rd |
GRanges object to be used as the query. |
features.rd |
GRanges object to be used as the subject or the annotation table. |
colnam |
column name to be added to sites.rd for the newly calculated annotation. It serves as a prefix to the window sizes. |
chromSizes |
named vector of chromosome/seqnames sizes to be used for testing if a position is off the mappable region. DEPRECATED and will be removed in a future release. |
widths |
a named/numeric vector of window sizes to be used for casting
a net around each position. Default: c(1000, 10000, 1e+06). |
weightsColname |
if defined, weigh each row from features.rd when tallying up the counts. |
doInChunks |
break up sites.rd into small pieces of chunkSize to perform the calculations. Default is FALSE. Useful if you expect to find a great deal of overlap between sites.rd and features.rd. |
chunkSize |
number of rows to use per chunk of sites.rd. Defaults to 10000. Only used if doInChunks=TRUE. |
parallel |
use a parallel backend to perform the calculation with foreach. Default is FALSE. |
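The idea behind doInChunks and chunkSize can be sketched in base R: split the row indices of the query into pieces of at most chunkSize rows, work on each piece, then combine the results in the original order. `chunkIndices` is a hypothetical helper written for this illustration and is not the package's internal implementation.

```r
# Hypothetical helper (not hiAnnotator code): split row indices 1..n
# into consecutive chunks of at most chunkSize rows.
chunkIndices <- function(n, chunkSize = 10000) {
  idx <- seq_len(n)
  split(idx, ceiling(idx / chunkSize))
}

chunks <- chunkIndices(25, chunkSize = 10)

# Stand-in for per-site work; a real run would annotate each chunk instead.
values   <- (1:25)^2
perChunk <- lapply(chunks, function(i) values[i])

# Chunks are consecutive, so concatenating them restores the original order.
combined <- unlist(perChunk, use.names = FALSE)
```

Smaller chunks keep the intermediate overlap results for each piece small, which is presumably why chunking helps when many overlaps are expected.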
a GRanges object with new annotation columns appended at the end of sites.rd. There will be one column for each width defined in the widths parameter. If widths is a named vector, e.g. c("100bp"=100,"1K"=1000), then the colnam parameter is pasted together with each width name; otherwise a default name is generated by the function.
If parallel=TRUE, be sure to register a parallel backend before running the
function. Any of the following libraries compatible with foreach can be used:
doMC, doSMP, doSNOW, doMPI. For example: library(doMC); registerDoMC(2)
makeGRanges, getNearestFeature, getSitesInFeature, getFeatureCountsBig.
# Convert a dataframe to GRanges object
data(sites)
alldata.rd <- makeGRanges(sites, soloStart = TRUE)

data(genes)
genes.rd <- makeGRanges(genes)

geneCounts <- getFeatureCounts(alldata.rd, genes.rd, "NumOfGene")

## Not run:
geneCounts <- getFeatureCounts(alldata.rd, genes.rd, "NumOfGene",
                               doInChunks = TRUE, chunkSize = 200)
geneCounts

## Parallel version of getFeatureCounts
# geneCounts <- getFeatureCounts(alldata.rd, genes.rd, "NumOfGene", parallel = TRUE)
# geneCounts

## End(Not run)