reassignTSSbyCage {ORFik} | R Documentation |
Given a GRangesList of 5' UTRs or transcripts, reassign the start sites using max peaks from CageSeq data. A max peak is defined as the new TSS if it is within the boundary of the 5' leader range, specified by 'extension' in bp. A max peak must also be higher than the minimum CageSeq peak cutoff specified in 'filterValue'. The new TSS will then be the position of the cage read with the highest read count in the interval. If removeUnused is TRUE, leaders without cage hits will be removed; if FALSE, the original TSS will be used.
reassignTSSbyCage(fiveUTRs, cage, extension = 1000, filterValue = 1, restrictUpstreamToTx = FALSE, removeUnused = FALSE)
fiveUTRs |
(GRangesList) The 5' leaders or transcript sequences |
cage |
Either a filePath for CageSeq file, or already loaded CageSeq peak data as GRanges. |
extension |
The maximum number of bases upstream of the TSS to search for a CageSeq peak. |
filterValue |
The minimum number of reads on a cage position for it to be counted as a possible new TSS (represented in the score column of the CageSeq data). If you already filtered, set it to 0. |
restrictUpstreamToTx |
a logical (FALSE). Set this to TRUE if you want to restrict leaders so that they do not extend closer than 5 bases from the closest upstream leader. |
removeUnused |
logical (FALSE), remove leaders that did not have any cage support. The default is to keep them with their original annotation. |
Note: If you used CAGEr, you will get reads of a probability region, always with a score of 1. Remember then to set filterValue to 0. You should also use the 5' end of the read as input; use: ORFik:::convertToOneBasedRanges(cage)
a GRangesList of newly assigned TSS for fiveUTRs, using CageSeq data.
Other CAGE: assignTSSByCage, reassignTxDbByCage
# example 5' leader, notice exon_rank column
fiveUTRs <- GenomicRanges::GRangesList(
  GenomicRanges::GRanges(seqnames = "chr1",
                         ranges = IRanges::IRanges(1000, 2000),
                         strand = "+",
                         exon_rank = 1))
names(fiveUTRs) <- "tx1"

# make fake CageSeq data from promoter of 5' leaders, notice score column
cage <- GenomicRanges::GRanges(
  seqnames = "1",
  ranges = IRanges::IRanges(500, width = 1),
  strand = "+",
  score = 10) # <- Number of tags (reads) per position
# notice also that seqnames use different naming, this will be fixed by ORFik

# finally reassign TSS for fiveUTRs
reassignTSSbyCage(fiveUTRs, cage)
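
The selection rule from the description can be sketched in plain R for a single plus-strand leader. This is only an illustration under stated assumptions, not ORFik's implementation: `pick_tss` and the `pos`/`score` data.frame layout are hypothetical, the search window is assumed to run from `extension` bases upstream of the annotated TSS to the end of the leader, and the cutoff is assumed to be strict (`score > filterValue`), which matches the advice to set filterValue to 0 when all scores are 1.

```r
# Hypothetical helper, not part of ORFik: choose a new TSS for one
# plus-strand leader from a table of CAGE peaks.
# peaks: data.frame with columns `pos` (position) and `score` (read count).
pick_tss <- function(tss, leader_end, peaks, extension = 1000,
                     filterValue = 1, removeUnused = FALSE) {
  # Keep peaks above the cutoff (strict ">" is an assumption) that fall
  # inside the assumed window [tss - extension, leader_end]
  keep <- peaks$score > filterValue &
    peaks$pos >= tss - extension &
    peaks$pos <= leader_end
  candidates <- peaks[keep, , drop = FALSE]
  if (nrow(candidates) == 0) {
    # No CAGE support: NA marks a leader that would be removed,
    # otherwise fall back to the annotated TSS
    return(if (removeUnused) NA_integer_ else as.integer(tss))
  }
  # which.max() returns the first maximum, so ties go to the first row
  as.integer(candidates$pos[which.max(candidates$score)])
}

peaks <- data.frame(pos = c(500L, 800L, 1200L, 5000L),
                    score = c(10, 3, 1, 50))

pick_tss(1000L, 2000L, peaks)                    # 500: highest peak in window
pick_tss(1000L, 2000L, peaks, extension = 100)   # 1000: no peak passes, TSS kept
pick_tss(1000L, 2000L, peaks, extension = 100,
         filterValue = 0)                        # 1200: score 1 now passes
pick_tss(1000L, 2000L, peaks, extension = 100,
         removeUnused = TRUE)                    # NA: leader would be dropped
```

The peak at position 5000 has the highest score overall but is never chosen, because it lies outside the window. Minus-strand leaders, multi-exon leaders and restrictUpstreamToTx are deliberately left out of this sketch.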