findMapORFs {ORFik} | R Documentation |
Finds ORFs on the sequences of interest, but returns positions relative to the ranges in the 'grl' argument. For example, 'grl' can be the exons of known transcripts (in genomic coordinates) and 'seqs' the sequences of those transcripts; in that case, this function returns the genomic coordinates of the ORFs found on the transcript sequences.
findMapORFs(grl, seqs, startCodon = startDefinition(1), stopCodon = stopDefinition(1), longestORF = TRUE, minimumLength = 0, groupByTx = TRUE)
grl |
( |
seqs |
(DNAStringSet or character vector) - DNA/RNA sequences to search for Open Reading Frames. Can be uppercase or lowercase. If you only want regions from a fasta/fasta index pair, the easiest call to get seqs is: seqs = ORFik:::txSeqsFromFa(grl, faFile), where grl is a GRanges/List of regions and faFile is a FaFile. |
startCodon |
(character vector) Possible START codons to search for.
Check |
stopCodon |
(character vector) Possible STOP codons to search for.
Check |
longestORF |
(logical) Default TRUE. Keep only the longest ORF per
unique (seqname, strand, stopcodon) combination, you can also use function
|
minimumLength |
(integer) Default is 0, which corresponds to START + STOP = 6 bp. Minimum length of an ORF in codons, not counting the 3 bp each for the START and STOP codons. For example, minimumLength = 8 results in ORFs of at least START + 8*3 (bp) + STOP = 30 bases. Use this parameter to restrict the search. |
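The length arithmetic above can be checked with a short sketch. The helper name minOrfWidth is hypothetical and not part of ORFik; it only restates the formula START + minimumLength*3 + STOP given in the parameter description.

```r
# Hypothetical helper (not an ORFik function): smallest ORF width in bases
# implied by a given minimumLength, counting 3 bp each for START and STOP.
minOrfWidth <- function(minimumLength = 0) {
  stopifnot(is.numeric(minimumLength), all(minimumLength >= 0))
  3 + 3 * minimumLength + 3
}

minOrfWidth(0)  # 6
minOrfWidth(8)  # 30
```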
groupByTx |
(logical) Default TRUE. Should the output GRangesList be grouped by ORFs per transcript (TRUE) or by exons per ORF (FALSE)? |
This function assumes that each sequence in 'seqs' has a width matching its element of 'grl', and that their orders match: the 1st sequence belongs to the 1st grl element, etc.
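A quick sanity check of that assumption might look like the sketch below. It uses only standard GenomicRanges calls; the variable names txWidths and widthsMatch are illustrative, and the check is not something the function is documented to perform itself. For a DNAStringSet, width(seqs) would take the place of nchar(seqs).

```r
library(GenomicRanges)

# One transcript with two exons on the minus strand (3 + 6 = 9 bases)
grl <- GRangesList(tx1 = GRanges(seqnames = "1",
                                 ranges = IRanges(start = c(21, 10),
                                                  end = c(23, 15)),
                                 strand = "-"))
seqs <- c("ATGATGTAA")

# Total exon width per transcript, in the order of 'grl'
txWidths <- sum(width(grl))

# Same number of elements, and each sequence as long as its transcript
widthsMatch <- length(seqs) == length(grl) &&
  all(nchar(seqs) == unname(txWidths))
widthsMatch
```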
A GRangesList of ORFs.
Other findORFs: findORFsFasta, findORFs, startDefinition, stopDefinition
# This sequence has ORFs at 1-9 and 4-9
seqs <- c("ATGATGTAA") # the dna sequence
findORFs(seqs)

# lets assume that this sequence comes from two exons as follows
gr <- GRanges(seqnames = rep("1", 2), # chromosome 1
              ranges = IRanges(start = c(21, 10), end = c(23, 15)),
              strand = rep("-", 2), names = rep("tx1", 2))
grl <- GRangesList(tx1 = gr)
findMapORFs(grl, seqs) # ORFs are properly mapped to their genomic coordinates

grl <- c(grl, grl)
names(grl) <- c("tx1", "tx2")
findMapORFs(grl, c(seqs, seqs))
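
To make the coordinate mapping in this example easier to follow, here is a minimal base R sketch of the idea. The helper txToGenomic is hypothetical and is not how ORFik implements the mapping; it simply lists the genomic position of each transcript base, 5' to 3', for the two minus-strand exons used above.

```r
# Hypothetical helper (not an ORFik function): genomic position of every
# transcript base, listed 5' to 3', for the exons of one transcript.
txToGenomic <- function(starts, ends, strand) {
  if (strand == "-") {
    # On the minus strand the 5' exon has the highest coordinates,
    # and each exon is read from its end down to its start.
    ord <- order(starts, decreasing = TRUE)
    unlist(Map(function(s, e) seq(e, s), starts[ord], ends[ord]))
  } else {
    ord <- order(starts)
    unlist(Map(function(s, e) seq(s, e), starts[ord], ends[ord]))
  }
}

pos <- txToGenomic(starts = c(21, 10), ends = c(23, 15), strand = "-")

# ORF at transcript positions 1-9 covers both exons (21-23 and 10-15)
orfLong <- pos[1:9]
# ORF at transcript positions 4-9 lies entirely in the exon 10-15
orfShort <- pos[4:9]
range(orfShort)
```

Under these assumptions the shorter ORF falls within a single exon, while the longer one is split across the two exons, which is why a mapped ORF can consist of more than one range.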