findORFsFasta {ORFik} | R Documentation |
Intended for prokaryote genomes or transcript sequences in fasta format. It makes no sense for whole eukaryote genomes, since their genes are spliced. Searches the sequence under each fasta header and reports all ORFs found on BOTH the sense (+) and antisense (-) strands, in all frames. Each fasta header is treated separately, and the name of the sequence is used as the seqname of its ORFs in the returned GRanges object. Circular genomes are supported.
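As a rough illustration of what searching both strands in all frames means, the following base R sketch builds the antisense strand by reverse complementing the sense strand and splits a sequence into codons for each frame offset. The helper names reverse_complement and codons_in_frame and the toy sequence are hypothetical, made up for this illustration; this is not a description of how findORFsFasta is implemented internally.

```r
# Reverse complement of a DNA string (base R only).
reverse_complement <- function(dna) {
  complemented <- chartr("ACGTacgt", "TGCAtgca", dna)
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
}

# Split a sequence into complete codons for a frame offset of 0, 1 or 2.
codons_in_frame <- function(dna, offset) {
  n <- nchar(dna)
  if (n - offset < 3) return(character(0))  # no complete codon in this frame
  starts <- seq(1 + offset, n - 2, by = 3)
  substring(dna, starts, starts + 2)
}

# Toy sense strand (made up): it holds no ATG itself,
# but its antisense strand reads START, one codon, STOP.
sense <- "TCATCCCAT"
antisense <- reverse_complement(sense)
antisense                       # "ATGGGATGA"

# Three frames per strand gives six reading frames in total.
for (offset in 0:2) {
  print(codons_in_frame(sense, offset))
  print(codons_in_frame(antisense, offset))
}
```

In this toy case only the antisense strand in frame 0 yields the codons ATG, GGA, TGA, which is why an ORF search restricted to the sense strand would miss it.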
findORFsFasta(filePath, startCodon = startDefinition(1), stopCodon = stopDefinition(1), longestORF = TRUE, minimumLength = 0, is.circular = FALSE)
filePath |
(character) Path to the fasta file. The sequences can be uppercase or lowercase. |
startCodon |
(character vector) Possible START codons to search for. Check startDefinition, which supplies the default. |
stopCodon |
(character vector) Possible STOP codons to search for. Check stopDefinition, which supplies the default. |
longestORF |
(logical) Default TRUE. Keep only the longest ORF per
unique (seqname, strand, stopcodon) combination, you can also use function
|
minimumLength |
(integer) Default is 0, which allows the shortest possible ORF: START + STOP = 6 bp. Minimum length of the ORF in codons, not counting the 3 bp each for the START and STOP codons. For example, minimumLength = 8 gives ORFs of at least START + 8*3 (bp) + STOP = 30 bases. Use this parameter to restrict the search. |
is.circular |
(logical) Whether the genome in filePath is circular. Prokaryotic genomes are usually circular. Be careful if you want to extract sequences: seqlengths must be set, otherwise the last base of the sequence before it loops back to the start is unknown. |
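The arithmetic behind minimumLength and is.circular can be sketched in base R. The helpers min_orf_bases and wrap_positions are hypothetical names used only for this illustration and are not part of ORFik. The second helper assumes that an ORF crossing the origin of a circular sequence is given with an end coordinate larger than the sequence length; that is an assumption made here, not something stated above.

```r
# Smallest ORF size in bases allowed by a given minimumLength:
# START codon (3 bp) + minimumLength codons (3 bp each) + STOP codon (3 bp).
min_orf_bases <- function(minimumLength) {
  3 + 3 * minimumLength + 3
}

min_orf_bases(0)  # 6
min_orf_bases(8)  # 30

# Map linear positions start..end onto a circular sequence of known length.
# ASSUMPTION: an ORF crossing the origin has end > seqlength.
wrap_positions <- function(start, end, seqlength) {
  ((seq(start, end) - 1) %% seqlength) + 1
}

# A 12 bp ORF starting at base 95 of a 100 bp circular sequence
# covers bases 95..100 and then 1..6.
wrap_positions(95, 106, seqlength = 100)
```

Without the sequence length the wrap point cannot be computed, which is the reason seqlengths need to be set before extracting sequences from a circular genome.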
If your fasta file contains transcripts (transcript coordinates), remove all negative-strand ORFs afterwards, since they make no sense in that case:
orfs <- orfs[strandBool(orfs)]
Seqnames are created from the header using the format >name info: the name must come directly after the greater-than sign (">"), with a space between the name and the info.
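The header convention can be sketched in base R as follows. The function seqname_from_header and the example headers are hypothetical, made up for this illustration; it mirrors the >name info rule described here and is not ORFik's actual parser.

```r
# Extract the seqname from a fasta header of the form ">name info".
seqname_from_header <- function(header) {
  without_marker <- sub("^>", "", header)    # drop the leading ">"
  sub("[[:space:]].*$", "", without_marker)  # keep the text before the first space
}

# Made-up headers: with info, without info, and with several info words.
headers <- c(">chr1 toy contig", ">tx42", ">plasmidA circular toy example")
seqname_from_header(headers)  # "chr1" "tx42" "plasmidA"
```

A header such as "> chr1 info", with a space right after the greater-than sign, would give an empty name under this rule, so the name should follow the sign directly.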
(GRanges) object of ORFs mapped from fasta file. Positions are relative to the fasta file.
Other findORFs: findMapORFs, findORFs, startDefinition, stopDefinition
# location of the example fasta file
example_genome <- system.file("extdata", "genome.fasta", package = "ORFik")
findORFsFasta(example_genome)