get.motifs.by.gene {KEGGSOAP} | R Documentation |
This function queries the Pfam, TIGRFAM, PROSITE pattern, and/or PROSITE profile databases for the motifs of a given gene. A motif is a locally conserved region of a sequence, or a short sequence pattern shared by a set of sequences.
get.motifs.by.gene(genes.id, db)
genes.id |
a character string for the id used by
KEGG to represent the gene of interest. The id normally consists of
a three-letter organism code followed by a colon and then a gene
identifier, which is often numeric. The three letters are the first
letter of the genus name and the first two letters of the species name
of the organism of concern (e.g. hsa:111 for Homo sapiens) |
db |
a character string for the name of the database to
search for motifs. Valid database names are pfam, tfam, pspt, and
pspf for the Pfam, TIGRFAM, PROSITE pattern, and PROSITE profile
databases, respectively, or all for all four databases |
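As a rough sketch of the argument conventions described above, the helper below checks a gene id and a database name before any query is attempted. The name check.motif.args is hypothetical and not part of KEGGSOAP. The regular expression is an assumption based on the "three letters, colon, identifier" description, so real KEGG ids may deviate from it. Defaulting db to the first choice is likewise an assumption made only for this sketch.

```r
# Hypothetical helper (not part of KEGGSOAP): sanity-check the two arguments.
check.motif.args <- function(genes.id,
                             db = c("pfam", "tfam", "pspt", "pspf", "all")) {
  # match.arg() stops with an error if db is not one of the listed names
  db <- match.arg(db)
  # Assumed pattern: three lowercase letters, a colon, a non-empty identifier
  ok <- is.character(genes.id) && length(genes.id) == 1 &&
    grepl("^[a-z]{3}:[A-Za-z0-9_.-]+$", genes.id)
  if (!ok) {
    stop("genes.id should look like 'hsa:111' or 'eco:b0002'")
  }
  list(genes.id = genes.id, db = db)
}

args <- check.motif.args("eco:b0002", "pfam")
```

The checked values could then be passed on as get.motifs.by.gene(args$genes.id, args$db).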
The motif ids obtained can be used to search for the genes that
contain the motif across organisms using get.genes.by.motifs.
The function returns a list of lists, with each sub-list having the following elements:
motif_id |
a character string for the id of the motif found |
definition |
a character string for the definition of the motif |
genes_id |
a character string for the KEGG genes_id of the gene that contains the motif and that was used to search the database(s) |
start_position |
an integer for the start position of the motif match |
end_position |
an integer for the end position of the motif match |
score |
a numeric value for the score of the motif match for the TIGRFAM and PROSITE databases |
evalue |
a numeric value for the E-value of the motif match for the Pfam database |
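To make the shape of the return value easier to picture, here is a minimal sketch that flattens such a list of lists into a data frame. The motifs object is a hand-written mock, not real query output. Its motif ids, definitions, positions, and E-values are invented placeholders, and the end position element is assumed to be named end_position.

```r
# Mock of the documented return structure; all values are invented placeholders.
motifs <- list(
  list(motif_id = "pf:EXAMPLE_A", definition = "example motif A",
       genes_id = "eco:b0002", start_position = 10L, end_position = 120L,
       score = NA_real_, evalue = 1e-20),
  list(motif_id = "pf:EXAMPLE_B", definition = "example motif B",
       genes_id = "eco:b0002", start_position = 200L, end_position = 250L,
       score = NA_real_, evalue = 3e-05)
)

# Turn each sub-list into a one-row data frame, then stack the rows
motif.table <- do.call(rbind, lapply(motifs, function(x) {
  as.data.frame(x, stringsAsFactors = FALSE)
}))

# Length of each match, counting both end points
motif.table$length <- motif.table$end_position - motif.table$start_position + 1L
```

A table like this is convenient for sorting matches by evalue or by position along the gene.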
Jianhua Zhang
http://www.genome.jp/kegg/soap/doc/keggapi_manual.html
if (require("SSOAP") && require("XML")) {
    motifs <- get.motifs.by.gene("eco:b0002", "pfam")
    sapply(motifs, function(x) x$motif_id)
}