get.paralogs.by.gene {KEGGSOAP} | R Documentation |
Given a KEGG gene id, the function queries the KEGG Sequence Similarity Database (SSDB) for genes that are paralogous to the target gene. Paralogous genes arise from the duplication of an existing gene followed by functional divergence.
get.paralogs.by.gene(genes.id, start, max.results)
genes.id |
a character string for the id used by KEGG to represent the gene of interest. The id normally consists of a three-letter organism code, a colon, and a gene identifier that is usually numeric. The three letters are the first letter of the genus name followed by the first two letters of the species name in the scientific name of the organism concerned (e.g. hsa:111 for Homo sapiens) |
start |
an integer giving the position in the query results from which entries will be extracted and returned |
max.results |
an integer giving the maximum number of entries that will be extracted from the query results and returned |
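The id format described for genes.id can be checked before a query is sent. The following is a minimal sketch only: is.kegg.gene.id is a hypothetical helper, not part of KEGGSOAP, and the pattern (three lowercase letters, a colon, then any non-empty identifier without whitespace) is an assumption based on the description above rather than an official KEGG rule.

```r
# Hypothetical helper (not part of KEGGSOAP): a loose sanity check for KEGG gene ids.
# Assumed pattern: three lowercase letters, a colon, and a non-empty identifier
# that contains no whitespace.
is.kegg.gene.id <- function(id) {
  is.character(id) & !is.na(id) & grepl("^[a-z]{3}:[^[:space:]]+$", id)
}

# The last two entries lack an organism code or an identifier.
is.kegg.gene.id(c("hsa:111", "eco:b0002", "b0002", "hsa:"))
```

The identifier part is deliberately left permissive because ids such as eco:b0002 contain letters as well as digits.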
A given gene may have several paralogous genes, so a query to SSDB may produce a list of genes that are paralogous to the target gene. start and max.results indicate where on that list extraction begins and how many entries at most are returned.
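The windowing performed by start and max.results can be illustrated without contacting KEGG. This is a sketch under stated assumptions: all.hits is a mock result list, page.window is a hypothetical helper that is not part of KEGGSOAP, and start is assumed to be a 1-based position, as the value 1 in the example call suggests.

```r
# Mock stand-in for the full SSDB result list; no KEGG query is made here.
all.hits <- paste0("abc:", sprintf("%04d", 1:25))

# Hypothetical helper showing which entries start and max.results would select,
# assuming start is a 1-based position in the result list.
page.window <- function(hits, start, max.results) {
  if (start > length(hits) || max.results < 1) return(hits[0])
  last <- min(start + max.results - 1, length(hits))
  hits[start:last]
}

page.window(all.hits, 1, 10)   # entries 1 to 10
page.window(all.hits, 21, 10)  # entries 21 to 25, since the list is exhausted
```

Under the same assumption, successive pages of a long result list would be obtained by increasing start by max.results between calls.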
The function returns a list of lists. Each sub-list contains data for a gene that is paralogous to the target gene with the following elements:
genes_id1 |
a character string for the id of the target gene used to query for paralogous genes |
genes_id2 |
a character string for the id of the paralogous gene found in the same organism |
sw_score |
an integer for Smith-Waterman score between genes_id1 and genes_id2 |
bit_score |
a numeric value for the bit score between genes_id1 and genes_id2 |
identity |
a numeric value between 0 and 1 for the degree of identity between genes_id1 and genes_id2 |
overlap |
an integer for the overlapping length between genes_id1 and genes_id2 |
start_position1 |
an integer for the start position of the alignment in genes_id1 |
end_position1 |
an integer for the end position of the alignment in genes_id1 |
start_position2 |
an integer for the start position of the alignment in genes_id2 |
end_position2 |
an integer for the end position of the alignment in genes_id2 |
best_flag_1to2 |
a boolean that is TRUE if genes_id2 is the best neighbor gene of genes_id1 |
best_flag_2to1 |
a boolean that is TRUE if genes_id1 is also the best neighbor gene of genes_id2 |
definition1 |
a character string for the definition of genes_id1 |
definition2 |
a character string for the definition of genes_id2 |
length1 |
an integer for the amino acid length of genes_id1 |
length2 |
an integer for the amino acid length of genes_id2 |
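Because the return value is a list of lists, it is often easier to work with once bound into a data frame. The sketch below uses a mock return value: the gene ids and scores are invented for illustration and only a subset of the elements listed above is included. It assumes every sub-list carries the same element names, each holding a single value.

```r
# Mock return value with three sub-lists; ids and scores are invented.
paraGenes <- list(
  list(genes_id1 = "abc:0001", genes_id2 = "abc:0002", sw_score = 2100L,
       identity = 0.41, best_flag_1to2 = TRUE, best_flag_2to1 = TRUE),
  list(genes_id1 = "abc:0001", genes_id2 = "abc:0003", sw_score = 640L,
       identity = 0.29, best_flag_1to2 = FALSE, best_flag_2to1 = TRUE),
  list(genes_id1 = "abc:0001", genes_id2 = "abc:0004", sw_score = 310L,
       identity = 0.22, best_flag_1to2 = FALSE, best_flag_2to1 = FALSE)
)

# Bind the sub-lists into one data frame, one row per paralogous gene.
hits <- do.call(rbind, lapply(paraGenes, function(x) {
  as.data.frame(x, stringsAsFactors = FALSE)
}))

# Keep only pairs flagged as best neighbors in both directions.
reciprocal <- hits[hits$best_flag_1to2 & hits$best_flag_2to1, ]
reciprocal$genes_id2
```

The same data frame can be sorted or filtered on sw_score, bit_score or identity in the usual way.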
Jianhua Zhang
http://www.genome.jp/kegg/soap/doc/keggapi_manual.html
if(require("SSOAP") && require("XML")){
    paraGenes <- get.paralogs.by.gene("eco:b0002", 1, 10)
}