dist.dna {ape} | R Documentation |
This function computes a matrix of pairwise distances from DNA sequences using a model of DNA evolution. Eight substitution models (and the raw distance) are currently available.
dist.dna(x, model = "K80", variance = FALSE, gamma = FALSE, pairwise.deletion = FALSE, base.freq = NULL, as.matrix = FALSE)
x |
a matrix, a data frame, or a list containing the DNA
sequences (the latter can be taken from, e.g.,
read.GenBank). |
model |
a character string specifying the evolutionary model to be
used; must be one of "raw", "JC69", "K80" (the
default), "F81", "K81", "F84", "T92",
"TN93", or "GG95". |
variance |
a logical indicating whether to compute the variances
of the distances; defaults to FALSE so the variances are not
computed. |
gamma |
a value for the gamma parameter which is possibly used to
apply a gamma correction to the distances (by default gamma =
FALSE so no correction is applied). |
pairwise.deletion |
a logical indicating whether to delete the sites with missing data in a pairwise way, i.e. separately for each pair of sequences compared. The default is to delete, from all sequences, every site where at least one sequence has missing data. |
base.freq |
the base frequencies to be used in the computations
(if applicable, i.e. if model = "F84"). By default, the
base frequencies are computed from the whole sample of sequences. |
as.matrix |
a logical indicating whether to return the results as a matrix. The default is to return an object of class dist. |
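The difference between the two deletion modes described for pairwise.deletion can be sketched in base R on a toy alignment, using the raw distance (proportion of differing sites). This is an illustration only, not the code used by dist.dna: the alignment, the helper p_dist, and the use of "n" to mark missing data are assumptions made for the example.

```r
# Toy alignment: rows are sequences, columns are sites.
# ASSUMPTION: "n" marks missing data (chosen for this example only).
aln <- rbind(
  s1 = c("a", "c", "g", "t", "a", "n"),
  s2 = c("a", "c", "g", "a", "n", "c"),
  s3 = c("a", "t", "g", "t", "a", "c")
)

# Hypothetical helper: proportion of differing sites between rows i and j,
# computed over the columns listed in `cols`.
p_dist <- function(m, i, j, cols) mean(m[i, cols] != m[j, cols])

# Complete deletion: drop every site with missing data in ANY sequence.
complete <- which(colSums(aln == "n") == 0)
p_complete <- p_dist(aln, "s1", "s3", complete)

# Pairwise deletion: drop only sites missing in the two sequences compared.
pairwise <- which(aln["s1", ] != "n" & aln["s3", ] != "n")
p_pairwise <- p_dist(aln, "s1", "s3", pairwise)
```

Here s1 and s3 are compared over four sites under complete deletion but over five sites under pairwise deletion, so the two settings can give different distances for the same pair.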
As of ape 1.5, the interface and the computational part of this function have been completely rewritten.
The molecular evolutionary models available through the option model have been extensively described in the literature. A brief description is given below; more details can be found in the References.
With model = "raw", the options variance and gamma have no effect, but pairwise.deletion can.
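As a rough illustration of what a substitution model adds over the raw distance, the textbook closed-form distances for JC69 (Jukes and Cantor 1969) and K80 (Kimura 1980) can be written in a few lines of base R, together with the usual gamma-corrected form of the JC69 distance. This is a sketch under stated assumptions: the function names jc69, jc69_gamma and k80 are hypothetical, the two toy sequences are made up, and nothing here is taken from the internals of dist.dna.

```r
# Two made-up aligned sequences (no missing data).
s1 <- c("a", "c", "g", "t", "a", "c", "g", "t", "a", "c")
s2 <- c("a", "c", "g", "t", "g", "c", "g", "t", "a", "a")

# Classify differences: a transition stays within purines (a, g)
# or within pyrimidines (c, t); any other difference is a transversion.
is_purine  <- function(b) b %in% c("a", "g")
differ     <- s1 != s2
transition <- differ & (is_purine(s1) == is_purine(s2))

P <- mean(transition)            # proportion of transitions
Q <- mean(differ & !transition)  # proportion of transversions
p <- P + Q                       # raw distance

# JC69: one substitution rate, equal base frequencies.
jc69 <- function(p) -0.75 * log(1 - 4 * p / 3)

# JC69 with gamma-distributed rates among sites; `a` is the shape parameter.
jc69_gamma <- function(p, a) 0.75 * a * ((1 - 4 * p / 3)^(-1 / a) - 1)

# K80: separate rates for transitions and transversions.
k80 <- function(P, Q) -0.5 * log((1 - 2 * P - Q) * sqrt(1 - 2 * Q))

d_raw   <- p
d_jc69  <- jc69(p)
d_gamma <- jc69_gamma(p, a = 0.5)
d_k80   <- k80(P, Q)
```

Each corrected distance exceeds the raw proportion because the models account for multiple substitutions at the same site, and the gamma correction increases the estimate further.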
An object of class dist (by default), or a numeric matrix if as.matrix = TRUE.
If variance = TRUE, an attribute called "variance" is given to the returned object.
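The return values described above can be inspected as in the following sketch. It assumes that the ape package is installed and uses woodmouse, an example alignment distributed with ape that is not mentioned in this page; the exact behaviour may differ between ape versions, so the calls are wrapped in a check for the package.

```r
# Run only if ape is available (ASSUMPTION: a version providing `woodmouse`).
if (requireNamespace("ape", quietly = TRUE)) {
  data(woodmouse, package = "ape")

  # Default call: K80 model, result of class "dist".
  d <- ape::dist.dna(woodmouse)

  # Same distances returned as a square numeric matrix.
  m <- ape::dist.dna(woodmouse, as.matrix = TRUE)

  # Request the variances; they are attached as an attribute.
  dv <- ape::dist.dna(woodmouse, variance = TRUE)
  v  <- attr(dv, "variance")
}
```

A dist object can be passed directly to functions that expect one, such as hclust, while the matrix form is more convenient for looking up a single pair of sequences by name.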
Emmanuel Paradis paradis@isem.univ-montp2.fr
Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17, 368–376.
Felsenstein, J. and Churchill, G. A. (1996) A Hidden Markov model approach to variation among sites in rate of evolution. Molecular Biology and Evolution, 13, 93–104.
Galtier, N. and Gouy, M. (1995) Inferring phylogenies from DNA sequences of unequal base compositions. Proceedings of the National Academy of Sciences USA, 92, 11317–11321.
Jukes, T. H. and Cantor, C. R. (1969) Evolution of protein molecules. In Mammalian Protein Metabolism, ed. Munro, H. N., pp. 21–132, New York: Academic Press.
Kimura, M. (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16, 111–120.
Kimura, M. (1981) Estimation of evolutionary distances between homologous nucleotide sequences. Proceedings of the National Academy of Sciences USA, 78, 454–458.
Jin, L. and Nei, M. (1990) Limitations of the evolutionary parsimony method of phylogenetic analysis. Molecular Biology and Evolution, 7, 82–102.
McGuire, G., Prentice, M. J. and Wright, F. (1999) Improved error bounds for genetic distances from DNA sequences. Biometrics, 55, 1064–1070.
Tamura, K. (1992) Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G + C-content biases. Molecular Biology and Evolution, 9, 678–687.
Tamura, K. and Nei, M. (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution, 10, 512–526.
read.GenBank, read.dna, write.dna, dist.gene, dist.phylo, dist