Original link: http://evomics.org/resources/substitution-models/nucleotide-substitution-models/
The use of maximum likelihood (ML) algorithms in developing phylogenetic hypotheses requires a model of evolution. The frequently used General Time Reversible (GTR) family of nested models encompasses 64 models with different combinations of parameters for DNA site substitution. The models are listed here from the least complex to the most parameter-rich.
Jukes-Cantor (JC, nst=1): equal base frequencies, all substitutions equally likely (PAUP* rate classification: aaaaaa, PAML: aaaaaa) (Jukes and Cantor 1969)
Felsenstein 1981 (F81, nst=1): variable base frequencies, all substitutions equally likely (PAUP*: aaaaaa, PAML: aaaaaa) (Felsenstein 1981)
Kimura 2-parameter (K80, nst=2): equal base frequencies, one transition rate and one transversion rate (PAUP*: abaaba, PAML: abbbba) (Kimura 1980)
Hasegawa-Kishino-Yano (HKY, nst=2): variable base frequencies, one transition rate and one transversion rate (PAUP*: abaaba, PAML: abbbba) (Hasegawa et al. 1985)
Tamura-Nei (TrN): variable base frequencies, equal transversion rates, variable transition rates (PAUP*: abaaea, PAML: abbbbf) (Tamura and Nei 1993)
Kimura 3-parameter (K3P): equal base frequencies, equal transition rates, two transversion rates (PAUP*: abccba, PAML: abccba) (Kimura 1981)
transition model (TIM): variable base frequencies, variable transition rates, two transversion rates (PAUP*: abccea, PAML: abccbe)
transversion model (TVM): variable base frequencies, variable transversion rates, transition rates equal (PAUP*: abcdbe, PAML: abcdea)
symmetrical model (SYM): equal base frequencies, symmetrical substitution matrix (A to T = T to A) (PAUP*: abcdef, PAML: abcdef) (Zharkikh 1994)
general time reversible (GTR, nst=6): variable base frequencies, symmetrical substitution matrix (PAUP*: abcdef, PAML: abcdef) (e.g., Lanave et al. 1984, Tavaré 1986, Rodriguez et al. 1990)
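To make the rate classification strings concrete, here is a minimal Python sketch that expands a PAUP*-style six-letter code into a time-reversible rate matrix. It assumes the six positions follow the order AC, AG, AT, CG, CT, GT. The helper names `build_q` and `count_free_rates`, the base frequencies, and the rate values are hypothetical choices for illustration, not part of PAUP* or PAML.

```python
# Assumed order of the six exchangeabilities in a PAUP*-style code.
PAIRS = [("A", "C"), ("A", "G"), ("A", "T"), ("C", "G"), ("C", "T"), ("G", "T")]
BASES = "ACGT"


def count_free_rates(code):
    """Free relative rates: distinct rate classes minus one reference class."""
    return len(set(code)) - 1


def build_q(code, rates, freqs):
    """Build a time-reversible rate matrix Q as a nested dict.

    code  -- six-letter classification such as "abaaba" (K80/HKY pattern)
    rates -- dict mapping each letter used in `code` to a relative rate
    freqs -- dict mapping each base to its equilibrium frequency
    """
    q = {x: {y: 0.0 for y in BASES} for x in BASES}
    for (x, y), letter in zip(PAIRS, code):
        # Off-diagonal entry: exchangeability times frequency of the target base.
        q[x][y] = rates[letter] * freqs[y]
        q[y][x] = rates[letter] * freqs[x]
    for x in BASES:
        # Each row of a rate matrix sums to zero.
        q[x][x] = -sum(q[x][y] for y in BASES if y != x)
    # Rescale so the expected number of substitutions per unit time is 1.
    mu = -sum(freqs[x] * q[x][x] for x in BASES)
    return {x: {y: q[x][y] / mu for y in BASES} for x in BASES}


# Example: HKY-like matrix with transitions twice as fast as transversions.
freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
q_hky = build_q("abaaba", {"a": 1.0, "b": 2.0}, freqs)
```

Passing equal frequencies of 0.25 with the same code gives the K80 pattern, and "aaaaaa" gives JC or F81 depending on the frequencies. This illustrates how the models above differ only in which rates and frequencies are constrained to be equal.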
In addition to models describing the rates of change from one nucleotide to another, there are models to describe rate variation among sites in a sequence. The following are the two most commonly used models.
gamma distribution (G): gamma distributed rate variation among sites
proportion of invariable sites (I): extent of static, unchanging sites in a dataset
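As a rough illustration of how these two components combine, the sketch below builds a set of site-rate categories for a +I+G model. It is not how any particular program computes them. The gamma categories are approximated by simulation rather than by the analytical discrete-gamma method, and the function names, the default of four categories, and the convention of rescaling variable sites so the overall mean rate stays at 1 are all assumptions made for this example.

```python
import random


def discrete_gamma_rates(alpha, k=4, draws=100_000, seed=1):
    """Approximate k equal-probability gamma rate categories by simulation.

    Rates are drawn from Gamma(shape=alpha, scale=1/alpha), which has mean 1,
    sorted, split into k equal-sized bins, and summarized by each bin's mean.
    """
    rng = random.Random(seed)
    sample = sorted(rng.gammavariate(alpha, 1.0 / alpha) for _ in range(draws))
    size = draws // k
    means = [sum(sample[i * size:(i + 1) * size]) / size for i in range(k)]
    overall = sum(means) / k
    # Rescale so the category rates average exactly 1.
    return [m / overall for m in means]


def site_rate_mixture(alpha, p_inv, k=4):
    """Return (rate, weight) pairs for a +I+G mixture; requires p_inv < 1.

    A fraction p_inv of sites has rate 0; the remaining sites are split
    evenly over k gamma categories.
    """
    gamma_rates = discrete_gamma_rates(alpha, k)
    # Assumed convention: speed up variable sites so the mean rate stays 1.
    scale = 1.0 / (1.0 - p_inv)
    mixture = [(0.0, p_inv)]
    mixture += [(r * scale, (1.0 - p_inv) / k) for r in gamma_rates]
    return mixture
```

A small shape parameter such as `alpha=0.5` produces strongly unequal categories (most sites slow, a few fast), whereas a large `alpha` makes all category rates approach 1, which corresponds to little rate variation among sites.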