Stepwise mutation model


The stepwise mutation model (SMM) is a mathematical theory, developed by Motoo Kimura and Tomoko Ohta, that allows for investigation of the equilibrium distribution of allelic frequencies in a finite population where neutral alleles are produced in stepwise fashion.

Description

The original model assumes that each mutation changes the state of an allele by a single step: in repetitive regions of the genome, an allele gains or loses one repeat unit at a fixed rate, so allele states can be expressed as integers. The model also assumes random mating and that all alleles at a locus are selectively equivalent. The SMM is distinguished from the Kimura-Crow model, also known as the infinite alleles model, by its behavior as the population size increases to infinity while the product of the effective population size (Ne) and the mutation rate is held fixed: under the SMM, the mean number of different alleles in the population rapidly reaches a plateau, at which point its value is almost the same as the effective number of alleles.
Differences in the length of "simple sequence repeats" (SSRs) between individuals can thus be used to construct phylogenies or determine genetic distance between groups of individuals. For example, more genetically distant individuals would be expected to show larger differences in SSR size than more closely related individuals. Given its underlying assumptions, the SMM has been widely adopted for use with microsatellite markers, which contain repeat regions, are codominant, and have high rates of mutation.
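The single-step assumption and the contrast with the infinite alleles model can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from the original papers: the function names, the `mu` parameter, and the equal probability of gaining or losing a step are assumptions made for the example. The two closed-form functions use the commonly cited equilibrium approximations for the effective number of alleles in a diploid population, sqrt(1 + 8·Ne·μ) under the SMM and 1 + 4·Ne·μ under the infinite alleles model.

```python
import math
import random


def stepwise_mutate(allele, mu, rng):
    """Return the allele state after one generation under single-step mutation.

    allele: integer allele state (for example, a repeat count)
    mu:     per-generation mutation probability (illustrative parameter)
    rng:    a random.Random instance, passed in so runs are reproducible
    """
    if rng.random() < mu:
        # A mutation moves the allele exactly one step up or down.
        # Equal probability in each direction is an assumption of this sketch.
        return allele + rng.choice((-1, 1))
    return allele


def effective_alleles_smm(ne, mu):
    """Equilibrium effective number of alleles under the SMM."""
    return math.sqrt(1.0 + 8.0 * ne * mu)


def effective_alleles_iam(ne, mu):
    """Equilibrium effective number of alleles under the infinite alleles model."""
    return 1.0 + 4.0 * ne * mu


if __name__ == "__main__":
    rng = random.Random(1)
    state = 10
    for _ in range(1000):
        state = stepwise_mutate(state, 0.01, rng)
    print("allele state after 1000 generations:", state)

    # For the same Ne and mu, the SMM predicts fewer effective alleles.
    for ne in (100, 1000, 10000):
        print(ne, effective_alleles_smm(ne, 1e-3), effective_alleles_iam(ne, 1e-3))
```

Because the SMM value grows with the square root of Ne·μ rather than linearly, the gap between the two models widens as Ne·μ increases.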
The original SMM has been modified in multiple ways, including:
  1. taking into account the upper size limit of most microsatellites
  2. factoring in the tendency of large alleles to show higher rates of mutation than small alleles
  3. including variants in which mutations are split between point mutations that disrupt stretches of repeats and the addition or removal of repeat units. This last modification provides an explanation for why microsatellites do not evolve into enormous arrays of unlimited size.
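The first two modifications can be sketched as small changes to a single mutation step. The code below is a hypothetical illustration only: the linear form of the length effect, the clamping at the size limits, and every parameter name and default value (`min_repeats`, `max_repeats`, `length_effect`) are assumptions chosen for the example, not values taken from any published variant of the model. The third modification (interrupting point mutations) is not modeled here.

```python
import random


def length_dependent_rate(base_mu, allele, length_effect):
    """Mutation probability that increases linearly with repeat count.

    The linear form is an assumption of this sketch; the result is capped at 1.
    """
    return min(1.0, base_mu * (1.0 + length_effect * allele))


def modified_step(allele, base_mu, rng, min_repeats=1, max_repeats=50,
                  length_effect=0.0):
    """Apply one generation of bounded, length-dependent stepwise mutation."""
    mu = length_dependent_rate(base_mu, allele, length_effect)
    if rng.random() >= mu:
        return allele  # no mutation this generation
    step = rng.choice((-1, 1))
    # Clamp to the allowed range so alleles cannot grow past the size limit.
    return max(min_repeats, min(max_repeats, allele + step))


if __name__ == "__main__":
    rng = random.Random(42)
    allele = 45
    for _ in range(10000):
        allele = modified_step(allele, 0.01, rng, length_effect=0.05)
    print("repeat count after 10000 generations:", allele)
```

Clamping is only one way to impose a size limit; a reflecting boundary or a downward mutation bias for long alleles would be equally reasonable choices for a sketch like this.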
A number of summary statistics can be used to estimate genetic differentiation under the SMM. These include the number of alleles, observed and expected heterozygosity, and allele frequencies. Analyses under the SMM also take into account the frequency of mismatches between microsatellite alleles, meaning the number of times there are no mismatches, single mismatches, two mismatches, and so on. Variance in allele size is used to make inferences about the genetic distance between individuals or populations. By comparing summary statistics at different levels of organization, it is possible to make inferences about population histories. For example, the variance in allele size within a subpopulation can be compared with the variance within the total population to infer something about population history.
Construction of phylogenies under the SMM is, however, complicated by the fact that an allele can either gain or lose a repeat unit, so alleles that are identical in size are not necessarily identical by descent. Therefore the SMM cannot be used to determine the exact number of mutational events between two individuals. For example, if the alleles of individuals A and B originally differed by two repeats, A's allele might have gained a single repeat whereas B's allele might have lost a single repeat, resulting in both individuals carrying an identical number of microsatellite repeats.
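As a rough illustration of the statistics listed above, the following sketch computes them for one locus from a list of allele sizes expressed as repeat counts. The function name and the returned keys are hypothetical. Two assumptions are made for the example: a "mismatch" is taken to be the absolute difference in repeat number between a pair of sampled alleles, and expected heterozygosity is computed as one minus the sum of squared allele frequencies, without a small-sample correction. Observed heterozygosity needs diploid genotypes and is left out.

```python
from collections import Counter
from itertools import combinations
from statistics import variance


def summarize_alleles(sizes):
    """Summary statistics for one locus from sampled allele sizes.

    sizes: list of integer repeat counts, one per sampled gene copy
           (at least two values are needed for the variance).
    """
    n = len(sizes)
    counts = Counter(sizes)
    freqs = {allele: c / n for allele, c in counts.items()}
    # Expected heterozygosity: chance that two random alleles differ.
    expected_het = 1.0 - sum(p * p for p in freqs.values())
    # Mismatch distribution: repeat-number difference for every pair.
    mismatches = Counter(abs(a - b) for a, b in combinations(sizes, 2))
    return {
        "n_alleles": len(counts),
        "frequencies": freqs,
        "expected_heterozygosity": expected_het,
        "size_variance": variance(sizes),  # sample variance of allele size
        "mismatch_counts": dict(mismatches),
    }


if __name__ == "__main__":
    subpop = [10, 10, 11, 12]
    total = [10, 10, 11, 12, 15, 16, 16, 17]
    print(summarize_alleles(subpop))
    # Comparing variance within a subpopulation to the total population.
    print(summarize_alleles(subpop)["size_variance"],
          summarize_alleles(total)["size_variance"])
```

Running the function separately on a subpopulation and on the pooled sample gives the two variances that the comparison across levels of organization relies on.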
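The point about size identity without identity by descent can be made concrete with a toy example. The sketch below is illustrative only: the helper name, the ancestral size, and the step sequences are all made up. It applies a series of single-step mutations to a shared ancestral allele along two lineages and shows that the final sizes can match even though the lineages experienced different numbers of mutational events.

```python
def apply_steps(start, steps):
    """Apply single-step mutations and return (final_size, number_of_events)."""
    size = start
    for step in steps:
        if step not in (-1, 1):
            raise ValueError("each step must be +1 or -1")
        size += step
    return size, len(steps)


if __name__ == "__main__":
    ancestor = 12  # hypothetical ancestral repeat count
    size_a, events_a = apply_steps(ancestor, [+1, +1, -1])
    size_b, events_b = apply_steps(ancestor, [+1])
    # Both alleles end at 13 repeats, so the observed size difference is 0,
    # yet 4 mutational events separate the two lineages.
    print(size_a, size_b, events_a + events_b)
```

The observed difference in allele size is therefore only a lower bound on the number of mutations separating two alleles.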
Some important caveats and limitations to consider when choosing molecular markers for estimating the relatedness of individuals or distinguishing between populations include the following:
  1. Each marker type has its own limitations, and the number of markers used can heavily influence analytical results.
  2. Molecular markers provide only a “sample” of the genetic information with which to compare individuals or populations, and estimates based on them can differ from actual genetic differentiation. For example, two individuals may be identical at a given locus, having inherited the same mutation from their common ancestor, but could differ at other loci that were not observed.