IX.4.5 The excessively wide confidence intervals in the dating of cladogenetic events are the major drawback of the molecular clock
It is possible to calculate the most probable time that has elapsed since a certain event in cladogenesis, generally since the divergence of two taxa, on the basis of suitably selected molecular data. However, the fixation of mutations during evolution is a stochastic process, so all time estimates based on the numbers of substitutions are statistical estimates and are accompanied by a relatively large statistical error. The mathematical models that are used to calculate the most probable time intervals also allow us to quantify the accuracy of the given estimate. This accuracy is generally expressed in terms of confidence intervals, i.e. the length of the time interval that encompasses the required value with, e.g., 95% probability. Unfortunately, these confidence intervals are generally rather broad when cladogenetic events are dated using a molecular clock.

In general, the more data that are available (for example, the longer the DNA section sequenced and included in the analysis) and the more recent the evolutionary event that we wish to date using the molecular clock, the shorter the confidence interval obtained and the more precise the dating of the cladogenetic event. If the dating is based on sections of genes that contain a large number of substitutions, e.g. pseudogene sections, more differences are found amongst the studied species and thus more data are obtained. This is advantageous from the perspective of the precision of the estimate. On the other hand, the estimate of the number of substitutions that actually occurred in the given section, derived from the number of differences currently existing in the sequences of species of the studied taxon, becomes less precise. If the number of differences between the analyzed sequences is too large, a great many substitutions must have occurred repeatedly at the same position (Fig. IX.13). In this case, it is difficult or even impossible to precisely estimate the actual number of substitutions. The optimal approach from the perspective of the precision of dating events in cladogenesis using the molecular clock consists in the use of data obtained by sequencing very long sections of relatively conservative genes. However, this approach is very demanding in terms of both cost and time and is thus currently rarely used in practice.

Fig. IX.13. Substitution saturation. The graph indicates that transitions accumulate more rapidly than transversions in the gene for the mitochondrial protein COII. However, after a certain time, they begin to occur repeatedly at the same sites, so that the dependence between the number of differences between two sequences (ordinate) and the estimated time of divergence of the relevant species (abscissa) disappears. In contrast, transversions accumulate in the relevant gene much more slowly, but twice as many different transversions as transitions are possible. Consequently, during the entire studied period of evolution of bovids, the number of transversions increased linearly with time. Data according to Janecek et al. (1996), modified according to Page and Holmes (2001).
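The relationship between sequence length, degree of divergence and the width of the confidence interval can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions, not the method used for the data in Fig. IX.13: it assumes the simple Jukes-Cantor (1969) substitution model, which does not distinguish transitions from transversions, a strict molecular clock, a normal approximation for the 95% interval, and a purely hypothetical substitution rate of 1e-9 substitutions per site per year. The function names are likewise invented for this example.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor corrected distance (substitutions per site).

    p is the observed proportion of sites that differ between two sequences.
    The correction is undefined for p >= 0.75, i.e. for saturated sequences.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75); the sequences are saturated")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc69_variance(p, n_sites):
    """Large-sample variance of the Jukes-Cantor distance for n_sites sites."""
    return p * (1.0 - p) / (n_sites * (1.0 - 4.0 * p / 3.0) ** 2)


def divergence_time_ci(p, n_sites, rate, z=1.96):
    """Return (t, lower, upper) for the time of divergence in years.

    Assumes a strict clock, d = 2 * rate * t, where rate is the (hypothetical)
    number of substitutions per site per year in each lineage. The interval is
    a normal approximation and reflects sampling error of the distance only.
    """
    d = jc69_distance(p)
    se = math.sqrt(jc69_variance(p, n_sites))
    t = d / (2.0 * rate)
    half_width = z * se / (2.0 * rate)
    return t, max(0.0, t - half_width), t + half_width


if __name__ == "__main__":
    RATE = 1e-9  # hypothetical value chosen only for illustration
    for p, n_sites in [(0.10, 500), (0.10, 5000), (0.70, 500), (0.70, 5000)]:
        t, lower, upper = divergence_time_ci(p, n_sites, RATE)
        print(f"p={p:.2f} sites={n_sites:5d} "
              f"t={t / 1e6:8.1f} My  95% CI=({lower / 1e6:.1f}, {upper / 1e6:.1f}) My")
```

Under these assumptions, a tenfold increase in the number of sites narrows the interval by a factor of about the square root of ten, whereas sequences approaching saturation (p close to 0.75) give a much wider interval for the same number of sites. The interval covers only the sampling error of the distance; uncertainty in the rate itself and in calibration would widen it further.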