CALIBRATING THE CLOCK: USING STOCHASTIC PROCESSES TO MEASURE THE RATE OF EVOLUTION

rate $(m-1)/2$ to each of the $m$ genes and wait for one to ring. The probability that a mutation clock rings first is $\theta/(\theta+m-1)$, and, given that a mutation occurs first, the gene that mutates is chosen uniformly at random. Similarly, the probability that a split occurs first is $(m-1)/(\theta+m-1)$, with the splitting gene being chosen at random from the $m$ possibilities.

The only wrinkle left is to describe the rule that tells us when to stop generating splits or mutations. In order to have the right distribution for the numbers of mutations when the sample has $n$ ancestors, we must run until the first split after $n$ genes are present, discard that last observation, and then stop. This simple scheme can be used effectively to simulate observations from extremely complex mutation mechanisms using only Bernoulli random variables, and it provides a way of generating and storing the effects of each of the mutations. Some examples are given in the following sections.

Bottom-up

The second scheme, which proves very useful for deriving recurrence relations for the distribution of allele configurations, is the "bottom-up" method. In this case, the idea is to use the exponential alarm clocks from the bottom of the tree (that is, beginning at the sample) and run up to the common ancestor at the top. If we look up from the sample of size $n$ toward the root, the probability that we will encounter a mutation before a coalescence is $\theta/(\theta+n-1)$, and the probability that a coalescence will occur first is $(n-1)/(\theta+n-1)$. The probability distribution of the configuration at the tips may then be related to the distribution of the configuration at the mutation or coalescence time.

To illustrate how this works, consider the infinitely-many-alleles mutation structure. Suppose that the current configuration consists of counts $a = (a_1, a_2, \ldots, a_n)$ with $a_n = 0$, and let $P_n(a)$ denote the probability of this configuration. If the first event in the past is a coalescence, then the configuration of $n-1$ genes must have been $(a_1, \ldots, a_{j-1}, a_j + 1, a_{j+1} - 1, a_{j+2}, \ldots, a_{n-1})$
for some $j = 1, 2, \ldots, n-2$, and a gene in class $j$ must be chosen to have an offspring. Since this last event has probability $j(a_j+1)/(n-1)$, the contribution to $P_n(a)$ from such terms is

$$\frac{n-1}{\theta+n-1} \sum_{j=1}^{n-2} \frac{j(a_j+1)}{n-1}\, P_{n-1}(a_1, \ldots, a_{j-1}, a_j+1, a_{j+1}-1, a_{j+2}, \ldots, a_{n-1}). \tag{5.9}$$

If, on the other hand, the first event in the past was a mutation, then the configuration must have been either $(a_1 - 1, a_2, \ldots, a_{j-2}, a_{j-1} - 1, a_j + 1, a_{j+1}, \ldots, a_{n-1}, 0)$ and the mutation occurred to a gene in a $j$-class, $j = 3, 4, \ldots, n-1$ (probability $j(a_j+1)/n$), or $(a_1 - 2, a_2 + 1, a_3, \ldots, a_{n-1}, 0)$ and the mutation occurred to a gene in the 2-class (probability $2(a_2+1)/n$), or $(a_1, \ldots, a_{n-1}, 0)$ and the mutation occurred to a singleton gene (probability $a_1/n$). Finally, the configuration could have been $(a_1 - 1, a_2, \ldots, a_{n-2}, a_{n-1} - 1, 1)$ and the mutation occurred in the $n$-class (probability 1). Combining all these possibilities and adding the term in (5.9) gives
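The case analysis above can be turned into a small recursive computation. The following Python sketch is only an illustration assembled from the cases listed in the text, not the chapter's own algorithm. It assumes that a configuration is stored as a tuple `a` of length $n$ with `a[j-1]` equal to $a_j$, that $P_1 = 1$ for a single gene, and that the singleton-mutation case (which refers back to $P_n(a)$ itself) is handled by moving that term to the left-hand side and dividing. The function name `config_probability` and its helper `P` are hypothetical.

```python
from functools import lru_cache


def config_probability(a, theta):
    """Probability P_n(a) of an allele configuration under the
    infinitely-many-alleles model, via the bottom-up case analysis.

    a[j-1] is the number of alleles represented by exactly j genes,
    so len(a) must equal n = sum(j * a_j).  (Representation assumed here.)
    """
    a = tuple(a)
    if any(x < 0 for x in a) or sum(j * x for j, x in enumerate(a, 1)) != len(a):
        raise ValueError("a must have length n = sum(j * a_j), entries >= 0")

    @lru_cache(maxsize=None)
    def P(a):
        n = len(a)
        if n == 1:                      # a single gene: only one configuration
            return 1.0
        total = theta + n - 1

        # First event in the past is a coalescence: the n-1 genes had
        # a_j + 1 alleles in class j and a_{j+1} - 1 in class j+1.
        coal = 0.0
        for j in range(1, n):
            b = list(a)
            b[j - 1] += 1
            b[j] -= 1
            if b[j] < 0:
                continue                # this ancestral configuration is impossible
            # b[-1] is always 0 here, so dropping it gives a size n-1 configuration
            coal += j * b[j - 1] / (n - 1) * P(tuple(b[:-1]))

        # First event in the past is a mutation: it created one of the
        # current singletons, so it requires a_1 >= 1.
        mut = 0.0
        self_coeff = 0.0
        if a[0] >= 1:
            # Mutation hit a singleton: configuration unchanged (self term).
            self_coeff = (theta / total) * a[0] / n
            # Mutation hit a gene in a j-class, j = 2, ..., n.
            for j in range(2, n + 1):
                b = list(a)
                b[0] -= 1               # the new singleton did not exist yet
                b[j - 2] -= 1           # the (j-1)-class was a j-class
                b[j - 1] += 1
                if b[0] < 0 or b[j - 2] < 0:
                    continue
                mut += j * b[j - 1] / n * P(tuple(b))

        rhs = (theta / total) * mut + ((n - 1) / total) * coal
        return rhs / (1.0 - self_coeff)  # solve for P_n(a) after moving the self term

    return P(a)
```

For example, `config_probability((1, 1, 0), 1.0)` evaluates the configuration of three genes with one singleton and one pair at $\theta = 1$ and should come out close to 0.5. The recursion terminates because every mutation step reduces the number of distinct alleles and every coalescence step reduces $n$; memoization with `lru_cache` keeps repeated sub-configurations from being recomputed.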