I've made a lot of progress on this subject. I think it's your turn to make
a little:-)
You wrote, quoting me,
>>"So it takes about N = 1/p generations to have a large number of different
>>mutants running around. That's 10,000 generations of humans, or 200,000
>>years. Your estimate of 1.2 million years is a little too high for the
>>following reason. You assume it takes 10,000 generations 1/p to get the first
>>mutation in, and then another 10,000 generations to get the second one in,
>>etc.. But, in fact you have waited until a1/a0=1, i.e. 1/2 of the population
>>has suffered a mutation until you allow a second mutation to start"
Then you replied,
>I am very curious about what more you have to say. But the only problem I
>have is that if I understand your equation here, you are calculating the
>frequency of people with one mutation in the population. But only people
>with the first mutation can get the second mutation in addition to the first
>which may be necessary for the divergence of the gene. The portion of the
>population without any mutation can not diverge further.
Oops! I think I pulled a switch on you. After getting you to clarify your
use of the word *generation*, by which you meant parent-child, I have
proceeded to use the word to mean all the children of all the parents. I'm
not calculating what happens in a *typical* genealogy (that's what you did),
but the relative frequency of the mutations, including multiple mutations,
in the whole population. That's what my Ak is. (If you care, the problem is
a standard Poisson process, and my result is standard Poisson statistics).
Let me give some specific numbers. After 10,000 generations, i.e. 200,000
years (N=1/p), for every 720 copies with no substitutions, A0 = 720, we have:
720 copies with 1 substitution, A1 = 720
360 copies with 2 substitutions, A2 = 360
120 copies with 3 substitutions, A3 = 120
30 copies with 4 substitutions, A4 = 30
6 copies with 5 substitutions, A5 = 6
1 copy with 6 substitutions, A6 = 1
Notice the factorial progression. I could end up with a small number of
copies that contain a large number of substitutions, because even though p =
10^-4, this is only a probability; some genealogies contain substitutions
more often than 1 every 10,000 generations, some less. A rather large number
of substitutions take place before the 10,000th generation and some of these
will go on to give rise to copies containing multiple substitutions before
we reach 10,000 generations. Admittedly, those having a large number of
substitutions are rarer than those with a small number or zero.
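
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Poisson calculation. It assumes the expected number of substitutions per copy after N generations is simply N*p, and it scales everything to 720 copies with no substitutions. The function names and the `baseline` parameter are my own illustrative choices, not anything from Glenn's post or Gish's numbers.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k substitutions when the expected number is lam."""
    return exp(-lam) * lam**k / factorial(k)


def relative_counts(generations, p, baseline=720, max_k=6):
    """Copies carrying k substitutions for every `baseline` copies carrying none.

    Assumes substitutions accumulate as a Poisson process with mean
    generations * p per copy (an assumption of this sketch).
    """
    lam = generations * p
    p0 = poisson_pmf(0, lam)
    return [baseline * poisson_pmf(k, lam) / p0 for k in range(max_k + 1)]


# N = 1/p: 10,000 generations at p = 10^-4 per generation
counts = relative_counts(generations=10_000, p=1e-4)
for k, a_k in enumerate(counts):
    print(f"A{k} = {a_k:.0f}")
```

When N*p = 1 the ratio Ak/A0 reduces to 1/k!, which is where the factorial progression comes from. Changing `p` or `generations` shows how quickly the multiple-substitution copies become common or rare.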
Glenn, I want you to agree to this math (or at least its reasonableness).
What more I have to say depends on its interpretation.
If the mutation rate were, in fact, a factor of 10 higher than Gish's
number, then 1/p would be only 1,000 generations, or 20,000 years, and
there's no problem with a bottleneck 20,000 years ago. Someone
(I forget who) pointed out that we're dealing with a complicated system with
an abnormally high number of alleles. I need hard numbers on just how many
substitutions we really observe, and what are their relative frequencies in
the population, and a better estimate of p than Gish's order-of-magnitude
guess before I can accept your argument as being strong.
Jim
Jim Blake
Associate Professor
Department of Electrical Engineering
Texas A&M University
College Station, TX 77843