[...]
>
>SJ quoting Bradley, Thaxton:
>>"Information in this context means the precise determination, or
>>specification, of a sequence of letters. We said above that a
>>message represents `specified complexity.' We are now able to
>>understand what specified means. The more highly specified a thing
>>is, the fewer choices there are about fulfilling each instruction.
>>In a random situation, options are unlimited and each option is
>>equally probable.
>
>Oops.
>
I thought I might elaborate some on my oops. B&T's use of the
word random reminded me of our recent discussion of Denton's
use of the word. Actually, I could have pounded on Steve some
more at this point since in another of his posts he seemed to
be agreeing with Dawkins' statement about how bad the
equi-probable assumption is. But I restrained myself because
the assumption seems so common that I was thinking the idea must
be coming from somewhere, perhaps from some aspect of
probability theory. Also, rethinking my own thinking on this I
found I had been making the same mistake myself not so long
ago.
So, I decided to look up the definition of random in several of
my books selected at random (but not with equal probability :).
To my surprise, I found I had no textbook on probability theory,
so perhaps someone else could look for a definition from a
probability text. I do have many books on information theory
and these almost always have an introduction to probability
theory in them. In these books, random was always defined as
an unpredictable outcome which occurs according to some
probability distribution. There was no mention in the definition
of the special case where the events are equi-probable.
Next I looked in my dictionary (yes, I know, bad idea :). My
dictionary had a subsection for usage of random in statistics
with 4 definitions. The third definition was:
"Of or designating a sample drawn from a population so
that each member of the population has an equal chance
to be drawn." -- American Heritage Dictionary
Obviously, this definition agrees very well with B&T's and Denton's
use of "random".
The last place I looked was in the humongous 6-volume Encyclopedia
of Mathematics. There were 13 pages describing various uses
of random in mathematics. Needless to say, I just skimmed over
this material briefly looking for some mention of equally probable
events. A section entitled "Random Event" included the dice throwing
example I mentioned elsewhere. The 36 ordered pairs are
equi-probable but the random events for the sum are not
equi-probable. The only thing I could find close to the case
where equi-probable is part of the definition of random was
"Random Allocation - A probability scheme in which n
particles are randomly distributed over N cells. In the
simplest scheme, the particles are distributed equi-probably
and independently of one another ........."
Now I want to discuss the implications of this with respect
to probability calculations in the origins debate. I can't
recall ever seeing a probability calculation by a creationist
which did not assume equal probabilities. Sometimes the
assumption is not stated explicitly, but one can tell by the
numbers calculated that the assumption was made. I don't
want to pick only on creationists here though. Most of their
calculations (that I've seen anyway) can also be found in the
literature (the old! literature, usually :). Actually, the only
probability calculation I can remember that did not use
equal probabilities is Yockey's calculation.
Ah, I think I just figured out why this error is so common.
Most probability calculations have to do with constructing
protein sequences by chance. Related to this would be
the probability of executing a long sequence of events in
some specified order.
To illustrate, suppose we want to construct a sequence
M units long from an alphabet containing S characters.
What is the total number of possible sequences (N)
of length M? This one's easy: N = S^M. Now, what's
the probability of selecting any one of these sequences
at random? If each sequence is equally likely to be
selected, then this probability is 1/N.
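For anyone who wants to play with the numbers, here's a tiny
Python sketch of this counting argument (the values of S and M
below are just made-up illustrations, not anyone's actual model):

# A minimal sketch of the counting argument above (illustrative values only).
S = 20    # alphabet size, e.g. 20 amino acids
M = 100   # sequence length

N = S ** M            # total number of possible sequences of length M
p_uniform = 1.0 / N   # probability of any one sequence *if* all are equi-probable

print(f"N = S^M = {N:.3e}")
print(f"P(one specific sequence) = 1/N = {p_uniform:.3e}")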
I think this is the point at which one is typically thinking
about equal probability. If these sequences supposedly
form by chance then why give preference to any one of
them? And if some are more probable than others,
how could we know which ones and how could we know
whether the specific one we're interested in is one of
the more probable ones or one of the less probable ones?
Given this uncertainty, isn't the fairest thing just to
assume they're all equally probable?
So, the confusion comes from talking about two different
random processes: the process wherein individual
letters are selected from the S letters according to some
probability distribution p, and the process of selecting
one of the N possible sequences of length M.
When saying that the equi-probable assumption is bad
I've generally been talking about the assumption that the
individual letters appear with equal probability. So, this
raises an interesting question. It doesn't seem that the
probability distribution for the occurrence of the letters
has anything to do with the total number of sequences
that are possible. Isn't the total count independent of those
probabilities, and if so, wherein lies the error?
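To make the distinction concrete, here's a small Python sketch
contrasting the two processes; the letters and weights are made
up purely for illustration:

import random
from math import prod

# Sketch of the two random processes described above (illustrative values only).
letters = list("abcdefghij")               # S = 10 letters
probs   = [0.30, 0.20, 0.15, 0.10, 0.08,   # an unequal probability distribution
           0.06, 0.04, 0.03, 0.02, 0.02]
M = 5                                      # short length so the numbers stay readable

# Process 1: build one sequence letter by letter according to the distribution.
seq = random.choices(letters, weights=probs, k=M)

# Probability of getting exactly this sequence under process 1.
p_letterwise = prod(probs[letters.index(c)] for c in seq)

# Process 2: pretend all S^M sequences are equally likely.
p_uniform = 1.0 / (len(letters) ** M)

print("sequence:", "".join(seq))
print("letter-by-letter probability:", p_letterwise)
print("equi-probable assumption 1/N:", p_uniform)

Run it a few times: the letter-by-letter probability of the
particular sequence you get generally isn't 1/N at all.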
Now, Glenn regularly brings up a problem with the above
reasoning. In computing the probability as 1/N one is
assuming, in addition to equal probabilities, that one and
only one sequence will do. Yockey also considers this and
provides an estimate that there are about 10^93 functionally
equivalent cytochrome c molecules. Of course, 10^93 is
somewhat larger than 1 ;-). But this is a different error
from the equi-probable error, so I'm not going to pursue
it any further here.
Now, some may have noticed that my thinking above contains
a flaw. I went through that exercise to try to understand why the
equi-probable assumption is so common. I believe it may be
due to the apparent lack of connection between the probability
distribution for the letters and the total number of sequences.
The two random selections above are actually related, since
one does not really have a collection of all the possible
sequences of length M from which one selects one sequence.
Instead, a specific sequence is selected by adding letters
one by one according to the probability distribution.
One might wonder whether it is even possible to compute
the probability of generating a sequence for the case of
unequal probabilities. Actually, it is possible to do it using
a really amazing result called the Shannon-McMillan theorem.
Yockey likes to refer to the result with paradoxical statements
like "The number of messages in a sequence is far less than
the total number of possible messages". Above, we were
wondering what effect the probability distribution could
have on the number of messages. The answer is surprising.
It has a HUGE effect. The Shannon-McMillan theorem goes
roughly like this. The total number of possible sequences
can be divided into two groups, a low probability group and
a high probability group. The *sum* of probabilities of *all*
sequences in the low prob. group is essentially zero (in
math language, this sum is equal to epsilon where epsilon
is greater than zero but arbitrarily small, we can make it
as small as we want just not 0). One might think at this
point that only a small proportion of the sequences are
in this low prob. group. It turns out, if we have unequal
probabilities, that almost all of them are in this group!
For example, I did a little calculation for an alphabet with
10 letters and a sequence of length 100 and found that
99.99999999999999% of the possible sequences are in the
low probability group. To say it another way, the sum of
the probabilities for 99.99999999999999% of all the sequences
is essentially zero. Another interesting result is that
the probability of any member of the high probability group
is approximately the same and can be calculated easily from
the Shannon entropy H,
p = 2^(-MH)
where M is the sequence length.
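Here's a little Python sketch of that kind of calculation. The
10-letter distribution below is just one I made up for
illustration (not necessarily the one behind the percentage
quoted above), so the exact number of 9's will come out a bit
different:

from math import log2

# Rough sketch of the Shannon-McMillan ("typical set") counting,
# using an illustrative unequal distribution over 10 letters.
probs = [0.30, 0.20, 0.15, 0.10, 0.08, 0.06, 0.04, 0.03, 0.02, 0.02]
M = 100

H = -sum(p * log2(p) for p in probs)   # Shannon entropy, bits per letter

total_log2   = M * log2(len(probs))    # log2 of S^M, all possible sequences
typical_log2 = M * H                   # log2 of ~2^(MH), the high-prob. group

frac_high = 2 ** (typical_log2 - total_log2)

print(f"H = {H:.3f} bits/letter")
print(f"total sequences        ~ 2^{total_log2:.1f}")
print(f"high-probability group ~ 2^{typical_log2:.1f}")
print(f"fraction of sequences in high-probability group ~ {frac_high:.2e}")
print(f"probability of each member, 2^(-MH) ~ {2 ** (-typical_log2):.2e}")

With these made-up numbers the high-probability group is only
about 10^-15 of all possible sequences, which is just another
way of saying what the long string of 9's above says.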
The upshot of all this is that the probability calculations one
is used to seeing are greatly in error for (at least :) two reasons:
(1) they assume one and only one result will work, and (2) they
use the total number of possible sequences instead of the
number of sequences in the high probability group.
Wow, I had intended this to be just a short note. Wasn't it
Pascal who said something like: "Sorry that this letter is so
long, if I had more time I would have written a shorter one" ;-).
Brian Harper
Associate Professor
Applied Mechanics
Ohio State University