On Tue, 16 Jun 1998 17:05:07 -0400, Brian D Harper wrote:
[...]
BH>First note that as the file size increases the %
>compressibility also increases and approaches
>100%, though it will never get to 100% of course.
>
>The important point though is that the information
>content is given not by % compressibility but
>rather by the size of the compressed file. Even
>though the compressibility is approaching 100 %,
>the information content continues to increase.
>(see second two columns)
[...]
My answer to this was as follows:
------------------------------------------------------------------------------
On Tue, 23 Jun 1998 22:18:48 +0800, Stephen Jones wrote:
>
>Thanks again. I am now convinced that the "information" you
>are talking about in Information Theory is not necessarily the
>same as what the layman and biologists are talking about,
>namely meaning.
------------------------------------------------------------------------------
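As an aside, Brian's distinction between % compressibility and
compressed file size can be checked directly. Below is a sketch
using Python's zlib, which is an assumption on my part: WinZip uses
the same DEFLATE algorithm, but its exact byte counts will differ.
As a phrase is repeated more and more times, the % compressibility
climbs toward 100% while the compressed size itself keeps growing.

```python
import zlib

# Repeat one 28-letter phrase N times and compress it.
# zlib stands in for WinZip here (both are DEFLATE-based, but the
# byte counts below are zlib's, not WinZip's).
phrase = ("METHINKS IT IS LIKE A WEASEL\n").encode()
for n in (10, 100, 1000, 10000):
    data = phrase * n
    packed = zlib.compress(data, 9)
    ratio = 100.0 * (1 - len(packed) / len(data))
    # Compressibility approaches (but never reaches) 100%, yet the
    # compressed size -- the information content -- keeps increasing.
    print(f"{len(data):6d} bytes -> {len(packed):4d} bytes  ({ratio:.3f}%)")
```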
Now as a follow up, in support of my claim above, that the
"information" of Information Theory is not necessarily the same as
the "information" of Biology, consider that Dawkins in his "The Blind
Watchmaker," has a model which claims to simulate Neo-Darwinian
random mutation and cumulative natural selection, by making random
changes to an initial 28-letter gibberish phrase and selecting those
`mutations' which match a target phrase. This should be a good test
of whether `information' (in the Information Theory sense), can be
built up by random mutations and cumulative selection.
Here's how Dawkins sets it up:
"So much for single-step selection of random variation. What about
cumulative selection; how much more effective should this be? Very
very much more effective, perhaps more so than we at first realize,
although it is almost obvious when we reflect further. We again use
our computer monkey, but with a crucial difference in its program. It
again begins by choosing a random sequence of 28 letters, just as
before: WDLMNLT DTJBKWIRZREZLMQCO P* It now 'breeds
from' this random phrase. It duplicates it repeatedly, but with a
certain chance of random error - 'mutation' - in the copying. The
computer examines the mutant nonsense phrases, the 'progeny' of the
original phrase, and chooses the one which, however slightly, most
resembles the target phrase, METHINKS IT IS LIKE A WEASEL.
In this instance the winning phrase of the next 'generation' happened
to be: WDLTMNLT DTJBSWIRZREZLMQCO P Not an obvious
improvement! But the procedure is repeated, again mutant 'progeny'
are 'bred from' the phrase, and a new 'winner' is chosen. This goes on,
generation after generation. After 10 generations, the phrase chosen
for 'breeding' was: MDLDMNLS ITJISWHRZREZ MECS P After
20 generations it was: MELDINLS IT ISWPRKE Z WECSEL By
now, the eye of faith fancies that it can see a resemblance to the
target phrase. By 30 generations there can be no doubt: METHINGS
IT ISWLIKE B WECSEL Generation 40 takes us to within one letter
of the target: METHINKS IT IS LIKE I WEASEL And the target
was finally reached in generation 43."
[* Note: This is a misprint (deleterious mutation? :-). Should be
WDLDMNLT DTJBKWIRZREZLMQCO P as originally given by
Dawkins at the start of the page -- SEJ]
(Dawkins R., "The Blind Watchmaker," 1991, pp. 47-48).
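For concreteness, the procedure Dawkins describes can be sketched in
a few lines of Python. The population size (100 'progeny' per
generation) and the 5% per-letter mutation rate are my assumptions;
Dawkins does not publish his exact parameters, so generation counts
will differ from his 43.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "

def score(phrase):
    # Number of positions that already match the target.
    return sum(a == b for a, b in zip(phrase, TARGET))

def mutate(phrase, rate=0.05):
    # Copy the phrase, each letter having a small chance of error.
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in phrase)

def weasel(seed=None):
    random.seed(seed)
    # Start from a random 28-letter gibberish phrase, as Dawkins does.
    phrase = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
    generation = 0
    while phrase != TARGET:
        generation += 1
        # 'Breed' 100 mutant copies and keep the one closest to the target.
        phrase = max((mutate(phrase) for _ in range(100)), key=score)
    return generation

print("reached the target in", weasel(1), "generations")
```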
I made a text file for each of Dawkins' 28-letter phrases, each
file containing 100 lines of that phrase. I then compressed them
with WinZip (6.3 SR-1 under Windows 95c). Here are the results:
Gen Phrase
1 WDLDMNLT DTJBKWIRZREZLMQCO P
2 WDLTMNLT DTJBSWIRZREZLMQCO P
10 MDLDMNLS ITJISWHRZREZ MECS P
20 MELDINLS IT ISWPRKE Z WECSEL
30 METHINGS IT ISWLIKE B WECSEL
40 METHINKS IT IS LIKE I WEASEL
43 METHINKS IT IS LIKE A WEASEL
          original    compressed   compress-
          file size   file size    ibility
     Gen  (bytes)     (bytes)      (%)
       1    2,998        179       94.029
       2    2,998        179       94.029
      10    2,998        178       94.063
      20    2,998        177       94.096
      30    2,998        177       94.096
      40    2,998        178       94.063
      43    2,998        178       94.063
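Anyone can rerun this experiment. Below is a sketch using Python's
zlib in place of WinZip -- an assumption on my part, since both are
DEFLATE-based but zlib's byte counts will not match WinZip's
2,998/179 figures exactly; the trend is what matters.

```python
import zlib

# Dawkins' phrase at each quoted generation.
phrases = {
    1:  "WDLDMNLT DTJBKWIRZREZLMQCO P",
    2:  "WDLTMNLT DTJBSWIRZREZLMQCO P",
    10: "MDLDMNLS ITJISWHRZREZ MECS P",
    20: "MELDINLS IT ISWPRKE Z WECSEL",
    30: "METHINGS IT ISWLIKE B WECSEL",
    40: "METHINKS IT IS LIKE I WEASEL",
    43: "METHINKS IT IS LIKE A WEASEL",
}

for gen, phrase in phrases.items():
    # 100 lines of the phrase, as in the WinZip test above.
    data = ((phrase + "\n") * 100).encode()
    packed = zlib.compress(data, 9)
    ratio = 100.0 * (1 - len(packed) / len(data))
    print(f"gen {gen:2d}: {len(data)} -> {len(packed)} bytes ({ratio:.3f}%)")
```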
The graph of the above results looks like this:
Compressed
file size
(bytes)

 179  x    x
 178            x              x    x
 177                 x    x
      --------------------------------
      1    2    10   20   30   40   43
                 Generations
Note that the file size starts off with the longest compressed file
(most information, according to Brian). So far so good. The phrase
itself is meaningless: WDLDMNLT DTJBKWIRZREZLMQCO P.
But then, after 20 generations of random mutation and cumulative
natural selection, as the phrase steadily gains meaning on its way
to the target phrase METHINKS IT IS LIKE A WEASEL, the file size
declines (i.e. information is lost).
After another 10 generations of mutation and selection, in which
the phrase continues to approach the target, the file size (and
hence the information) neither increases nor decreases. This is
despite the phrase becoming more meaningful: METHINGS IT
ISWLIKE B WECSEL.
But then, after 40 generations of random mutation and cumulative
selection, with the phrase just one letter short of the target phrase, ie.
METHINKS IT IS LIKE I WEASEL, the file size increases only
back to where it was after 10 generations!
Finally, when the last right letter is selected, the compressed file size
(and hence information), is less than where it started! Yet the
meaning has become greatest, having reached the target phrase:
METHINKS IT IS LIKE A WEASEL.
Just to double-check, I multiplied Dawkins' phrases ten-fold, so
that each file contained 1,000 lines of its phrase. The results
using WinZip were as follows:
          original    compressed   compress-
          file size   file size    ibility
     Gen  (bytes)     (bytes)      (%)
       1   29,998        247       99.177
       2   29,998        247       99.177
      10   29,998        245       99.183
      20   29,998        245       99.183
      30   29,998        244       99.187
      40   29,998        243       99.190
      43   29,998        244       99.187
The new graph looks like this:
Compressed
file size
(bytes)

 247  x    x
 246
 245            x    x
 244                      x         x
 243                           x
      --------------------------------
      1    2    10   20   30   40   43
                 Generations
This looks even worse! If anything, `information' in the
Information Theory sense (compressed file length) is either
inversely related or simply unrelated to the Neo-Darwinian
`information' which Dawkins claims is built up by random mutation
and cumulative natural selection!
Therefore, if you are right about information content being positively
related to compressed file size, the only conclusion that can be drawn
from this is that if random mutations *can* add `information' (in an
Information Theory sense), then it is not the same `information' (in a
Neo-Darwinian sense), that Dawkins purports to be modelling in his
METHINKS IT IS LIKE A WEASEL simulation.
Or, to put it another way, if you are right and Dawkins' METHINKS
IT IS LIKE A WEASEL model is a faithful representation of Neo-
Darwinian evolution, then Neo-Darwinian random mutation and
natural selection actually leads to information loss!
This in fact is the very point that Spetner makes:
"Information theory, which was introduced as a discipline about half a
century ago by Claude Shannon, has thrown new light on this
problem. It turns out that random variation cannot lead to large
evolutionary changes. The information required for large-scale
evolution cannot come from random variations. There are cases in
which random mutations do lead to evolution on a small scale. But it
turns out that, in these instances, no information is added to the
organism. Most often, information is lost." (Spetner L.M., "Not by
Chance!: Shattering the Modern Theory of Evolution," Judaica Press:
New York, 1997 revised, p.vii)
Therefore it seems that Glenn cannot simultaneously claim support
from a) yourself, Brian; b) Dawkins; and c) Information Theory;
because one or more of these must be wrong, unless they are using
the word `information' in different senses.
Steve
"Evolution is the greatest engine of atheism ever invented."
--- Dr. William Provine, Professor of History and Biology, Cornell University.
http://fp.bio.utk.edu/darwin/1998/slides_view/Slide_7.html
--------------------------------------------------------------------
Stephen E (Steve) Jones ,--_|\ sejones@ibm.net
3 Hawker Avenue / Oz \ Steve.Jones@health.wa.gov.au
Warwick 6024 ->*_,--\_/ Phone +61 8 9448 7439
Perth, West Australia v "Test everything." (1Thess 5:21)
--------------------------------------------------------------------