Re: ADDING OR LOSING GENETIC INFORMATION THROUGH MUTATION

Brian D Harper (bharper@postbox.acs.ohio-state.edu)
Sat, 25 Sep 1999 21:38:04 -0700

First let me express my appreciation to Lee for taking the time
to write this response and to Art for forwarding the questions
to Lee.

At 07:58 AM 9/24/99 -0700, Art wrote:
>Wesley asked:
>
>Could Art please post Spetner's information measure equation
>so that we can compare the two?
>
>From Lee Spetner:
>
>>Thank you for forwarding me the questions that have arisen about how I
>>define and measure genetic information. The presumption is that unless I
>>quantify the information in a gene, I am not entitled to say, for any
>>mutation, whether the gene gains or loses information. I reject that
>>presumption.
>>

With all due respect, I have to reject the idea that requiring a
measure before making claims about whether a quantity increases or
decreases is a mere presumption. It seems to me to be just common sense.

What I suspect at this point is that Lee is really talking here
about the quality of information rather than its quantity. That
suspicion is complicated by the fact that he later mentions Shannon's
care not to relate the quantity of information to its meaning. Well,
as I said, it is only a suspicion. Let's go on.

>>Before addressing the issue of quantifying the information in a gene, let
>>me point out that all the random mutations I discussed in my book (and by
>>extension, all known mutations whose molecular structures have been
>>examined) cannot serve as prototypes for the mutations that are supposed to
>>make up the long series of evolutionary steps claimed by neo-Darwinists to
>>have led to major evolutionary advances. They cannot serve as prototypes of
>>the mutations in the steps that are supposed to have led from a single cell
>>to an insect, from a fish to a mammal, and so on. Most of these mutations
>>are single-nucleotide substitutions that disable a control gene. Disabling
>>a gene cannot be a recipe for evolutionary advance. Although sometimes,
>>perhaps, a gene would have to be disabled in the course of evolving a new
>>enzyme, such disabling cannot represent a major portion of what has to
>>occur to achieve a new function. It cannot even represent a small fraction
>>of what must occur. Most mutations in a putative series of evolutionary
>>steps leading to a new species or a new order, class, or phylum, must add
>>to the genome the information necessary to achieve that advance. It should
>>be clear that information must be added to the genome to evolve a bacterium
>>into a human, or even into a fruit fly. One who insists that it is not
>>obvious that a human genome contains vastly more information than that of a
>>bacterium is a sophist.
>>

Let's be careful here. I would readily say that this is obvious,
but the reason I would say so is that I have in mind a definition
of information (either Shannon or Kolmogorov) that makes the
observation obvious. Information is just a word, and a word that,
unfortunately, has been taken to mean many different things. The
actual letters used to form the word are not important; what you
mean when you say it is what matters. We can use the word bloopy
instead, to make sure we don't confuse it with some other measure.
Now, is it obvious that a human genome has more bloopy than that
of a bacterium? It all depends on what bloopy means. As Hubert
Yockey is fond of saying, one should not draw meaning from the
words themselves.

There are many ways of defining information. The only two I have
any familiarity with are Shannon entropy and Kolmogorov complexity
(algorithmic information content). The mere definition of a
quantity which one calls "information" is no guarantee that the
quantity actually measures what we would normally think of as
information. One has to put the measure to some tests. Does it
measure what I think (hope :) it measures? Let me give two
examples.

In his book, Yockey repeats a proof given by Shannon in one of
his papers. As I mentioned in another recent post, my books are
all in boxes awaiting my move to another building. So I won't
be able to give details. Anyway, what Shannon did was write down
some very general attributes of an information measure. He then
proved that his measure has these attributes and, in the process,
was also able to prove that his measure was the *only* measure
which satisfies those initial specifications. This is a very
powerful proof and illustrates what one can accomplish with a
mathematical theory of information.
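
If memory serves (someone with the paper at hand can correct the
details), the attributes were roughly these: (1) H should be a
continuous function of the probabilities p_i; (2) if all n outcomes
are equally likely, H should increase with n; and (3) if a choice is
broken down into two successive choices, H should be the weighted
sum of the H's of the individual choices. The only measure
satisfying all three turns out to be

   H = -K * sum_i p_i log p_i

where K is a positive constant that just fixes the units (bits, if
the logarithm is taken base 2).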

The second example is also from Yockey's book. Yockey was able
to show that the genetic information system is isomorphic with
communications theory. The importance of a result like this cannot be
overemphasized since it provides a fundamental justification for
applying the results of communications theory to the genetic
information processing system. Again, just because one uses the
magic word information doesn't mean it has anything to do with
biological information.

>>If no mutation that has been studied is of the type needed for
>>neo-Darwinian macroevolution, then there is no molecular evidence that
>>random mutations and natural selection can achieve that evolution. Sure, we
>>know many single-nucleotide substitutions that can lead to microevolution.
>>But there is no argument about microevolution. My argument is against the
>>premise that random mutation, even with the help of natural selection, is
>>the driving force behind an evolutionary advance from a primitive cell to
>>human beings. There is no genetic evidence for such a premise.
>>
>>I submit that one need not measure the information in a gene to know if a
>>particular mutation has added or subtracted information. There is no
>>general way of measuring the information in a single message without
>>relating it to the ensemble of messages from which it was chosen.
>>Similarly, there is no general way of measuring the entropy in a single
>>message without relating it to the ensemble of messages of which it is a
>>member. Shannon was careful to avoid relating the information measure he
>>was defining to the meaning contained in a message. The communication
>>engineer must build a communication channel that will faithfully transmit a
>>message regardless of how much meaning the customer attaches to that message.
>>

Yes, but this seems to undercut what you say later.

>>There is no adequate definition of the information in a message without
>>relating it to the ensemble of messages that could have been sent. Thus I
>>cannot expect to measure the information in an arbitrary paragraph of
>>English text. Nor can I expect to measure the information in a section of a
>>genome.

Perhaps someone more proficient in information theory can help out
here, but according to my understanding, both of the above claims
are incorrect. I seem to remember there being sources on the web,
for example, that let you paste in some text and have its Shannon
entropy calculated for you. Also, in his book, Yockey calculates
the information content of iso-1-cytochrome c.
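
Just to make the point concrete, here is a rough sketch in Python of
the sort of calculation I have in mind. It is only a first-order
estimate based on single-symbol frequencies (the sample sentence is
arbitrary), so it says nothing about the "true" entropy of English,
but it shows that putting a number on a piece of text is not itself
problematic:

from collections import Counter
from math import log2

def entropy_per_symbol(text):
    # First-order Shannon estimate: H = -sum p_i * log2(p_i), where the
    # p_i are the observed symbol frequencies in this particular text.
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * log2(c / n) for c in counts.values())

message = "In the beginning was the word"
h = entropy_per_symbol(message)
print(round(h, 3), "bits/symbol;", round(h * len(message), 1), "bits total")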

An estimate of the algorithmic information content is much easier
to obtain. Save your text to a file on your hard drive, compress
the file with a compression algorithm, and then look at the size
of the compressed file.
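
For instance, something along these lines (the file name is just a
placeholder, and the compressed size is only an upper bound on the
algorithmic information content, since real compressors are far
from ideal):

import zlib

# "message.txt" is only an example name; use any text file you like.
with open("message.txt", "rb") as f:
    data = f.read()

compressed = zlib.compress(data, 9)  # level 9 = best compression
print(len(data), "bytes raw,", len(compressed), "bytes compressed")
print("roughly", round(8 * len(compressed) / len(data), 2), "bits per symbol")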

>>But whatever the information in a paragraph of text, if I struck
>>out one or more sentences, I can be sure that I have not increased the
>>information. Rather, I can confidently say that I have decreased the
>>information.

An interesting example. I would say that, more than likely,
the information density (bits per symbol) would be unchanged.
This is where I believe Lee is starting to rely on the quality
(meaning) of the information rather than its quantity. This becomes
even clearer in the next sentence.
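
A rough illustration of what I mean, using the same sort of
first-order frequency estimate as above (the sentences are made up,
of course): strike out the middle sentence and the total number of
bits drops simply because there are fewer symbols, while the bits
per symbol come out nearly the same.

from collections import Counter
from math import log2

def bits_per_symbol(text):
    # same first-order frequency estimate as before
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * log2(c / n) for c in counts.values())

full = ("The cat sat on the mat. The dog slept by the door. "
        "The goldfish watched them both.")
cut = "The cat sat on the mat. The goldfish watched them both."

for label, txt in (("before", full), ("after", cut)):
    h = bits_per_symbol(txt)
    print(label, round(h, 2), "bits/symbol;", round(h * len(txt)), "bits total")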

>>(I exclude the case in which the paragraph was nonsense and
>>didn't contain any information to begin with. In such a case the
>>information was zero both before and after I struck out the sentences.)

OK, this statement would be true only if information is qualitative
(what does it mean) instead of quantitative (how much is there,
regardless of what it means).

>>This example shows that indeed one can sometimes determine whether a change
>>in a message has decreased the information without having quantified the
>>information of the original message.
>>

Again, I mean no disrespect by any of my comments here, but IMHO
what the example really shows is why Shannon cautioned us against
trying to attach meaning to the quantity of information.

Once again, thanks to Lee for his contributions to the group.

Skipping the rest........

Brian Harper | "If you don't understand
Associate Professor | something and want to
Applied Mechanics | sound profound, use the
The Ohio State University | word 'entropy'"
| -- Morrowitz