>One "problem" with this definition is that there is an inherent
>uncertainty, one can never be sure that one has not missed
>some principle for compressing the message. Expressing the
>change in info-content in terms of the frequencies of the codons,
>as proposed by the authors of the study we are discussing,
>seems to me to recognize no principles of compression. How
>would we go about recognizing a compression method for DNA?
>One way might be to recognize that "message" in DNA involves
>amino acid sequences in protein. If we realize this then the
>original DNA sequence can be compressed according to the
>observation that the last position in eight codons is irrelevant.
>Thus the "information" in the uncompressed DNA might change
>due to a point mutation in the third position of these codons,
>but the compressed length (and thus the algorithmic information
>content) would not change.
>
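To make the quoted point concrete, here is a minimal Python sketch of that compression idea. It assumes the standard genetic code, in which eight two-base prefixes fix the amino acid regardless of the third base. The function name and the short test sequences are made up for illustration; they are not from the study under discussion.

```python
# The eight two-base prefixes whose third base does not change the amino acid
# in the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def collapse_third_positions(dna):
    """Replace the third base with 'N' in codons where it is irrelevant.

    Hypothetical helper: reads the sequence in frame 0 and ignores any
    trailing bases that do not form a complete codon.
    """
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(
        codon[:2] + "N" if codon[:2] in FOURFOLD_PREFIXES else codon
        for codon in codons
    )


# Invented example sequences, written as DNA codons.
original = "ATGGCTCTAAAATGA"  # ATG GCT CTA AAA TGA
silent = "ATGGCCCTGAAATGA"    # third-position changes in GCT and CTA
missense = "ATGGATCTAAAATGA"  # second-position change: GCT -> GAT

print(collapse_third_positions(original))
print(collapse_third_positions(silent))
print(collapse_third_positions(missense))
```

The first two sequences collapse to the same string, so the third-position mutations change the raw DNA but not the compressed form. The second-position change survives the collapse. Note that this catches only the fully degenerate codons; two-fold degenerate pairs such as AAA/AAG would need a further rule.
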
I am going to throw in another means of compression, one that is apparently
rare but has been observed: overlapping genes! Edward Yoxen (The Gene
Business, Oxford U. Press, 1983, p. 107) says that some viruses have
overlapping genes. This is truly a compressed message, and I am not sure how
information theory could deal with it. An audio example of overlapping
messages is:
"The seamlessness of speech is also apparent in
'oronyms,' strings of sound that can be carved into words in two different
ways:
The good can decay many ways.
The good candy came anyways.
The stuffy nose can lead to problems.
The stuff he knows can lead to problems.
Some others I've seen.
Some mothers I've seen."
~Steven Pinker, The Language Instinct (New York: Harper/Perennial, 1994),
pp. 159-160
One string of sounds but two messages. Information theory has no way that I
am aware of for describing this type of situation. Do you know of one?
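
For what it is worth, the DNA version of an oronym is easy to sketch in Python: one string of bases, read in two reading frames, gives two different amino acid messages. The sequence below is invented for illustration and is not taken from any real virus; the codon table is the standard genetic code, and the helper name is my own.

```python
# Standard genetic code, built from the usual TCAG ordering of bases.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate every complete codon starting at offset `frame`.

    Hypothetical helper: '*' marks a stop codon, and leftover bases at
    the end are ignored.
    """
    end = len(dna) - (len(dna) - frame) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, end, 3))


dna = "ATGGCATGCTAAGG"  # made-up sequence

print(translate(dna, frame=0))  # reads ATG GCA TGC TAA
print(translate(dna, frame=1))  # reads TGG CAT GCT AAG
```

Frame 0 reads M-A-C-stop while frame 1 reads W-H-A-K from the very same bases, so a single point mutation in the overlap can alter both messages at once.
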
glenn
Foundation, Fall and Flood
http://members.gnn.com/GRMorton/dmd.htm