Re: Information: Brad's reply (was Information: a very
Greg Billock (billgr@cco.caltech.edu)
Wed, 1 Jul 1998 11:30:57 -0700 (PDT)

Glenn,
> >When the system has dependencies like this (i.e. not every symbol is
> >independent of every other symbol), it is too aggressive to calculate
> >the information as 2*length (2 bits per base). The reason is that
> >there are long-range interdependencies in the genome.
>
> I think I see where we differ. The interesting thing about biological
> systems is that there are NO intersymbol dependencies. Thus while Shannon
> is obviously talking about a Markov transition matrix in which the
> symbols DO depend, and indeed in language the symbols do depend on previous
> choices, in biological systems this seems not to be the case. Quoting Yockey,
>
> "Intersymbol influence is an important repository of redundant information
> in written languages. In spite of considerable search of the protein text
> no intersymbol influence has been found. This source of redundance will
> have to be ignored until its magnitude and statistical structure is
> discovered-if it exists at all." ~ H. P. Yockey, "An Application of
> Information Theory to the Central Dogma and the Sequence Hypothesis,"
> Journal of Theoretical Biology, 46(1974):369-406, p. 384
>
> I believe this still holds, although I can't put my hand on a citation at
> this moment. Are we still in disagreement?

I don't know whether Yockey is talking about DNA structure or whether
he's talking about DNA decoding or about DNA-as-it-must-exist-in-organisms.
DNA structure is probably free of interdependence--DNA is stable.
DNA decoding is fairly free of interdependence (modulo some practical
and minimal constraints).
DNA as a whole most definitely has intersymbol constraints, although
they are complicated, and their structure is presently unknown. If Yockey
is talking about ignoring it when discussing information in DNA as a
whole, I fear his project is probably headed for disappointment, since
this is exactly the interesting part. My guess is, though, that he's
talking about decoding-context information.
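
To make the point about 2*length concrete, here is a minimal Python sketch
that compares a per-base estimate assuming independent symbols with one that
conditions each base on its predecessor. The function names and the toy
sequence are illustrative assumptions, not anything taken from Yockey or from
real genome data. It only captures adjacent-symbol (first-order) dependence,
which is far weaker than the long-range constraints at issue here, and
estimates from short sequences are biased.

```python
import math
from collections import Counter


def zero_order_entropy(seq):
    """Shannon entropy in bits per symbol, treating symbols as independent."""
    counts = Counter(seq)
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def first_order_conditional_entropy(seq):
    """Estimate H(X_i | X_{i-1}) in bits per symbol from adjacent pairs."""
    pairs = Counter(zip(seq, seq[1:]))   # counts of (previous, current)
    prev_counts = Counter(seq[:-1])      # counts of each symbol as "previous"
    n_pairs = len(seq) - 1
    h = 0.0
    for (prev, _cur), c in pairs.items():
        p_pair = c / n_pairs                 # joint probability estimate
        p_cur_given_prev = c / prev_counts[prev]
        h -= p_pair * math.log2(p_cur_given_prev)
    return h


# Hypothetical toy sequence: all four bases equally frequent, but each base
# is fully determined by the one before it.
toy = "ACGT" * 50

naive_bits = zero_order_entropy(toy) * len(toy)
conditional_bits = first_order_conditional_entropy(toy) * (len(toy) - 1)

print(zero_order_entropy(toy), first_order_conditional_entropy(toy))
print(naive_bits, conditional_bits)
```

For this toy sequence the independent-symbol estimate comes out at 2 bits per
base, which is the 2*length figure, while the conditional estimate is zero,
because every base is fixed by its predecessor. Real sequences will sit
somewhere in between, and a pairwise measure like this cannot see
dependencies that span long distances.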
-Greg