David Campbell wrote:
> >> In general, the vast majority of new genes seem to be produced from manipulation of existing genes: mixing and matching parts, duplicating and then modifying, etc....<<
>
> >The origin of a new protein by exon shuffling may also be considered a cooptation of a set of preexisting functionalities. This also applies to duplicates of genes happening to already possess an initial minimal activity of a new kind. In such cases, selection is possible from the start. This is microevolution and does not pose any informational problems.<
>
> Micro and macroevolution are defined in too many different ways for me to be sure what you mean here. Exon shuffling and gene duplication followed by modification both produce novel information, something some intelligent design advocates claim is impossible. I don't think you are trying to support such claims, but I would see the combination of parts from several different genes followed by selection for a new function as a relatively substantial innovation.
>
David:
I agree that this is a relatively substantial innovation. But,
nevertheless, I would consider the amount of novel information gained to
be relatively small. ("Information" has even more different meanings
than "micro-" and "macroevolution"!) I would justify this claim as
follows: After a single nucleotide mutation, the mutant and the
wild-type are subject to natural selection, whose "answer" to the
mutation is "yes" or "no" or something in-between, i.e. at most 1 bit of
information. The same consideration applies to any more complex
mutation, such as a new gene composed of shuffled exons: as far as
natural selection is concerned, the gain of information from the
environment is at most 1 bit. If this seems counter-intuitive, we must
ask whether this new construct was produced in a single step, such as an
unequal crossing-over. If yes, then it was a simple step, like a simple
mutation or deletion. If it required a series of coordinated steps, the
intermediates in this path probably were not under any selection, and
the probability of end product formation may have been extremely small.
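
The "at most 1 bit" bound can be illustrated with a minimal sketch. It assumes the selection outcome is read as a two-outcome (yes/no) event with probability p of "yes" and measured as Shannon entropy; that reading, the function name binary_entropy, and the sample values of p are illustrative assumptions, and "in-between" answers are not modeled.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a yes/no outcome with P(yes) = p."""
    if p in (0.0, 1.0):
        # A certain outcome carries no information.
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# Hypothetical probabilities that selection "answers yes" to a mutant.
for p in (0.5, 0.1, 0.01):
    print(f"p = {p:<5} -> {binary_entropy(p):.3f} bits")
```

Under this reading the entropy peaks at exactly 1 bit when p = 0.5 and drops below 1 bit for any other p, regardless of how complex the mutation itself was.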
> >But to assume that ALL functionalities emerged in such a manner, without any non-selectable intermediates, is entirely speculative. How do you know this is "the vast majority" of genes? You yourself concede that the origin of "the first gene" is not dealt with. There are an estimated 1000 different protein folds (each grouping a series of protein families or superfamilies) in the biosphere, considering the globular, water-soluble proteins only (Y.I.Wolf, N.V.Grishin, E.V.Koonin, "Estimating the number of protein folds and families from complete genome data", J.Mol.Biol. 299 (2000), 897-905). Almost by definition, these 1000 folds are not related to each other by exon shuffling and gene duplication. Each one of them had to originate somewhere at least once during the past 3.8 billion years. Thus, it would be more realistic to talk about "the first 1000 genes" whose emergence cannot be accounted for at present. These are the cases I am considering when I talk about a mutational random walk without intermediate selection until a minimal selectable activity happens to be produced. These are cases I consider macroevolutionary steps posing considerable informational problems deserving careful attempts at estimating their probability and at possibly finding more realistic evolutionary scenarios than merely assuming that "it must have happened somehow" through selectable intermediates. You may call these the most elementary cases of Behe's "irreducibly complex systems" - whose non-existence has not yet been made plausible.
> <
>
> Obviously, examining every known gene sequence to determine the relative frequency of gene duplication, exon shuffling, and the like is not feasible. However, the general pattern that emerges is that, as one examines a gene, one finds related genes with different functions. If there are 1000 truly novel genes, that is still a lot less than the total number of genes in humans, for example. I did not mean to imply that all functions evolved by duplication and modification of existing genes, but rather that it was extremely common.
>
If each selected mutational step adds 1 bit of information from the
environment to a genome, the biosphere can collect quite a lot of
information from the environment. But how about the "truly novel genes"?
Their minimally active form must have arisen by truly random-walk
mutagenesis. Of which type of information - step-by-step selected or
random-walk generated - is there more in the biosphere? I think we don't
know. But what I am getting at is the challenge of the random-walk type.
Even if this concerns only a few percent of all existing genes, it poses
a big problem, as Darwinian evolution cannot be invoked. Don't you think
so?
> The example of a pseudogene reactivated, discussed in other posts, would be a case of passing through unselected "random" intermediates before arriving at a useful function.
>
Yes, and this is exactly one of the interesting cases. Do you know of
any case where such a path via unselected intermediates has been
documented in a real biological system, not just stated as a general
hypothesis? I am eager to find such cases!
> However, I doubt that no two protein folds can be produced from sequential functional modifications of a gene or genes. Taking another example from Graur and Li, an antifreeze protein in one Antarctic fish is derived from a particular enzyme by deletion of most of the enzyme and internal serial duplication of a 3-amino-acid sequence. I do not know the folding pattern of the trypsinogen and the antifreeze protein, but I suspect they are rather different.
>
It would be interesting to synthesize the supposed intermediates and
test them for functionality. But, as with crystallins, an antifreeze
protein presumably doesn't require any particular biochemical
specificity, just some rather general physicochemical properties.
Peter
This archive was generated by hypermail 2b29 : Wed Oct 25 2000 - 11:37:25 EDT