The statement about 99% of the DNA being similar is approximately correct.
While having the complete gene sequences will be helpful, the data from
DNA-DNA hybridization experiments is easier to obtain and is probably just
as accurate.
DNA is purified from human tissue and from tissue from the ape of interest.
These are then used in two types of experiments. In one type, human DNA is
denatured into single strands by heating, then the strands are allowed to
come back together at a somewhat lower temperature. The rate at which they
re-form the normal, double-stranded structure is measured. The same
experiment is repeated using ape DNA. Finally, single-stranded (ss) ape DNA
and ss human DNA are allowed to renature together, forming double-stranded
(ds) hybrid DNA. The rate at which the hybrid forms, compared to the rates
of reassociation of the two original types (ape and human), tells us how
similar the two types of DNA are. This method is called Cot analysis (Cot
is the initial DNA concentration multiplied by the reassociation time), or
reassociation kinetics analysis.
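
As a rough illustration of the kinetics described above, here is a minimal
Python sketch of the idealized second-order reassociation curve, in which the
fraction of DNA still single-stranded is 1 / (1 + k * Cot). The function
names and the two rate constants are hypothetical, picked only to show that a
slower-forming hybrid has a larger Cot1/2. Real genomes contain several
kinetic components, so treat this as a sketch under those assumptions.

```python
# Idealized second-order reassociation kinetics (a sketch, not lab code).
# Assumption: a single kinetic component with rate constant k.

def fraction_single_stranded(cot, k):
    """Fraction of DNA still single-stranded: C/C0 = 1 / (1 + k * Cot)."""
    if cot < 0 or k <= 0:
        raise ValueError("cot must be >= 0 and k must be > 0")
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of the DNA has reassociated (equals 1/k)."""
    if k <= 0:
        raise ValueError("k must be > 0")
    return 1.0 / k


# Hypothetical rate constants, chosen only for illustration.
K_HUMAN = 2.0e-3   # human strands pairing with human strands
K_HYBRID = 1.6e-3  # human strands pairing with ape strands (slower)

if __name__ == "__main__":
    for label, k in (("human", K_HUMAN), ("hybrid", K_HYBRID)):
        print(label, "Cot1/2 =", cot_half(k))
```

The ratio of the two Cot1/2 values is the kind of quantity that gets compared
between the hybrid and the original DNAs.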
The second method uses hybrid ape-human DNA and measures how tightly the
two different strands are held together. First, human DNA is slowly
heated to measure its "melting temperature," the temperature at which the
two strands come apart. The same measurement is made on the ape DNA.
Finally, it is made on ape-human hybrid DNA, and the difference in melting
temperatures is noted. Mismatched base pairs make the hybrid less stable,
so it melts at a lower temperature than either parental DNA. The greater
the drop in melting temperature, the greater the sequence difference.
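
The arithmetic behind the melting-temperature comparison can be sketched as
follows. The conversion factor of about 1% mismatch per 1 degree C of
melting-temperature drop is a commonly quoted rule of thumb and is an
assumption here, not a figure from the experiments above; the actual
calibration depends on the experimental conditions. The function name and
the example temperatures are hypothetical.

```python
# Estimate sequence mismatch from a drop in melting temperature (a sketch).
# Assumption: roughly 1% mismatched bases per 1 degree C drop in Tm.

def percent_mismatch(tm_human, tm_ape, tm_hybrid, pct_per_degree=1.0):
    """Return the estimated percent mismatch in the hybrid DNA.

    The reference is the mean melting temperature of the two parental DNAs;
    the hybrid's shortfall from that reference is scaled by pct_per_degree.
    """
    reference = (tm_human + tm_ape) / 2.0
    delta_tm = reference - tm_hybrid
    return delta_tm * pct_per_degree


if __name__ == "__main__":
    # Hypothetical temperatures in degrees C, for illustration only.
    print(percent_mismatch(86.0, 86.0, 84.5))
```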
Both of these techniques use the ability of similar sequences on opposite
strands of the DNA to recognize and bind to each other as a way of
estimating sequence similarity. A more accurate method will be to compare
the DNA sequences directly when they are published. We have some sequence
data available now, such as for cytochrome c. Ape and human cytochrome c
protein sequences are identical.
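
Once sequences are in hand, the direct comparison is simple to express. The
sketch below computes percent identity between two sequences that are assumed
to be already aligned and of equal length; the function name is hypothetical
and the two short strings are made-up toy data, not real cytochrome c
sequences.

```python
# Percent identity between two pre-aligned sequences (a minimal sketch).
# Assumption: the inputs are already aligned and have the same length.

def percent_identity(seq_a, seq_b):
    """Return the percentage of positions at which the two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Toy sequences for illustration only; they differ at one position.
    print(percent_identity("ACGTACGTAC", "ACGTACGTTC"))
```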
A confounding factor in this discussion, however, is the presence of highly
repetitive DNA in both apes and humans. Much of both types of DNA is
composed of long regions of direct repeats of identical, short sequences.
If these repeat sequences are used in the similarity calculations, it will
make the similarities look artificially high. If one studies only the
unique-sequence DNA, the similarity numbers are lower. They are still
high, though (>95%).
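
To see how including repeats can shift the overall figure, here is a small
sketch that treats the genome-wide similarity as a weighted average of the
repetitive and unique-sequence fractions. The function name and every number
in the example are hypothetical and chosen only to show the direction of the
effect; they are not measurements.

```python
# Weighted-average view of overall similarity (an illustrative sketch).
# Assumption: the genome splits cleanly into a repetitive fraction and a
# unique-sequence fraction, each with its own similarity value.

def overall_similarity(repeat_fraction, repeat_similarity, unique_similarity):
    """Combine two similarity values (in percent) weighted by genome fraction."""
    if not 0.0 <= repeat_fraction <= 1.0:
        raise ValueError("repeat_fraction must be between 0 and 1")
    unique_fraction = 1.0 - repeat_fraction
    return (repeat_fraction * repeat_similarity
            + unique_fraction * unique_similarity)


if __name__ == "__main__":
    # Hypothetical values: 30% repeats scoring 99.9%, unique DNA scoring 96%.
    print(overall_similarity(0.3, 99.9, 96.0))
```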
I hope this helps.
Jim Behnke, Asbury College, Wilmore, KY 40390 jimbeh@ms.uky.edu