Complete nucleotide sequence of the rabbit β-like globin gene cluster: analysis of intergenic sequences and comparison with the human β-like globin gene cluster

JB Margot, GW Demers, RC Hardison - Journal of Molecular Biology, 1989 - Elsevier
Abstract
The nucleotide sequence of the entire β-like globin gene cluster of rabbits has been determined. This sequence of a continuous stretch of 44.5 × 10³ base-pairs (bp) starts about 6 × 10³ bp upstream from ε (the 5′-most gene) and ends about 12 × 10³ bp downstream from β (the 3′-most gene). Analysis of the sequence reveals that: (1) the sequence is relatively A + T rich (about 60%); (2) regions with high G + C content are associated with OcC repeats, a short interspersed repeated DNA in rabbits; (3) the distribution of polypurines, polypyrimidines and alternating purine/pyrimidine tracts is not random within the cluster; (4) most open reading frames are associated with known globin coding regions, OcC repeats or long interspersed repeats (L1 repeats); (5) the most prominent open reading frames are found in the L1 repeats; (6) different strand asymmetries in base composition are associated with embryonic and adult genes as well as the tandem L1 repeats at the 3′ end of the cluster; and (7) essentially all the repeats appear to have been inserted by a transposon mechanism. A comparison of the sequence with itself by a dot-plot analysis has revealed nine new members of the OcC family of repeats in addition to the six previously reported. The OcC repeats tend to be clustered, particularly in the ε-γ and γ-ψδ intergenic regions. Dot-plot comparisons between the rabbit and the human clusters have revealed extensive sequence matches. Homology starts about 6 × 10³ bp 5′ to ε or as far upstream as the rabbit sequence is available. It continues throughout the entire cluster and stops about 0.7 × 10³ bp 3′ to β, at which point several repeats have inserted in both rabbits and humans. Throughout the gene cluster, the homology is interrupted mainly by insertions or deletions in either the rabbit or the human genome. Almost all of the insertions are of known short or long repeated DNAs.
The positions of the insertions are different in the two gene clusters, which indicates that both short and long repeats have been transposing throughout the genome for the time since the mammalian radiation. An alignment of rabbit and human sequences allows the calculation of the substitution rate around ε. Sequences far removed from the gene are evolving at a rate equivalent to the pseudogene rate, although some short regions show an apparently higher rate. The fact that the rate of divergence of the intergenic and flanking DNA is about the same as the rate of divergence of pseudogenes indicates that this rate is a good approximation of the neutral rate of evolution.
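
As a rough illustration of two analyses named in the abstract, base composition and dot-plot comparison, the following Python sketch computes the A + T fraction of a sequence and lists the exact word matches that a simple dot-plot would draw. It is not the authors' software: the function names `at_fraction` and `dot_plot_matches`, the exact-match criterion, and the default word size of 8 are all assumptions made for illustration, and the abstract does not state which program or parameters were used.

```python
from collections import defaultdict


def at_fraction(seq: str) -> float:
    """Fraction of A and T among the unambiguous (A, C, G, T) bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence has no A, C, G or T bases")
    return (seq.count("A") + seq.count("T")) / counted


def dot_plot_matches(seq_a: str, seq_b: str, word: int = 8):
    """Return (i, j) pairs where seq_a[i:i+word] equals seq_b[j:j+word].

    Each pair is one dot of a simple exact-match dot-plot. The word size
    is an illustrative assumption, not a value taken from the paper.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Index every word of seq_b by its start positions.
    index = defaultdict(list)
    for j in range(len(seq_b) - word + 1):
        index[seq_b[j:j + word]].append(j)
    # Look up every word of seq_a in that index.
    matches = []
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], ()):
            matches.append((i, j))
    return matches
```

Comparing a sequence with itself, as in `dot_plot_matches(seq, seq)`, always yields the main diagonal; dots off the diagonal mark internal repeats, which is the kind of signal a self-comparison uses to find repeat family members. Comparing two different sequences yields diagonal runs wherever they match, with breaks and offsets where insertions or deletions interrupt the alignment.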