Recently, blogger and trend-seeker-outer Kas Thomas reported on his blog, assertTrue( ), a very interesting finding regarding trends in complementary trinucleotides in the protein-coding genes of organisms with high GC content, such as Streptomyces griseus. Being a bit of a DNA junky myself (pun intended), I consider his finding both ingenious and fascinating, and my mind revels in the possibilities of what it could mean. In summary, he has found that complementary trinucleotides, such as “CAG” and “CTG” (each is the reverse complement of the other), occur within the genes of certain GC-heavy species in fairly equal proportion. He has admitted that “it’s bizarre and crazy and deserves an explanation, and I’m hard pressed to come up with one”, but he readily offers a number of interesting possible explanations. For more on the specifics of his findings, please visit his blog posts here and here.
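For anyone who wants to poke at this kind of pattern themselves, here is a minimal Python sketch of how one might tally trinucleotides on a single strand and pair each with its reverse complement. It is not Kas Thomas's actual method: the function names, the `step` parameter (1 for overlapping windows, 3 for in-frame codons), and the toy sequence are all my own assumptions, chosen purely for illustration.

```python
from collections import Counter

# Translation table for Watson-Crick complements (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, e.g. CAG -> CTG."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def trinucleotide_counts(seq, step=1):
    """Count trinucleotides along one strand.

    step=1 slides one base at a time (overlapping windows);
    step=3 reads in-frame codons from the first base.
    Both choices are illustrative assumptions, not the original analysis.
    """
    seq = seq.upper()
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, step))


def complement_pair_counts(seq, step=1):
    """Map each (trinucleotide, reverse complement) pair to its two counts."""
    counts = trinucleotide_counts(seq, step)
    pairs = {}
    for tri in counts:
        key = tuple(sorted((tri, reverse_complement(tri))))
        # Counter returns 0 for a partner that never occurs.
        pairs[key] = (counts[key[0]], counts[key[1]])
    return pairs


# Toy sequence, made up for demonstration only (not real genome data).
toy = "CAGCTGCAG"
for pair, (n_first, n_second) in sorted(complement_pair_counts(toy).items()):
    print(pair, n_first, n_second)
```

On a real gene set, one would feed in the coding sequences from a genome file and then compare the two counts in each pair; roughly equal counts across many pairs would be the pattern described above.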
Given my personal interest in how primary sequence lends itself to semi-predictable 3D conformations of DNA, the first thing that springs to my mind is whether the occurrence of complementary sequences on the same strand of DNA has to do with conserving the shape of the local gene, a general shape which is shared by many such genes.
As a simplistic illustration, picture the complementary strands of DNA as a wavy line (it isn’t a 2D structure, but for the sake of simplicity let’s consider it as such). Now take a given trinucleotide; in this instance we’ll just grab a “CTG”. And let’s say that this CTG trinucleotide has a somewhat convex shape to it. Its complement on the opposite strand, “GAC” (written 3′→5′; read in the usual 5′→3′ direction it is “CAG”), must therefore have a concave shape, otherwise it wouldn’t be able to bind well (chemistry slightly aside, we’re just talking shapes). Now let’s say that only the CTG/GAC combination can attain this unique angle of convexity/concavity, this “apex” as it were (see image below). Let’s then assume that a particular number of apexes, spaced consistently from one another, is necessary for maintaining the overall local shape of the DNA, and that this shape in turn is vital for how it functions. It would then be reasonable to expect alternating complements on a single strand, leading, roughly, to a correlation in the numbers of complements. As Merlin from The Sword in the Stone might say, “For every to, there is a fro.” And that’s what makes the double helix go round.
This is just a very simple structural hypothesis, and why the correlation in complementary content is especially pronounced in genomes with high GC content remains a mystery. It is recognized, though, that G-C pairs, aside from adopting their own unique dinucleotide rotatory angle, also bond especially strongly, making their separation from the complementary strand that much more difficult to achieve. Perhaps higher A-T content allows more structural leeway, i.e., conformational wiggle room, whereas GC content is all-or-nothing.
In any case, a fascinating finding by Kas Thomas. Definitely give the original blog posts a read.
Thank you for mentioning my blog! Awesome.
I do think the statistically too-high occurrence of complementary codons speaks to a great deal of not-yet-noticed secondary structure in DNA, particularly in single strands that coil against themselves rather than against the “opposite strand.” We’re just in the early stages of investigating DNA secondary structure of this kind. I expect many great discoveries will be unveiled in the next few years as we find that secondary structure is exploited in various ways by various genes in various organisms, including GC-rich regions of the human genome.
My pleasure. You had an excellent piece with some fascinating observations!