Each amino acid in a protein sequence is represented by a 3-letter 'word' (codon) in the genetic code. Since there are 4 'letters' (A, C, G, T), there are 4 x 4 x 4 = 64 possible words to represent 20 amino acids plus the stop signals. The code is unambiguous - each codon represents only a single amino acid. It is also redundant - most amino acids are represented by multiple codons (glycine, for example, can be represented 4 different ways).

One might think that the diversity of life on the planet would come with comparable diversity in what codons mean. This is not the case. Codon preference differs, both within and across species, but the meaning of each codon is almost universal. For example, in humans the triplet ATC (20.8 codons/1000 codons) is preferred over the triplet ATA (7.5 codons/1000 codons), and in the yeast S. cerevisiae ATT is preferred to both of those (30.1 codons/1000 codons). However, in each of those cases - and in virtually every other species - all three of those triplets code for isoleucine. Codon preference is related to the abundance of the respective transfer RNA. (Larry Moran touches upon codon bias, and why mutations that change the codon but not the amino acid may not be neutral, in an article here.)

There is experimental evidence for a universal genetic code:
"mRNAs can be correctly translated by the protein synthesizing machinery of very different species. For example, human hemoglobin mRNA is correctly translated by a wheat-germ extract [...] bacteria efficiently express recombinant DNA molecules encoding human proteins such as insulin."
(Stryer, L. Biochemistry 3rd Ed. p 108)

A universal code is the basis of many techniques (and headaches) in the lab. For example, in vitro protein synthesis can involve rabbit reticulocyte lysates (or wheat germ, as above) translating non-rabbit proteins. Non-mouse sequences can be used to introduce genes into mice. E. coli is often used for recombinant protein production. In this last case, the difference in codon preference between E. coli and other species is a common obstacle to high-level recombinant expression (e.g. if a codon that is preferred in humans, such as CCC for proline, is rarely used in E. coli, the scarcity of the matching tRNA can limit protein production).
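The counting argument and the "codons/1000 codons" figures quoted earlier can be made concrete with a short sketch. It builds the standard code from the compact 64-letter string conventionally used for translation tables (bases ordered T, C, A, G), groups synonymous codons, and tallies usage per thousand codons. The function names and the toy sequence are my own assumptions for illustration; the sequence is not a real gene, so its numbers say nothing about any organism.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"  # conventional base ordering for compact codon tables
# Standard code, one letter per codon in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(code):
    """Group codons by the amino acid (or stop signal) they encode."""
    groups = defaultdict(list)
    for codon, aa in code.items():
        groups[aa].append(codon)
    return dict(groups)


def codon_usage_per_thousand(cds):
    """Return {codon: occurrences per 1000 codons} for an in-frame sequence."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    total = len(codons)
    return {codon: 1000 * n / total for codon, n in Counter(codons).items()}


groups = synonymous_codons(STANDARD_CODE)

# Hypothetical 10-codon toy sequence (illustration only).
toy = "".join(["ATG", "ATC", "ATC", "ATA", "CCC",
               "CCG", "ATT", "ATC", "GGC", "TAA"])
usage = codon_usage_per_thousand(toy)

# Usage of each isoleucine codon in the toy sequence.
ile_usage = {codon: usage.get(codon, 0.0) for codon in sorted(groups["I"])}

print(len(STANDARD_CODE))   # 64
print(sorted(groups["G"]))  # ['GGA', 'GGC', 'GGG', 'GGT']
print(ile_usage)            # {'ATA': 100.0, 'ATC': 300.0, 'ATT': 100.0}
```

Running the same tally over all the coding sequences of a genome is, roughly, how per-organism preference tables are produced. Comparing the table for the source organism with the one for the expression host is one way to spot codons likely to cause trouble.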
The genetic code is not entirely universal, though; some differences between species are still being discovered. Some species, such as ciliated protozoa, have slight variations (in ciliates, TAA and TAG code for glutamine rather than acting as stop codons). Mitochondria are another important exception.
Mitochondria carry their own circular DNA, which encodes, among other things, a set of 22 tRNAs. Because the mitochondrion doesn't use the set of nuclear-encoded tRNAs, it isn't restricted to the standard code. In fact, human mitochondrial codon use differs from nuclear codon use in 4 places. For example, in the isoleucine example above, the codon AUA (ATA in DNA letters) codes for methionine in mitochondria (see table, reproduced from Stryer). This isn't news, but it is something I failed to appreciate before. A difference in codon usage between species might not be surprising (in fact, the consistency in usage among species is what's surprising - until you consider the far-reaching effects a change in codon use would have: every protein would be affected). A difference in usage within a single cell is more striking, unless you're familiar with endosymbiotic theory.

Endosymbiotic theory, popularized by Lynn Margulis, describes the origins of the eukaryotic organelles mitochondria and chloroplasts. These organelles were once autonomous organisms that were taken up by other cells in a symbiotic relationship. Both organelles bear strong resemblances to their proposed prokaryotic ancestors, as detailed in the above link. Codon use separate from that of nuclear DNA can be added to that list.
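The exceptions described above can be sketched as small overrides on the standard table. The post itself names only AUA-to-methionine for mitochondria and TAA/TAG-to-glutamine for ciliates; the other three mitochondrial reassignments used here (TGA to tryptophan, AGA and AGG to stop) are taken from the commonly cited vertebrate mitochondrial translation table and are my addition, as are the variable names and the toy sequences.

```python
from itertools import product

BASES = "TCAG"  # conventional base ordering for compact codon tables
# Standard code, one letter per codon in TCAG order; '*' marks a stop codon.
STANDARD = dict(zip(
    ("".join(codon) for codon in product(BASES, repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

# Variant codes written as overrides on the standard code.
# Assumption: reassignments follow the usual vertebrate mitochondrial table.
VERTEBRATE_MITO = {**STANDARD, "ATA": "M", "TGA": "W", "AGA": "*", "AGG": "*"}
# Ciliate nuclear code as described in the post: TAA and TAG read as glutamine.
CILIATE_NUCLEAR = {**STANDARD, "TAA": "Q", "TAG": "Q"}


def translate(seq, code):
    """Translate an in-frame sequence codon by codon; trailing bases are ignored."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(code[seq[i:i + 3]] for i in range(0, usable, 3))


def differences(code, reference=STANDARD):
    """List the codons whose meaning differs from the reference code."""
    return sorted(c for c in reference if code[c] != reference[c])


toy = "ATGATATGAAGA"  # hypothetical: ATG ATA TGA AGA
print(translate(toy, STANDARD))         # MI*R
print(translate(toy, VERTEBRATE_MITO))  # MMW*
print(translate("CAATAATAG", CILIATE_NUCLEAR))  # QQQ
print(differences(VERTEBRATE_MITO))     # ['AGA', 'AGG', 'ATA', 'TGA']
```

The same 12 letters read as two different "sentences" depending on which table is in force, which is why a mitochondrial gene moved into the nucleus (or into E. coli) unchanged would not generally produce the same protein.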
Read more about the different codon usage sets here.
Codon preference numbers from here.

