GENERAL CONSIDERATIONS

As DNA is the genetic material, it carries genetic information from cell to cell and from generation to generation. The question, then, is in what form this information exists in the DNA molecule. Is it written in an articulated language or in a coded language? And if it is coded, what is the nature of the genetic code?

A DNA molecule is composed of three kinds of moieties: (i) phosphoric acid, (ii) deoxyribose sugar, and (iii) nitrogen bases. In principle, the genetic information could be carried by any of the three. But the sugar-phosphate backbone is always the same, so it is unlikely that these two moieties carry the genetic information. The nitrogen bases, however, vary from one segment of DNA to another, so the information might well depend on their sequence. In fact, the sequence of nitrogen bases in a given segment of DNA has been found to be colinear with the linear sequence of amino acids in a protein molecule: the order of the bases corresponds to the order of the amino acids. Proof of such colinearity between the DNA base sequence and the amino acid sequence of proteins was first obtained from an analysis of mutants of the head protein of bacteriophage T4 (Sarabhai et al, 1964) and of the A protein of tryptophan synthetase of Escherichia coli (Yanofsky et al, 1964).

The colinearity of protein molecules and DNA polynucleotides has given the clue that the specific arrangement of the four nitrogen bases (A, T, C and G) in DNA polynucleotide chains somehow determines the sequence of amino acids in protein molecules. These four bases can therefore be regarded as the four letters of the DNA alphabet, and all genetic information should be written with these four letters.

The question now arises whether the genetic information is written in an articulated language or in a coded language. An articulated language would require a variety of alphabets, a complex system of grammar and ample space on the DNA molecule, all of which would be practically impossible for DNA. Molecular biologists therefore concluded that genetic information exists in DNA in the form of a special language of code words that uses the four nitrogen bases as its symbols. Any coded message is commonly called a cryptogram.

NATURE OF THE GENETIC CODE

Earlier, Gamow, the well-known nuclear physicist, proposed that the genetic code consists of triplets of nitrogenous (N) bases and that adjacent triplets overlap. This meant that each N-base would belong to three successive codons instead of the single codon expected on the colinear model. This hypothesis was not accepted, on the following grounds:

1. In the overlapping model, only certain amino acids can follow certain others. Once the first amino acid in a protein is coded, the next two, and for that matter the remaining amino acids in the protein, are partially predetermined. If the first codon is CAG, then the next must begin with AG and the third with G.
2. A mutation that changes one base must, according to this hypothesis, affect three amino acids.

George Gamow (1904-1968), a Russian-born US nuclear physicist and cosmologist, was one of the foremost advocates of the ‘Big Bang theory’.
He is perhaps best known for his popular writings, designed to introduce the nonspecialist to such difficult subjects as relativity and cosmology. They include: (1) The Creation of the Universe, (2) A Planet Called Earth, and (3) A Star Called the Sun. For his achievements as a popularizer of science, Gamow was awarded the 1956 Kalinga Prize by UNESCO.

Later experiments on the arrangement of codons showed that when a mutation changes one base, the effect is confined to a single amino acid. In sickle cell anemia, for instance, only one amino acid is changed, glutamic acid being replaced by valine, while the two adjacent amino acids remain unaffected. Further research showed that the codons are arranged in a linear order. This explains why a change in one base alters only one amino acid and not three; if Gamow’s hypothesis were correct, a change in one nitrogenous base would have altered three amino acids.

The sequence of bases that encodes a functional protein molecule is called a gene, and the genetic code is the relation between the base sequence of a gene and the amino acid sequence of the polypeptide whose synthesis the gene directs. In other words, the genetic code is the specific correspondence between a set of 3 bases and 1 of the 20 amino acids. J.D. Burke (1970) defined the genetic code in the following words: “The genetic code for protein synthesis is contained in the base sequence of DNA. ... The genetic code is a code for amino acids. Specifically, it is concerned with what codons specify what amino acids.” The genetic code is the key that relates, in Crick’s words, “...the two great polymer languages, the nucleic acid language and the protein language.” The “letters” of the “language” were found to be the bases; the “words” (codons) are groups of bases; and the “sentences” and “paragraphs” equate with groups of codons (Eldon J. Gardner, 1968).
Thus,

Letters ≡ Bases
Words ≡ Groups of bases (i.e., codons)
Sentences and Paragraphs ≡ Groups of codons

The basic problem of such a genetic code is to indicate how information written in a four-letter language (the four nucleotides or nitrogen bases of DNA) can be translated into a twenty-letter language (the twenty amino acids of proteins). The group of nucleotides that specifies one amino acid is a code word or codon. The simplest possible code is a singlet code (a code of one letter), in which one nucleotide codes for one amino acid. Such a code is inadequate, for only four amino acids could be specified. A doublet code (a code of two letters) is also inadequate because it could specify only sixteen (4 × 4) amino acids. A triplet code (a code of three letters), however, provides sixty-four (4 × 4 × 4) code words, more than enough for the 20 amino acids. Units larger than three letters would seem wasteful, and the evidence already accumulated suggests that such larger units are unlikely.

The possible singlet, doublet and triplet codes are customarily written in “mRNA language” (mRNA is the complementary molecule that copies the genetic information from DNA during transcription) and are illustrated in Fig. 30–1.

Fig. 30–1. Singlet, doublet and triplet codes of mRNA
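The counting argument above can be checked with a short sketch. It is only an illustration: the name `code_words` is hypothetical, and the four mRNA bases are assumed to be written as the letters U, C, A and G.

```python
from itertools import product

# The four mRNA bases, written as single letters (an assumption of this sketch).
BASES = "UCAG"
AMINO_ACIDS_NEEDED = 20


def code_words(length: int) -> list[str]:
    """Return every possible code word of the given length over the four bases."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


for length, name in [(1, "singlet"), (2, "doublet"), (3, "triplet")]:
    words = code_words(length)
    sufficient = len(words) >= AMINO_ACIDS_NEEDED
    print(f"{name}: {len(words)} code words, sufficient for 20 amino acids: {sufficient}")
```

Only the triplet case yields at least 20 distinct code words, which is the reasoning behind preferring a three-letter code over one- or two-letter codes.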
The first experimental evidence in support of the triplet code was provided by Crick and coworkers in 1961. When they added or deleted one or two base pairs in a particular region of the DNA of T4 bacteriophage (a virus of E. coli), the bacteriophages ceased to perform their normal functions. Bacteriophages with an addition or deletion of three base pairs, however, performed normally. From this experiment they concluded that the genetic code is read in triplets: the addition of one or two nucleotides puts the reading of the code out of order, while the addition of a third nucleotide restores the proper reading of the message.

Strong evidence in favour of triplet coding units also comes from determinations of the coding ratio. The coding ratio is the number of nucleotides in the mRNA (or nucleotide pairs in the double-stranded DNA) divided by the number of amino acid residues in the resulting polypeptide chain. It expresses the number of nucleotides per coding unit. In each of the several genes studied so far, the number of nucleotide pairs of DNA has been estimated by genetic techniques and compared with the number of amino acid residues of the protein synthesised by the gene. All estimates give coding ratios close to three, indicating that three nucleotides compose a coding unit (triplet code).

A wide variety of genetic experiments is consistent with a triplet code. The conclusion is inescapable that sequences of 3 bases in the mRNA molecule (triplets) contain the coded information for the various amino acids. Such a triplet is called a codon. Each triplet codon specifies only one particular amino acid, and the position of the codon in the mRNA molecule specifies the position of the amino acid in the polypeptide chain. As stated above, with four bases, 64 triplet codons (4³) are possible.
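The reading-frame reasoning behind the experiment of Crick and coworkers, described above, can be illustrated with an English-language analogy. This is a simplified sketch, not a model of the actual experiment: the sentence and the helper names `read_triplets` and `insert_letters` are hypothetical, and the three extra letters are inserted at a single site for simplicity.

```python
def read_triplets(message: str) -> list[str]:
    """Read a message as consecutive, non-overlapping three-letter words."""
    usable = len(message) - len(message) % 3  # ignore an incomplete final word
    return [message[i:i + 3] for i in range(0, usable, 3)]


def insert_letters(message: str, position: int, letters: str) -> str:
    """Return a copy of the message with extra letters inserted at the position."""
    return message[:position] + letters + message[position:]


# A message made only of three-letter words, standing in for a run of codons.
message = "THEFATCATATETHEBIGRAT"

original = read_triplets(message)
plus_one = read_triplets(insert_letters(message, 3, "X"))
plus_three = read_triplets(insert_letters(message, 3, "XYZ"))

print("original:  ", " ".join(original))
print("one added: ", " ".join(plus_one))    # every word after the insertion is garbled
print("three added:", " ".join(plus_three))  # one extra word, then the original reading
```

With one added letter, every word downstream of the insertion is misread. With three added letters, one extra word appears and the rest of the message is read as before, which is the behaviour expected if the code is read three letters at a time.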
The genetic dictionary consists of 64 words, each a specific sequence of 3 letters drawn from the 4 letters of the genetic alphabet. Any letter may occur more than once in a codon. An example from the English language may be used to explain this. Consider the 4-letter word “SEAT”. From the four letters of this word, many 3-letter words can be made, each of which conveys a definite meaning, e.g., “SEA”, “SEE”, “SET”, “SAT”, “EAT”, “ASS”, “ATE” and “TEA”.

The genetic code has now been experimentally deciphered and perfected by the combined efforts of many biochemists, notably Marshall Warren Nirenberg and Har Gobind Khorana, who were awarded the 1968 Nobel Prize for their work, along with Robert Holley, who was the first scientist to determine the nucleotide sequence of several tRNAs.

THE GENETIC CODE

The genetic language consists of only four letters, contained in the word “GACU”. These four letters can be combined to form 64 genetic words, each consisting of 3 letters. Each triplet word (codon) has a specific meaning which the cell understands: it codes for a particular amino acid. The genetic code, as at present known, is shown in Fig. 30–2, which gives the base sequence of each codon (triplet) and, against it, the amino acid for which it codes.

Just as different combinations of words make different sentences, each with a specific meaning, so different sequences of codons on mRNA specify different proteins, each with a specific sequence of amino acids. Although the genetic information resides in DNA, the terms genetic code and codon are used with reference to mRNA because mRNA is the nucleic acid that directly determines the sequence of amino acids in a protein. This expression of genetic information in the amino acid sequence of proteins by mRNA is called translation. The DNA-RNA-Protein code may be expressed as under :