The principal goal of the
Human Genome Project (HGP), the decoding of the roughly three billion bases that constitute the
human genome, will be reached by the middle of next year. Interspersed within the genome are an estimated 100,000
genes (some experts argue that this number could be as low as 35,000, while others believe it could reach 150,000), that is, the instructions to make the
proteins that make you a human being.
I think that the HGP can be compared only to the project that put a man on the moon. Just imagine: less than 25 years ago it was thought that sequencing the genome of the bacterium E. coli (about 4.6 million bases) could take more than 100 years.
Before continuing, let me tell you a few facts about genes and their malfunctions. Strictly speaking, a gene is a
DNA sequence composed of four chemicals known as
nucleotides, which are adenine, thymine, cytosine and guanine; however, they are usually, though not correctly, referred to as
bases and abbreviated as A, T, C and G, respectively.
Genes vary in length and sequence according to the protein they code for. They are the instructions the cell uses to synthesize proteins, a process also known as gene expression.
Gene expression is controlled through the interactions of specific DNA sequences with specific protein factors. Each gene has its own
control sequences. You can think of those sequences as switches that are turned on or off by the protein factors, thus allowing the gene to be expressed or not. If you want to see the process of protein synthesis at the molecular level, take a look at this
video made by
Gary Anderson from the Department of Biological Sciences of the University of Southern Mississippi.
When there is a misspelling, that is, a mutation in the gene sequence, the cell might end up with proteins that are shorter, longer, or of a different shape than the normal one, which might lead to disease.
On the other hand, if there is an error in the control sequences, the cell might synthesize too little or too much of the encoded protein, or none at all. Alternatively, the protein may be synthesized at the wrong time or in the wrong tissue, which might also lead to disease.
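To make the idea of a misspelling more concrete, here is a small Python sketch of how changing a single letter in a coding sequence can give a protein that is different, shorter, or longer. The 24-letter gene is made up for the example (it is not a real human sequence), and the helper names translate and mutate are hypothetical; only the codon table, which is the standard genetic code, comes from biology. Each protein is written with the usual one-letter amino acid codes.

```python
# Standard genetic code, listed in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Read the DNA three letters at a time until the first stop codon."""
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def mutate(dna, position, new_base):
    """Return a copy of dna with one letter replaced (0-based position)."""
    return dna[:position] + new_base + dna[position + 1:]


# Made-up toy gene for illustration only.
normal = "ATGGCTGAAAAATGGTAAGGTTGA"

variants = {
    "normal": normal,
    "different amino acid": mutate(normal, 7, "T"),   # GAA -> GTA
    "early stop (shorter)": mutate(normal, 9, "T"),   # AAA -> TAA
    "lost stop (longer)": mutate(normal, 17, "C"),    # TAA -> TAC
}

for label, dna in variants.items():
    print(f"{label:22s} {dna}  ->  {translate(dna)}")
```

In each variant only one letter out of 24 differs from the normal sequence, yet the resulting chain of amino acids changes in composition or in length. Whether such a change actually causes disease depends on the protein and on where the change falls, which this toy example does not model.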
The result of the HGP is nearly impossible to grasp. It would take you, more or less, 26 years to read the code, working 24 hours a day, every day. It would also be very boring, since it would be like reading a book of about three billion characters composed only of the four letters A, T, C and G.
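If you want to check this kind of arithmetic yourself, the following Python sketch relates genome size, reading pace and reading time. The genome length used (about three billion letters, the commonly cited size of the human genome) and the function names are assumptions of the sketch, not figures taken from the HGP itself.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # nonstop reading, ignoring leap years


def years_to_read(n_letters, letters_per_second):
    """Years needed to read n_letters at a constant pace, without breaks."""
    return n_letters / letters_per_second / SECONDS_PER_YEAR


def pace_needed(n_letters, years):
    """Letters per second required to finish n_letters in the given years."""
    return n_letters / (years * SECONDS_PER_YEAR)


GENOME_LETTERS = 3_000_000_000  # assumed approximate size of the human genome

pace = pace_needed(GENOME_LETTERS, 26)
print(f"Pace needed to finish in 26 years: {pace:.2f} letters per second")
print(f"Years at one letter per second: {years_to_read(GENOME_LETTERS, 1):.0f}")
```

Under these assumptions, finishing in 26 years means reading a little under four letters every second, day and night, without a single pause.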