Gene Structure: Searching Genbank and Interpreting the Results

Genbank: The basic information about all known genes is stored in a database called Genbank. This is a searchable database with a wealth of information about all studied genes. This Genbank sample page has definitions of many of the terms used in a Genbank record. You can also search several different genetics texts for explanations of any unfamiliar terms at the Genbank Books site. For this assignment, go to the Darwin 2000 page on Genbank searching, which is a tutorial on how to search Genbank and read a Genbank record, and answer the following questions from the list of questions there.

Answer questions 1-12, 14-16.
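When you read a Genbank record, the CDS feature gives the location of the coding sequence. For a gene with introns, that location is written as a join(...) of several coordinate ranges. The Python sketch below shows one way to turn such a location into a list of coding segments, find the gaps between them (the introns), and join the segments together. It is only an illustration: the function names, the toy sequence, and its coordinates are made up for this example and do not come from any real Genbank record. The parser handles only plain ranges inside join(...), not complement(...) or partial locations.

```python
import re


def parse_cds_location(location):
    """Parse a simple Genbank location such as 'join(1..9,22..30)'.

    Returns (start, end) pairs in Genbank's 1-based, inclusive coordinates.
    Only plain ranges are handled (an assumption made to keep this short).
    """
    spans = re.findall(r"(\d+)\.\.(\d+)", location)
    return [(int(start), int(end)) for start, end in spans]


def introns_between(segments):
    """Return the gaps between consecutive coding segments."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    ]


def splice(sequence, segments):
    """Join the coding segments of a sequence (1-based, inclusive)."""
    return "".join(sequence[start - 1:end] for start, end in segments)


# Hypothetical toy gene: coding segment, intron (starts GT, ends AG), coding segment.
toy_gene = "ATGGTGCAC" + "GTAAGTTTTCAG" + "CTGACTTAA"
toy_location = "join(1..9,22..30)"  # made-up coordinates for the toy gene

segments = parse_cds_location(toy_location)
introns = introns_between(segments)
coding_sequence = splice(toy_gene, segments)

print("coding segments:", segments)
print("introns:", introns)
print("spliced CDS:", coding_sequence)
```

The same idea applies when you answer questions about how many exons and introns a gene has: count the ranges inside the join(...) and the gaps between them.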

Before beginning the assignment, reading the following background material about the hemoglobin gene and basic properties of genes may help you understand and answer some of the questions.

Hemoglobin: During the late 1950s and early 1960s researchers learned a lot about gene function by studying the protein hemoglobin. Hemoglobin is a tetramer made of two alpha subunits and two beta subunits. Each subunit binds a heme molecule, and the iron atom at the center of each heme can bind one oxygen molecule, so one hemoglobin molecule can carry up to four oxygen molecules. Each red blood cell in your blood is filled with millions of molecules of hemoglobin, which is used to transport oxygen. Studies using hemoglobin were the first to show that an alternative phenotype could be caused by the change of a single amino acid in the sequence of a protein. Studies of various hemoglobin mutants were important in learning how changes in the DNA sequence could have so many different possible effects.

Sickle-cell Disease: One of the more common genetic diseases in humans, especially in Africa, is sickle-cell disease. A change in a single base pair of the gene for the beta subunit of hemoglobin causes a change in one amino acid, which leads to an alteration of the shape of the red blood cells (sickling). The altered red blood cells are rapidly removed from the circulation, leading to anemia and a range of other harmful effects.

Malaria: Chills, fever, headache and vomiting are but a few symptoms of the disease called malaria. Malaria is caused by protozoan parasites of the genus Plasmodium, such as Plasmodium falciparum and Plasmodium vivax, which are found mainly in tropical regions. Plasmodium reproduces in mosquitoes of the genus Anopheles. Plasmodium is transmitted to humans when an infected Anopheles mosquito bites a human and the infectious stage of Plasmodium, called sporozoites, enters the human bloodstream and travels to the liver. Sporozoites reproduce in liver cells and release progeny (merozoites) into the bloodstream. The merozoites infect red blood cells, reproduce, and rupture these cells to infect more red blood cells. Individuals who are heterozygous for the sickle-cell mutation have a higher resistance to malaria than wild-type individuals. This resistance is thought to occur because the red blood cells of heterozygotes sickle when infected by Plasmodium, just like the red blood cells of the homozygotes. The sickled cells are removed from the blood when they get trapped in the narrow vessels of the spleen. The high incidence of malaria in Africa has presumably led to natural selection of carriers of the sickle-cell mutation, leading to high levels of this genetic disease in Africa.
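The idea that selection for carriers keeps the sickle-cell mutation common can be illustrated with a small calculation. The Python sketch below follows the frequency of the sickle-cell allele from one generation to the next in a simple textbook model of selection. The fitness values are hypothetical numbers chosen only for illustration, not measurements: normal homozygotes are assumed to lose a fraction s of their fitness to malaria, and sickle-cell homozygotes a fraction t to sickle-cell disease, while heterozygotes have the highest fitness. Under these assumptions the allele frequency is expected to settle at s / (s + t).

```python
def next_frequency(q, s, t):
    """Return the sickle-cell allele frequency after one generation of selection.

    Assumed relative fitnesses (hypothetical, supplied by the caller):
      normal homozygote   1 - s   (more likely to die of malaria)
      heterozygote        1       (carrier, most resistant)
      sickle homozygote   1 - t   (sickle-cell disease)
    """
    p = 1.0 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * q + q * q * (1 - t)) / mean_fitness


def simulate(q0, s, t, generations):
    """Apply selection repeatedly, starting from allele frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, s, t)
    return q


s, t = 0.1, 0.8  # hypothetical selection coefficients
q_final = simulate(0.01, s, t, 500)
q_expected = s / (s + t)
print("simulated frequency:", round(q_final, 4))
print("predicted equilibrium:", round(q_expected, 4))

# With no malaria (s = 0) carriers have no advantage and the allele becomes rarer.
q_no_malaria = simulate(0.1, 0.0, t, 500)
print("frequency without malaria:", round(q_no_malaria, 4))
```

In this model the mutation stays in the population only while malaria gives carriers an advantage, which matches the argument above.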

Proteins: Look at the picture below showing the structure of the normal hemoglobin protein with the position of the mutated amino acid in the sickle cell hemoglobin protein marked in yellow. One hypothesis explaining the drastic effect of this small change in the protein is that the mutant proteins "clump together", and this clumping is then responsible for the physical changes in the red blood cell.

Sequence (DNA to hnRNA to mRNA to Protein): A gene is composed of DNA, which is transcribed to RNA, which is then translated into protein. In eukaryotes this process has the additional step of RNA processing between the synthesis of the RNA and translation. Thus, in addition to the mutations in the coding sequence (CDS), such as missense, nonsense, and frameshift mutations, that you studied in the Hemoglobin Lab, it is also possible to have splicing mutations that affect the splicing out of the introns, and regulatory mutations that affect the promoter sequences.
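The effect of the different kinds of coding-sequence mutations can be seen by translating a short sequence before and after a change. The Python sketch below is a minimal illustration using the standard genetic code. The short DNA sequence is a toy example modeled on the start of a beta-globin coding sequence and is not copied from a Genbank record. The function and variable names are made up for this example. The missense change shown (GAG to GTG, glutamic acid to valine) is the same kind of single-base change that causes sickle-cell disease.

```python
# Standard genetic code, built from the usual UCAG ordering of the codon table.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA one codon at a time, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def protein_from(coding_dna):
    return translate(transcribe(coding_dna))


# Toy coding sequence (illustrative only, not a full gene).
normal = "ATGGTGCACCTGACTCCTGAGGAGAAGTCT"

missense = normal[:19] + "T" + normal[20:]   # GAG -> GTG: one amino acid changes
nonsense = normal[:18] + "T" + normal[19:]   # GAG -> TAG: early stop codon
frameshift = normal[:19] + normal[20:]       # one base deleted: reading frame shifts

for name, dna in [("normal", normal), ("missense", missense),
                  ("nonsense", nonsense), ("frameshift", frameshift)]:
    print(name.ljust(10), protein_from(dna))
```

Comparing the four protein sequences shows why a missense mutation changes a single amino acid, a nonsense mutation shortens the protein, and a frameshift changes every amino acid after the mutation.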



This document is maintained by: Jeff Bell