Genomics

The study of entire genomes

Some sample sizes

  • ΦX174 (first complete sequence): 5,000 bp
  • Yeast chromosome 3 (first sequenced chromosome): 350,000 bp
  • E. coli genome: 4.6 x 10^6 bp
  • Largest yeast chromosome: 1.5 x 10^6 bp
  • Entire yeast genome: 12 x 10^6 bp
  • Smallest human chromosome (Y): 50 x 10^6 bp
  • Largest human chromosome (1): 250 x 10^6 bp
  • Entire human genome: 3.3 x 10^9 bp
    • 399,000 pages of text
    • 536 1.4 MB floppies
    • >1 CD
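The floppy and CD counts for the human genome can be checked with a little arithmetic. The sketch below is not from the original notes. It assumes 2 bits per base (four possible bases), a round genome size of 3.0 x 10^9 bp, a floppy capacity of 1.4 x 10^6 bytes, and a 650 MB CD. Under those assumptions the floppy count comes out at 536; using 3.3 x 10^9 bp would give a somewhat larger number.

```python
import math

def storage_bytes(n_bases, bits_per_base=2):
    """Bytes needed to store n_bases at a fixed number of bits per base."""
    return n_bases * bits_per_base / 8

# All three constants are assumptions made for this sketch.
GENOME_BP = 3_000_000_000   # round figure, not the 3.3 x 10^9 bp listed above
FLOPPY_BYTES = 1_400_000    # "1.4 MB" taken as 1.4 x 10^6 bytes
CD_BYTES = 650_000_000      # a 650 MB CD-ROM

size = storage_bytes(GENOME_BP)
floppies = math.ceil(size / FLOPPY_BYTES)
cds = math.ceil(size / CD_BYTES)

print(f"{size / 1e6:.0f} MB, {floppies} floppies, {cds} CDs")
```

The 2-bit encoding is the tightest fixed-width code for A, C, G, and T; storing one ASCII character per base would take four times as much space.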

We now have the complete sequences of 19 different organisms (4 archaebacteria, 13 eubacteria, and 2 eukaryotes)

  • The first complete sequence was for Haemophilus influenzae (a Gram-negative bacterium)
    • 1,830 Kb; 1,743 genes, 40% of unknown function
  • Smallest is Mycoplasma genitalium (a Gram-positive bacterium)
    • 580 Kb; 482 genes; 470 coding sequences covering 88% of the genome, average size 1 Kb; 96 genes with no known match in other species
    • comparison of the two genomes suggests that a minimal prokaryotic organism would need 250 genes
    • a minimal cellular organism would need 125 genes (assuming an RNA genome and the elimination of all duplicated genes)
  • An archaebacterium, Methanococcus jannaschii
    • 1,700 Kb; 1,738 protein coding genes, 62% with no known function
    • metabolism genes are prokaryote-like; transcription, translation, and replication genes are eukaryote-like
  • The first eukaryote sequenced, Saccharomyces cerevisiae
    • 12,068 Kb in 16 different chromosomes
    • 5,885 potential genes (6,275 ORFs), 70% of genome
      • 140 rRNA genes, 40 small nuclear RNA genes, 275 tRNA genes and 52 transposable elements (Ty1 and Ty2)
    • many of the differences between homologous chromosomes are due to transposition events
    • many duplicated regions (with small differences - cluster homology regions)
    • S. cerevisiae Proteome
      • the total of all of the proteins
        • 50% can be classified by homology with known genes; 1,000 already had known functions (out of 5,885)
        • 11% are genes for metabolism, 7% for transcription, 6% for translation, down to 200 different transcription factors
  • The first multicellular eukaryote to be sequenced was C. elegans (a roundworm), finished in December 1998
    • A model organism for developmental studies
      • 959 somatic cells in the adult hermaphrodite, with a complete description of their embryonic history
    • 97 million bp genome
      • 19,099 genes
      • 25% in operons (oops!)
      • 42% match genes in organisms other than nematodes
      • Another 34% match other nematode genes
      • half have no known function
  • Can now study function by reversing classical genetics; instead of finding a mutant phenotype and working back to the gene, start with the gene, mutate it (knock-outs) and look for the altered phenotype
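One way to compare the genomes listed above is by gene density. The following sketch, which is not part of the original notes, divides each genome size by its gene count to get the average amount of DNA per gene. The sizes and counts are the ones quoted above; the M. jannaschii count covers protein-coding genes only, and the function name `kb_per_gene` is made up for this example.

```python
# (genome size in Kb, gene count) as quoted in the notes above
genomes = {
    "H. influenzae": (1_830, 1_743),
    "M. genitalium": (580, 482),
    "M. jannaschii": (1_700, 1_738),   # protein-coding genes only
    "S. cerevisiae": (12_068, 5_885),
    "C. elegans": (97_000, 19_099),
}

def kb_per_gene(size_kb, n_genes):
    """Average genome length per gene, in Kb."""
    return size_kb / n_genes

spacing = {name: kb_per_gene(*entry) for name, entry in genomes.items()}

for name, kb in spacing.items():
    print(f"{name:15s} {kb:5.2f} Kb per gene")
```

With these figures the three prokaryotes come out at roughly 1 Kb per gene, yeast at about 2 Kb, and C. elegans at about 5 Kb, so gene density drops as the organisms become more complex.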

Human Genetic Diseases

  • Enzyme deficiencies
    • PKU 1/10,000
    • Albinism 1/17,000
    • Lesch-Nyhan
    • Tay-Sachs (only common in Ashkenazi Jews)
  • Other (developmental, neurological, regulatory, transport, etc.)
    • Cystic Fibrosis
  • Cancer genes
  • Fragile X - most common form of inherited mental retardation
    • 1/1500 in males, 1/2500 in females
    • caused by expansion of a 3 bp repeat in FMR-1 (CGG)
      • normal individuals have 6-54 repeats
      • carrier females have 50-200 repeats
      • affected individuals have 200-1300 repeats
      • once the repeat count exceeds 50, the mutation rate becomes very high
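The repeat ranges above overlap at their edges (50-54 and 200), so a single count can fall into two categories. The sketch below is an illustration only, not a diagnostic rule: it takes the ranges exactly as given in these notes and returns every category that contains a given CGG repeat count. The names `RANGES`, `classify_cgg`, and `is_unstable` are invented for this example.

```python
# Repeat-count ranges as listed in the notes (inclusive at both ends)
RANGES = {
    "normal": (6, 54),
    "carrier": (50, 200),
    "affected": (200, 1300),
}

def classify_cgg(n_repeats):
    """Return every category whose range contains n_repeats."""
    return [label for label, (low, high) in RANGES.items()
            if low <= n_repeats <= high]

def is_unstable(n_repeats):
    """The notes give 50 repeats as the point above which mutation rate is very high."""
    return n_repeats > 50

for n in (30, 52, 120, 200, 800):
    print(n, classify_cgg(n), "unstable" if is_unstable(n) else "stable")
```

Returning a list rather than a single label keeps the overlap visible instead of hiding it behind an arbitrary tie-break.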

Human Genome Project Goals

  • Complete a detailed human genetic map (resolution: 2 Mb)
    • map 3,000 markers (RFLPs, VNTRs, microsatellites, etc.)
  • Complete a physical map (resolution: 0.1 Mb)
    • a complete restriction map
  • Acquire the genome as clones (resolution: 5 kb)
    • an ordered contig map
  • Determine the complete sequence (resolution: 1 bp)
    • at $1-2/base, this will cost approximately 3 billion dollars
  • Find all genes and determine their function
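The cost and marker figures in these goals follow from simple arithmetic on the genome size. The sketch below is not from the original notes; it assumes the 3.3 x 10^9 bp genome size quoted earlier and, for the marker calculation, that the 3,000 markers are spread evenly across the genome (real markers are not).

```python
GENOME_BP = 3_300_000_000   # human genome size quoted earlier in these notes
N_MARKERS = 3_000           # genetic map goal

def sequencing_cost(n_bases, dollars_per_base):
    """Total cost of sequencing n_bases once at a flat per-base price."""
    return n_bases * dollars_per_base

low = sequencing_cost(GENOME_BP, 1)    # at $1 per base
high = sequencing_cost(GENOME_BP, 2)   # at $2 per base

# Average distance between markers, assuming even spacing (an idealization)
marker_spacing_mb = GENOME_BP / N_MARKERS / 1e6

print(f"cost: ${low / 1e9:.1f} to ${high / 1e9:.1f} billion")
print(f"average marker spacing: {marker_spacing_mb:.1f} Mb")
```

The 3 billion dollar estimate matches the low end of the price range, and 3,000 evenly spaced markers would sit about 1.1 Mb apart, which leaves some margin against the 2 Mb target for uneven coverage.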


This document is maintained by: Jeff Bell
Last Update: Monday, May 10, 1999