
Project 1

Biology 398INA: Topics in Bioinformatics
Project # 1: Characterizing a gene family by means of bioinformatics

Due March 14, 2003

"project1.doc"

Pick a gene by name, and run it through the entire suite of procedures.

Either find the orthologues from at least 10 different species, or find at least 10 different paralogues from the same species. (Note: sequences that differ by mutations of just a few bases don't count as paralogues for this assignment!)


Use Entrez to find one sequence for your gene, then work through the following steps:

  1. BLAST/FASTA to find homologues
  2. BLOCKMAKER, PFSCAN or other programs to find conserved regions
  3. PSI-BLAST, COBBLER, LAMA to find more distant homologues
  4. VAST & combinatorial extension to find related genes by structure
  5. ClustalW and other multiple sequence alignments to find and highlight conserved regions
  6. PHYLIP to construct phylogenies using these multiple alignments:
    • One based on distance matrix methods
    • One based on maximum parsimony
    • One based on maximum likelihood (you may need to use DNA sequences for this)
  7. Evaluate your trees
    • bootstrap analysis
    • shuffling the input order of the sequences, or using other online resources
  8. Use MacVector, Primer3, etc. to design primers to study the gene in populations
  9. Pick an organism that has genomic sequence including your gene, then use Genefinder, etc. to find nearby genes in the genomic DNA
  10. CODEHOP to design primers to fish out related genes from cDNA of a new organism
  11. Design a protocol for cloning the coding sequence of one of these genes into Bluescript
    • Identify ORF
    • Design primers that lie as close to the start & stop codons as possible, and evaluate them using MacVector or a primer finder
    • Describe the procedure for performing the cloning, including the restriction enzymes used and how you will prepare the DNA, and print a map of the recombinant plasmid
    • Print 50 bp of sequence at each junction between the plasmid and the cloned DNA
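
For the "Identify ORF" step, the idea can be sketched in a few lines of Python. This is only a minimal illustration, not a substitute for MacVector or an online ORF finder: it assumes an intron-free sequence (e.g. cDNA), the standard genetic code, and that the ORF begins at an ATG. The function names (reverse_complement, orfs_in_strand, longest_orf) are hypothetical, chosen for this example.

```python
# Minimal ORF finder: scans all six reading frames for ATG ... stop stretches.
# Assumes the standard genetic code and an uninterrupted (intron-free) sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons=1):
    """Yield (start, end) for each ORF on one strand.

    start is the index of the A in ATG; end is exclusive and includes
    the stop codon. min_codons counts codons before the stop.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None  # look for the next ORF in this frame


def longest_orf(seq):
    """Return (strand, start, end, orf_sequence) for the longest ORF, or None.

    Coordinates refer to the strand reported: "+" is the input sequence,
    "-" is its reverse complement.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for start, end in orfs_in_strand(s):
            if best is None or end - start > best[2] - best[1]:
                best = (strand, start, end, s[start:end])
    return best


if __name__ == "__main__":
    # Toy sequence made up for this example, not a real gene.
    print(longest_orf("ccATGAAATTTGGGTAAcc"))
```

The start and end coordinates it reports are where the cloning primers would need to sit; ORFs that run off the end of the sequence without a stop codon are ignored in this sketch.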

Report: email me the results for each of these tests.

Suggested Paralogues (i.e., multi-gene families within an organism)

  • Globins
  • Steroid receptors
  • Keratins
  • Cytochrome P-450
  • DEAD box
  • Sodium channel
  • Leucine Zipper proteins
  • Myosin
  • Actin
  • Multi-drug resistance
  • Porins
  • Integrins
  • Tyrosine kinase receptors
  • G-protein-linked receptors
  • G-proteins
  • Tripartite motif proteins
  • Major Histocompatibility Complex proteins
  • Homeodomain proteins

Suggested Orthologues (these are trickier, because you can’t be certain whether you are looking at the correct counterpart unless there is only one copy of the gene)

  • Malonyl-CoA transacylase
  • Catalase
  • Myoglobin
  • Cytoplasmic malate dehydrogenase
  • Beta-hydroxyacyl-ACP dehydratase
  • Lauroyl acyltransferase
  • HMG-CoA synthase
  • HMG-CoA reductase




Last update: Friday, February 7, 2003 at 4:39:40 PM.