bioinformatics

 

week 8 homework

Biology 398INA: Topics in Bioinformatics
Homework # 8
Using 2-dimensional modeling programs

Due March 15

Please send me your answers by email. You can either create a new file or download the MS Word file and type your answers into it.

"week8homework"

Part I: Predicting RNA secondary structure

  1. Go to http://bioinfo.math.rpi.edu/~zukerm/
  2. Under “research,” select "An mfold manual: RNA folding by energy minimization."
  3. Go to the introduction.
    • What determines the activity of RNA?
    • What is partially responsible for translational controls in mRNA and replication controls in single-stranded RNA viruses?
    • Why is there a need for modeling RNA secondary structure?
  4. Now go to “Loops and Nearest neighbor rules.”
    • What are free energies assigned to?
    • Why are pseudoknots and base triples excluded?
    • Scroll to the bottom of the page. What are DNA folding parameters based on?
  5. Now go to Entrez (http://www.ncbi.nlm.nih.gov/Entrez) and select nucleotide.
  6. Type in "BC005462"
  7. Copy the complete DNA sequence (remember that it is a cDNA).
  8. Go back to http://bioinfo.math.rpi.edu/~zukerm/
  9. Under “research,” select The mfold server - Fold your RNA sequences.
  10. Paste your sequence into the box, select “a batch job,” then submit your query using the default settings (be sure to give the server your email address).
  11. When you get your results back,
    • What percentage of the base pairs in the energy dot-plot were included in the foldings?
    • What is the free energy for structure 1?
    • What is the free energy for structure 26?
  12. Select the jpg image for structure 1 and attach it to your homework (you will get many structures).
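
If you are curious what a folding program does with your sequence, the sketch below may help. It is not mfold's algorithm: mfold minimizes free energy with the nearest-neighbor rules from step 4, whereas this toy version uses a much simpler dynamic-programming method (Nussinov-style base-pair maximization) that only counts nested pairs and ignores energies entirely. The function names and the minimum hairpin loop size of 3 are assumptions made for illustration.

```python
def cdna_to_rna(cdna):
    """Rewrite a sense-strand cDNA sequence as RNA (T becomes U)."""
    return cdna.upper().replace("T", "U")


def max_base_pairs(rna, min_loop=3):
    """Toy Nussinov-style folding: the largest number of nested base pairs.

    Watson-Crick and G-U wobble pairs are allowed. min_loop is the assumed
    minimum number of unpaired bases inside a hairpin loop.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within rna[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]          # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (rna[k], rna[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# A short made-up hairpin, not part of BC005462:
rna = cdna_to_rna("GGGAAATCC")
print(rna, max_base_pairs(rna))  # 3 pairs: G-C, G-C and G-U around the AAA loop
```

Because this sketch scores every pair equally, it cannot rank alternative structures the way the free energies in your mfold output do.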

Part II: Predicting Protein secondary structure
  1. What is protein secondary structure?
  2. What are the four types of secondary structures?
  3. How does the Chou-Fasman algorithm predict secondary structures?
  4. Copy the complete amino acid sequence for BC005462 and load it into biologist's workbench.
  5. Run it through CHOFAS (under protein tools).
  6. How many alpha-helices does it predict?
  7. How many amino acids are in alpha-helices?
  8. How many beta-pleated sheets (only count ones with 5 or more amino acids in a row)?
  9. How many amino acids are in beta-pleated sheets?
  10. How many of the helices overlap with beta-pleated sheets?
  11. How can you decide which prediction is correct?
  12. How does the GOR algorithm predict secondary structure?
  13. Run BC005462 through GOR4 (under protein tools).
  14. How many alpha-helices does it predict (do not count anything shorter than 5 amino acids)?
  15. How many beta-pleated sheets (only count ones with 5 or more amino acids in a row)?
  16. How many of the helices overlap with beta-pleated sheets?
  17. Run BC005462 through PELE (under protein tools).
  18. How many different programs does PELE use?
  19. How many involve neural networks or machine learning?
  20. What is JOI?
  21. How many alpha-helices were predicted by JOI?
  22. How many beta-sheets?
  23. How many beta turns?
  24. How do neural networks work?
  25. How are they trained?
  26. Go to http://www.cmpharm.ucsf.edu/~nomi/nnpredict-instrucs.html
    • How does nnPREDICT predict secondary structure?
    • How were the weights determined?
  27. Now run the amino acid sequence of BC005462 through nnPREDICT at http://www.cmpharm.ucsf.edu/~nomi/nnpredict.html
  28. How many alpha-helices does it predict?
  29. How many beta-pleated sheets?
  30. Now go to http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_server.html and select “references.”
  31. How does DPM work?
  32. How does DSC work?
  33. How does HNN work?
  34. How does PHD work?
  35. How does Predator work?
  36. How does Simpa 96 work?
  37. How does SOPM work?
  38. Now, select NPS@ at the top of the page.
  39. Scroll down the NPS@ page to SECONDARY STRUCTURE CONSENSUS PREDICTION.
  40. Run the amino acid sequence of BC005462 using the defaults.
  41. How many alpha-helices were predicted by the consensus?
  42. How many beta-pleated sheets were predicted by the consensus?
  43. Which program was the most different from the consensus?
  44. Which program was the closest?
  45. Now go to http://cubic.bioc.columbia.edu/predictprotein/submit_def.html and try running it through PredictProtein.
  46. How many alpha-helices?
  47. How many beta-pleated sheets?
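
Several questions in Part II ask you to count helices and sheets of at least 5 residues, to check whether helices overlap sheets, and to compare individual programs against a consensus. If you would rather script the counting than do it by eye, a minimal sketch follows. It assumes each prediction has been copied out as a one-letter-per-residue string (H = helix, E = sheet, C = coil); the real servers use their own letter codes and output layouts, so you may need to adapt it. The majority-vote consensus with "?" for ties is an assumption for illustration and is not necessarily how NPS@ builds its consensus. The three strings are made up, not output for BC005462.

```python
from collections import Counter
from itertools import groupby


def segments(pred, state, min_len=5):
    """Return 1-based (start, end) runs of `state` that are at least min_len long."""
    found = []
    pos = 0
    for letter, run in groupby(pred):
        length = len(list(run))
        if letter == state and length >= min_len:
            found.append((pos + 1, pos + length))
        pos += length
    return found


def count_overlaps(helices, sheets):
    """Count helices that share at least one residue with any sheet."""
    return sum(
        any(h_start <= s_end and s_start <= h_end for s_start, s_end in sheets)
        for h_start, h_end in helices
    )


def consensus(predictions):
    """Per-residue majority vote; a tie for first place is written as '?'."""
    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        tied = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
        result.append("?" if tied else ranked[0][0])
    return "".join(result)


def agreement(pred, reference):
    """Fraction of positions where pred matches reference."""
    return sum(a == b for a, b in zip(pred, reference)) / len(reference)


# Hypothetical predictions from three programs for a 20-residue protein.
prog_a = "CCHHHHHHCCEEEEEECCCC"
prog_b = "CHHHHHHHCCCEEEEECCCC"
prog_c = "CCHHHHCCCCEEEECCCHHH"

cons = consensus([prog_a, prog_b, prog_c])
print("consensus:", cons)
print("helices  :", segments(cons, "H"))
print("sheets   :", segments(cons, "E"))
for name, pred in [("A", prog_a), ("B", prog_b), ("C", prog_c)]:
    print(name, "agreement with consensus:", agreement(pred, cons))
```

The program with the lowest agreement is the one "most different from the consensus" in the sense of question 43, and the highest is the closest.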

Part III: Predicting other kinds of structures
  1. How does the Kyte-Doolittle algorithm predict hydrophobicity?
  2. Now go to Entrez (http://www.ncbi.nlm.nih.gov/Entrez) and select nucleotide.
  3. Type in "NM_079687" (this is the sequence for a rhodopsin). So far we have been dealing with a soluble protein because the secondary structure prediction programs work better with those, but now we’re switching to membrane-bound ones.
  4. Copy the complete amino acid sequence for NM_079687 and load it into biologist's workbench.
  5. Run it through GREASE.
  6. What does the Y-axis represent?
  7. How many hydrophobic domains do you predict? (Count the number of peaks that reach at least 1.0 on the Y axis).
  8. Now run NM_079687 through TMAP.
  9. How many trans-membrane domains does it predict and where are they?
  10. How do they compare with the hydrophobic peaks predicted by GREASE?
  11. Now run NM_079687 through TMHMM.
  12. How many trans-membrane domains does it predict and where are they?
  13. How do they compare with the trans-membrane domains predicted by TMAP?
  14. Now go to http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_server.html
  15. Select “transmembrane helices prediction” under “miscellaneous analysis tools.”
  16. How does PHDhtm predict transmembrane domains?
  17. Now run NM_079687 through PHDhtm.
  18. How many trans-membrane domains does it predict and where are they?
  19. How do they compare with the trans-membrane domains predicted by TMHMM?
  20. Return to http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_server.html
  21. Select “coiled-coil” under “miscellaneous analysis tools.”
  22. How does it predict coiled coils?
  23. Now run NM_079687 through Coiled-coil.
  24. How many coiled-coils does it find?
  25. When you get your PredictProtein results back, you should also get output from PROSITE motif searches.
  26. Does BC005462 have any glycosylation sites?
  27. Any Protein Kinase C phosphorylation sites?
  28. Any Casein Kinase 2 phosphorylation sites?
  29. Any Tyrosine Kinase phosphorylation sites?
  30. Any myristoylation sites?
  31. Any ATP binding sites?
  32. Now go to Entrez (http://www.ncbi.nlm.nih.gov/Entrez) and select protein.
  33. Type in "AAG50089" (this is the sequence for a plastocyanin).
  34. Copy the amino acid sequence, then go to http://www.cbs.dtu.dk/services/ and select “TargetP.”
  35. Click on abstract.
    • How does TargetP work?
    • What percentage of plant proteins are mitochondrial?
  36. Go back to the TargetP server, paste your sequence into the window, select “plant” under “origin of sequences” and “perform cleavage site predictions,” then submit.
  37. Where does it predict plastocyanin will be localized?
  38. What is the probability that it has a chloroplast transit peptide?
  39. What is the predicted length of the transit peptide?
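
For the hydrophobicity questions at the start of Part III, it may help to see the sliding-window idea behind a Kyte-Doolittle plot written out. The sketch below averages the Kyte-Doolittle hydropathy values over a window of residues and then counts the separate stretches that reach 1.0, which is the same rule of thumb as question 7. The window size of 19 is an assumption (a value often used when looking for membrane-spanning segments); GREASE's own window setting may differ, so do not expect the numbers to match its plot exactly. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def count_peaks(profile, threshold=1.0):
    """Count separate stretches where the profile is at or above threshold."""
    peaks = 0
    above = False
    for value in profile:
        if value >= threshold and not above:
            peaks += 1  # a new stretch starts here
        above = value >= threshold
    return peaks


# A made-up toy sequence (not NM_079687): two leucine-rich ends
# separated by a run of aspartate, scanned with a short window.
toy = "LLLLLDDDDDLLLLL"
profile = hydropathy_profile(toy, window=5)
print([round(v, 2) for v in profile])
print("hydrophobic stretches:", count_peaks(profile))
```

A shorter window gives a noisier profile with more, narrower peaks, so the number of "hydrophobic domains" you count depends on the window as well as on the threshold.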




Last update: Thursday, March 13, 2003 at 4:34:16 PM.