week 4 homework
Biology 398INA: Topics in Bioinformatics
Homework # 4: Multiple sequence alignment
Due February 8, 2003
Please send me your answers by email. You can either create a new file,
or download the MS Word file ("week 4.doc") and type in your answers.
Part I: Multiple alignment
A. Go to http://www.umanitoba.ca/faculties/afs/plant_science/courses/39_769/lec06/lec06.2.html
- Why can the PIMA alignment be thought of as occurring in a "star"
configuration?
- How does the efficiency of a tree alignment compare with that of a star
alignment?
- What subtle feature distinguishes tree alignments from star alignments?
B. Go to the BCM search launcher (http://searchlauncher.bcm.tmc.edu/multi-align/multi-align.html).
C. Click on the "H" next to "MAP."
- Why is MAP good at producing an alignment where there are long
terminal or internal gaps in some sequences?
D. Click on the "H" next to "PIMA 1.4."
- How is each cluster multiply-aligned?
E. Copy the "CLUSTAL Query" from the "Week 4 websites."
F. Paste it into the BCM search launcher.
G. Click the ClustalW 1.8 button, then click "submit."
H. Copy the output in the green rectangle beneath "Alignment Data
(Fasta format)," then click the BOXSHADE hyperlink at the bottom of the
page.
I. At the BOXSHADE site, select "RTF_old"
as the output format (this is important!), "consensus line with letters,"
and "other" as the input sequence format, then click "Run BOXSHADE."
J. Click the "here is your output number 1" link, then save the
resulting RTF file and attach it to your homework. Note: you can also do this using
MacVector or at any other site that runs CLUSTAL W, but you will have
more trouble saving the output!
K. Repeat steps E-J, except click the PIMA 1.4 button at step G.
L. Repeat steps E-J, except click the MAP button at step G.
1. Do you see any differences between the three alignments in the order
in which the sequences are listed?
2. Do you see any differences between the three alignments in the consensus?
3. Do you see any other differences between the three alignments?
M. Go to the biologist's workbench (http://workbench.sdsc.edu/) and load these
sequences into your proteins folder.
N. Align them using CLUSTAL W and MSA.
O. When you receive your output, click on the "import sequences" button. This
will save your alignments to your "alignment tools" folder.
P. Print out the alignments using Texshade - one of the programs that you can
select in the "alignment tools" folder (try varying the settings to get the
most useful information about similarities and differences).
1. Are there any differences between the two alignments?
2. Are there any differences between the ClustalW alignments from WorkBench
& SearchLauncher?
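The comparison questions in steps L and P can also be checked mechanically once each alignment has been saved as FASTA-format text. The following is a minimal Python sketch, not part of the assignment. It assumes every alignment uses "-" for gaps and the same sequence names; the function names and the toy sequences are made up for illustration, and the majority-rule consensus is a simplification that will not necessarily match the BOXSHADE or Texshade consensus line.

```python
from collections import Counter


def read_aligned_fasta(text):
    """Parse FASTA-format alignment text into an ordered list of (name, row)."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def majority_consensus(records, gap="-"):
    """Most common non-gap residue per column (gap if the column is all gaps)."""
    rows = [row for _, row in records]
    consensus = []
    # zip stops at the shortest row, so ragged input is truncated, not padded.
    for column in zip(*rows):
        residues = [c.upper() for c in column if c != gap]
        if residues:
            consensus.append(Counter(residues).most_common(1)[0][0])
        else:
            consensus.append(gap)
    return "".join(consensus)


def compare_alignments(first, second):
    """Summarize order, consensus, and per-sequence differences."""
    rows_first, rows_second = dict(first), dict(second)
    shared = [name for name, _ in first if name in rows_second]
    return {
        "same_order": [n for n, _ in first] == [n for n, _ in second],
        "same_consensus": majority_consensus(first) == majority_consensus(second),
        "rows_that_differ": [n for n in shared if rows_first[n] != rows_second[n]],
    }


# Toy data only: these are NOT the homework sequences.
clustal_text = ">s1\nMKT-AL\n>s2\nMKTQAL\n>s3\nMRTQAL\n"
pima_text = ">s2\nMKTQAL\n>s1\nMKTA-L\n>s3\nMRTQAL\n"

report = compare_alignments(read_aligned_fasta(clustal_text),
                            read_aligned_fasta(pima_text))
print(report)
```

In the toy data the two alignments list the sequences in a different order and place the gap in s1 differently, yet give the same majority consensus, which is why the three questions are worth answering separately.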
Part II: Finding BLOCKs
A. Go to the BLOCKS website (http://blocks.fhcrc.org/blocks/) and click
on "About Blocks."
- What are BLOCKS?
- How are the blocks for the Blocks Database made?
B. Copy the "Clustal query" sequences, then go to the BLOCKS website
and paste them into BLOCKMAKER.
- How many blocks did you find?
- Were any known?
C. Your output should include a COBBLER sequence from MOTIF and,
below it, hyperlinks for a BLAST search and a PSI-BLAST search using this
sequence. Perform a BLAST search.
- How many significant hits did you get with BLAST?
- Were any not 3-ketoacyl-ACP reductases or oxoacyl-ACP reductases?
D. Now click on the 3D-blocks button immediately below.
- How many hits did you get?
- Were any of the proteins not identified by BLAST?
E. Near the top of your BLOCKS output (2 lines below the Proweb tree
viewer button) you will see "search" with LAMA and About LAMA hypertext links
next to it. Click on About LAMA.
- What does LAMA do?
- What can LAMA do for you?
F. Close the LAMA help window, and click on LAMA. You will get a new
window with your blocks entered in the correct format. Fill in your email address
(highly recommended!) then click the "perform the search" button.
1. How many hits did you get?
2. Were any not found by BLAST?
Part III: Finding motifs
A. Copy the "Clustal query" sequences, then go to the MEME website
(http://meme.sdsc.edu/meme/website/ or try the French site http://bioweb.pasteur.fr/seqanal/motif/meme/),
select the meme-submission form, paste your sequences into the window, enter
your email address, then click the "start search" button.
- How many motifs does it find?
- Are any different from the sites identified by BLOCKMAKER?
B. Go to the biologist's workbench (http://workbench.sdsc.edu/), select
"protein tools," then select gi|1345959| and analyze it with HMMPFAM.
- What motifs does it identify?
C. Now analyze gi|1345959| with PFSCAN.
- What motifs does it identify?
D. Now copy the sequence for gi|1345959|, then analyze it with PFSCAN at PROSITE: http://hits.isb-sib.ch/cgi-bin/PFSCAN
- What motifs does it identify?
- Are there any differences between the two sites?
E. Now copy the sequence for gi|1345959|, then analyze it with InterProScan: http://www.ebi.ac.uk/interpro/scan.html
- What motifs does it identify?
- Are any not previously identified?
F. Now analyze P49327 (gi|1345959|) with Pfam at the Sanger Institute: http://www.sanger.ac.uk/Software/Pfam/search.shtml
- What motifs does it identify?
- Are any not previously identified?
G. Now go to "Other motif sites" (http://us.expasy.org/tools/#pattern) and pick three other sites. For each site:
- Who operates the site?
- What does it do?
- Does it allow you to find something you couldn't find at any of the sites listed above?
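Several questions in Part III ask whether a site reports motifs that were "not previously identified." One way to keep track is to record what each tool reports and take set differences in the order the tools were run. The sketch below is only a bookkeeping aid: the motif names are hypothetical placeholders, not real results, and it assumes you have already decided which names from different databases refer to the same motif (the sites often label the same domain differently).

```python
def newly_identified(results):
    """For each tool, in run order, list the motifs no earlier tool reported.

    `results` is a list of (tool_name, iterable_of_motif_names) pairs.
    """
    seen = set()
    report = {}
    for tool, motifs in results:
        current = set(motifs)
        report[tool] = sorted(current - seen)  # new relative to earlier tools
        seen |= current
    return report


# Placeholder names only; fill in what each site actually reports.
results = [
    ("HMMPFAM (Workbench)", ["domain_1", "domain_2"]),
    ("PFSCAN (Workbench)", ["domain_2", "pattern_1"]),
    ("InterProScan", ["domain_1", "pattern_1", "domain_3"]),
]

new_by_tool = newly_identified(results)
for tool, motifs in new_by_tool.items():
    print(tool, "->", motifs)
```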
Appendix: Using CLUSTAL W within MacVector
- Start MacVector
- Create a new protein sequence file for each sequence in the "Clustal
query" by selecting FILE| NEW, then protein, and pasting in the amino
acid sequence (leave out the header).
- Leave each file open on the desk top.
- Choose Analyze| Clustal W Alignment from the menu at the top of the screen.
- A new window will open. Make sure that the sequences to align are displayed,
then leave everything else alone and click OK.
- You will get an alignment display options box. Select "create
consensus sequence," multiple alignment, and guide tree under Picture
display.
- Print out the “aligned sequences” using the default settings,
then print them again after coloring by “Acidity + Basicity” and by
“Hydrophobicity + Charge.”
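Pasting each sequence into its own MacVector file is easier if the query is first split into header-free sequences. The following is a minimal Python sketch, assuming the "Clustal query" is in FASTA format (each sequence introduced by a ">" header line); the function name and the toy sequences are hypothetical and are not the homework data.

```python
def split_fasta(text):
    """Return {header: residues} with headers removed from the sequence text."""
    sequences = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            sequences[name] = ""
        elif name is not None:
            # Join wrapped lines and drop any internal whitespace.
            sequences[name] += "".join(line.split())
    return sequences


# Toy input only; replace with the text of the "Clustal query".
query = ">seq_one toy example\nMKTAYIAK\nQRQISFVK\n>seq_two\nMKSAYIAR\n"

parts = split_fasta(query)
for header, residues in parts.items():
    print(header)
    print(residues)  # paste this line into a new MacVector protein file
```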
