week 9 lecture
Modeling 3-D protein structures
Most protein structures have been obtained using X-ray crystallography
(for more information, see http://www.rcsb.org/pdb/experimental_methods.html)
- proteins must be able to form crystals
- this is a problem for membrane proteins and many others
- alternatives
- solution NMR
- high-resolution electron microscopy
- circular dichroism (reports overall secondary-structure content rather than a full 3-D structure)
All these methods are time-consuming and highly technical, and they
require sophisticated equipment and highly trained personnel. There is
therefore a great deal of interest in developing algorithms for modeling 3-D
protein structures.
Start by modeling secondary structures, using the approaches
described last week, since proteins fold in a stepwise manner, driven by the
interactions of residues with water and with each other.
Next we try to predict which portions
will be at the surface and which will be buried in the center, based on hydrophobic/hydrophilic
interactions and similarity with proteins of known structure.
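One common way to make the surface/buried guess concrete is a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle hydropathy scale; the function names, the window size and the threshold are assumptions chosen for illustration, not values from this lecture.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every full window along seq."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def likely_buried(seq, window=9, threshold=1.6):
    """Return start indices of windows whose mean hydropathy exceeds threshold.

    The threshold is an illustrative assumption; tune it for real use.
    """
    profile = hydropathy_profile(seq, window)
    return [i for i, h in enumerate(profile) if h > threshold]
```

Windows with a high mean score are candidates for the buried core; strongly negative windows are more likely to sit at the surface. This ignores similarity to proteins of known structure, which the real programs also use.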
Other programs look for similarity
with known motifs and domains.
Now we're ready to model 3-D structure!
2 General Approaches
- Ab initio: derive structure from
models based on principles of protein folding
- Knowledge-based
Ab initio
Basic problem: each residue has at least 3 distinct possible conformations
with respect to secondary structure, so a chain of N residues has at least 3^N possible backbone conformations
- side chains also have many possible conformations
- modeling all possibilities takes too long
- it took 2 months on 2 CRAYs to “fold” a 36-mer
- Can speed up the process by limiting the search space, just as FASTA
speeds up sequence alignments by limiting the search space
- one way to limit the search space is by
only looking at the conformations that amino acids are capable of adopting
within a protein
- peptide bonds are rigid, but rotation
is permitted between the nitrogen and the alpha carbon, and between
the alpha and carbonyl carbon atoms
- the angle of rotation between the nitrogen and alpha-carbon is called φ
- the angle of rotation between the alpha and carbonyl carbons is called ψ
- only certain combinations of φ and ψ are permitted because
of collisions between the R group of one amino acid and the backbone
of the next
- Ramachandran plots describe the possible φ and ψ values for proteins
- note that 75% of φ and ψ combinations are excluded
because of collisions!
- therefore, limiting the search to allowable
combinations of φ and ψ
dramatically accelerates calculations
- Twist and turn the residues within allowed
limits to find most likely conformation
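The two backbone rotation angles described above (phi and psi) are dihedral angles, so they can be computed from four atom positions each. The sketch below shows that calculation plus a very coarse "is this combination allowed?" test. The rectangular regions in `in_allowed_region` are rough assumptions for illustration only; real Ramachandran contours are irregular and come from observed structures.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180..180) about the p1-p2 bond.

    phi uses (C of previous residue, N, CA, C);
    psi uses (N, CA, C, N of next residue).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def in_allowed_region(phi, psi):
    """Crude box approximation of the main allowed regions (assumed limits)."""
    right_helix = -160 <= phi <= -20 and -120 <= psi <= 50
    beta = -180 <= phi <= -45 and (psi >= 90 or psi <= -150)
    left_helix = 20 <= phi <= 120 and 0 <= psi <= 90
    return right_helix or beta or left_helix
```

A search can call `in_allowed_region` before scoring a trial conformation and skip the expensive energy calculation whenever it returns `False`.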
Three types of approaches are used for ab initio modeling
- Deterministic methods use quantum mechanics
and diffusion equations to predict most likely conformation
- Energy minimization tries to model the energetics of the interactions between the amino acids and the solvent in order to predict the structure with the lowest energy
- Stochastic searches test various folds and
pick most likely
- Monte Carlo simulated annealing moves each of the elements in turn
by a random step along a randomly chosen coordinate
- Use a collection of different starting positions, compute
50 different trajectories of 1000 steps each
- At each step, score whether it gets
better or worse
- continue if it gets better; if it gets worse, usually go
back (simulated annealing still accepts a worse step with a probability that shrinks as the temperature is lowered)
- Take the average final structure
- Genetic algorithms use simulated evolution
by natural selection to find optimal solution
- Subject a population of potential solutions
to natural selection, crossover & mutation
- At each generation each individual’s
fitness is assessed and “best” are mated
- Apply crossover & mutation operators
to generate children
- Repeat until a terminal condition is reached (e.g. no further improvement)
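The Monte Carlo steps listed above (random move of one coordinate, score, keep or go back, many trajectories) can be sketched as follows. This is a minimal illustration, not a real folding code: `toy_energy` is a made-up score, the structure is just a list of (phi, psi) pairs, and the cooling schedule and step size are assumptions. Worse moves are accepted with the usual Metropolis probability; as the temperature approaches zero this reduces to "continue if better, go back if worse".

```python
import math
import random


def toy_energy(angles):
    """Hypothetical score: squared distance of each (phi, psi) from (-57, -47)."""
    return sum((phi + 57.0) ** 2 + (psi + 47.0) ** 2 for phi, psi in angles)


def anneal(angles, energy, steps=1000, t_start=1000.0, t_end=1.0,
           max_step=30.0, rng=None):
    """Run one trajectory; return (best structure seen, its energy)."""
    rng = rng or random.Random()
    current = [tuple(a) for a in angles]
    e_current = energy(current)
    best, e_best = list(current), e_current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Move one randomly chosen coordinate of one randomly chosen residue.
        i = rng.randrange(len(current))
        coord = rng.randrange(2)
        moved = list(current[i])
        moved[coord] += rng.uniform(-max_step, max_step)
        trial = list(current)
        trial[i] = tuple(moved)
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Keep improvements; keep worse moves only with probability exp(-delta/t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


def run_trajectories(n_residues, n_traj=50, steps=1000, seed=0):
    """Anneal from n_traj random starting positions; return all results."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_traj):
        start = [(rng.uniform(-180, 180), rng.uniform(-180, 180))
                 for _ in range(n_residues)]
        results.append(anneal(start, toy_energy, steps=steps, rng=rng))
    return results
```

`run_trajectories` returns every final structure with its energy, so the caller can average them as in the notes or simply keep the lowest-energy one. The sketch does not wrap angles back into the -180..180 range.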
[Figure: blood coagulation factor VIII after 0, 10, 20, 30, 40 and 50 generations of selection]
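The genetic algorithm loop described above (assess fitness, mate the best, apply crossover and mutation, stop when nothing improves) might look like the sketch below. Everything here is an assumption made for illustration: an individual is a plain list of numbers standing in for backbone angles, `fitness` is a toy function, and the stopping rule is "no improvement for `patience` generations".

```python
import random


def fitness(individual):
    """Hypothetical fitness: higher is better, best when every gene is -57."""
    return -sum((g + 57.0) ** 2 for g in individual)


def crossover(a, b, rng):
    """Single-point crossover; individuals need at least two genes."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(individual, rng, rate=0.1, sigma=10.0):
    """Perturb each gene with probability rate by Gaussian noise."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g
            for g in individual]


def evolve(population, rng, patience=10, max_generations=500):
    """Return (best individual, generations run). Needs at least 2 individuals."""
    population = sorted(population, key=fitness, reverse=True)
    best = population[0]
    stale = 0
    generations = 0
    while stale < patience and generations < max_generations:
        # The fitter half of the population is allowed to mate.
        parents = population[:max(2, len(population) // 2)]
        children = [best]  # carry the current best forward unchanged
        while len(children) < len(population):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = sorted(children, key=fitness, reverse=True)
        generations += 1
        if fitness(population[0]) > fitness(best):
            best, stale = population[0], 0
        else:
            stale += 1
    return best, generations
```

Carrying the best individual forward unchanged means the best fitness never gets worse from one generation to the next, which makes the "no improvement" stopping rule well defined.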
Modeling 3-D protein structures by energy minimization
- compute the energy of each conformation using a specific force field
- energy of covalent bonds
- ionic interactions
- H-bonds
- van der Waals interactions
- hydrophobic interactions
- find the conformation with the lowest energy
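To make "compute the energy of each conformation, then find the lowest" concrete, here is a toy force field with three of the terms listed above: a harmonic covalent-bond term, a Lennard-Jones term for van der Waals interactions, and a Coulomb term for ionic interactions. All parameter values and function names are assumptions for illustration; real force fields use atom-type-specific parameters and add H-bond, angle and torsion terms.

```python
import math
from itertools import combinations


def bond_energy(r, r0=1.5, k=300.0):
    """Harmonic covalent-bond term; r0 and k are illustrative values."""
    return k * (r - r0) ** 2


def lennard_jones(r, epsilon=0.2, sigma=3.4):
    """Van der Waals term; minimum of -epsilon at r = 2**(1/6) * sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def coulomb(r, q1, q2, dielectric=80.0):
    """Ionic term in kcal/mol for charges in e and distances in angstroms."""
    return 332.06 * q1 * q2 / (dielectric * r)


def total_energy(coords, charges, bonds):
    """Sum bond terms over bonded pairs and LJ + Coulomb over all other pairs."""
    bonded = {frozenset(b) for b in bonds}
    energy = 0.0
    for i, j in combinations(range(len(coords)), 2):
        r = math.dist(coords[i], coords[j])
        if frozenset((i, j)) in bonded:
            energy += bond_energy(r)
        else:
            energy += lennard_jones(r) + coulomb(r, charges[i], charges[j])
    return energy


def lowest_energy(conformations, charges, bonds):
    """Return the conformation with the lowest total energy."""
    return min(conformations, key=lambda c: total_energy(c, charges, bonds))
```

`lowest_energy` only compares conformations it is given; an actual minimizer would also move the atoms downhill along the energy gradient. Two atoms at exactly the same position would cause a division by zero in this sketch.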


Modeling 3-D protein structures by knowledge-based approaches
Fold Recognition
Basic idea: there are a limited number of “core folds”
that proteins can adopt
Basic approach:
- Find patterns in sequences with known structures
- Use alignment programs to identify these
patterns in unknown sequences
- Use rules to predict how these unknown sequences will fold
Researchers have therefore generated libraries of "core folds".
Many programs compare unknown sequences with these libraries to see
how well they match.
"Threading" is an approach in which the algorithm tests how
well an unknown sequence fits each of these possible folds; the best-fitting
fold can then serve as the template for homology model building
- sometimes called "reverse protein folding"
since you start by assuming a particular structure, then see how well the
unknown sequence fits this mold
- Takes unknown sequence and threads it through
the coordinates of known structures one residue at a time (Sequence to structure
threading)
- Computes energy of the alignment -> lowest
values = best alignments
- measures the compatibility of the sequence with a structure
- Danish CPHmodels site (the only one I found that
returns results in the browser): http://www.cbs.dtu.dk/services/CPHmodels/
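The threading idea above (slide the unknown sequence through a known structure one residue at a time, score each placement, keep the lowest energy) can be sketched with a deliberately simple scoring rule. Here each template is reduced to a string of environments, "B" for buried and "E" for exposed, and a placement is rewarded when hydrophobic residues land in buried positions. The environment encoding, the residue set, the scores and the fold names are all assumptions for illustration; real threading programs use 3-D coordinates, empirical potentials and gapped alignments.

```python
HYDROPHOBIC = set("AVLIMFWC")  # assumed hydrophobic residue set


def position_energy(residue, environment):
    """Hypothetical score: -1 for a compatible placement, +1 otherwise."""
    buried = environment == "B"
    return -1.0 if (residue in HYDROPHOBIC) == buried else 1.0


def thread(sequence, template):
    """Slide sequence along template without gaps.

    Returns (lowest energy, offset), or None if the template is too short.
    """
    best = None
    for offset in range(len(template) - len(sequence) + 1):
        energy = sum(position_energy(res, template[offset + i])
                     for i, res in enumerate(sequence))
        if best is None or energy < best[0]:
            best = (energy, offset)
    return best


def best_fold(sequence, library):
    """Thread sequence through every fold; return (energy, name, offset)."""
    results = []
    for name, template in library.items():
        hit = thread(sequence, template)
        if hit is not None:
            results.append((hit[0], name, hit[1]))
    return min(results) if results else None
```

For example, with a made-up library `{"fold_a": "EEBBBBEE", "fold_b": "BEBEBEBE"}`, the sequence `"LLVV"` fits best into the buried stretch of `fold_a`.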
Homology modeling
- Use multiple alignment to find conserved region
- Construct backbone based on conserved region
- Fill in loops by finding similar sequences
or ab initio
- Complete and correct backbone
- Correct and rebuild side chains
- Check structure quality & packing
- check against Ramachandran plots to make
sure that all φ and ψ
are "legal"
- check for collisions
- check that bond lengths and angles are
"legal"
- Refine using energy minimization
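Two of the quality checks listed above, collisions and "legal" bond lengths, are easy to sketch. The code below treats a model as a list of atom coordinates plus a list of bonded index pairs. The distance cutoff, ideal bond length and tolerance are placeholder assumptions; real checkers use element-specific radii and bond-type-specific ideal values.

```python
import math
from itertools import combinations


def find_clashes(coords, bonds, min_distance=2.0):
    """Return non-bonded atom pairs closer than min_distance (assumed cutoff)."""
    bonded = {frozenset(b) for b in bonds}
    return [
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if frozenset((i, j)) not in bonded
        and math.dist(coords[i], coords[j]) < min_distance
    ]


def bad_bonds(coords, bonds, ideal=1.5, tolerance=0.1):
    """Return bonded pairs whose length deviates from ideal by more than tolerance."""
    return [
        (i, j)
        for i, j in bonds
        if abs(math.dist(coords[i], coords[j]) - ideal) > tolerance
    ]
```

Any pair reported by either function marks a region to rebuild or to relax during the final energy minimization step.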
Swiss-Model provides automated homology modeling
http://www.expasy.ch/swissmod/SWISS-MODEL.html
- First checks whether there is anything to
go on = First Approach mode
- If a suitable template is found, progress to Optimise
(project) mode
- Evaluate model with Swiss-PdbViewer http://www.expasy.ch/spdbv/mainpage.html
- allows you to edit structure and manually correct regions with
problems
Other sites for homology modeling
LiveBench (http://BioInfo.PL/LiveBench/) and CASP
(http://PredictionCenter.llnl.gov/) are 2 projects designed
to test just how good the modeling programs are!
