bioinformatics

 

week 9 lecture

Modeling 3-D protein structures

Most protein structures have been obtained using X-ray crystallography (for more information, see http://www.rcsb.org/pdb/experimental_methods.html).

  • proteins must be able to form crystals
    • problem for membrane proteins, many others
  • alternatives
    • solution NMR
    • high resolution electron microscopy
    • circular dichroism (reports overall secondary-structure content rather than a full 3-D structure)

All these methods are time-consuming and very technical, and they require sophisticated equipment and highly trained personnel. Therefore, there is a great deal of interest in developing algorithms for modeling 3-D protein structures.

Start by modeling secondary structures, using the approaches described last week, since proteins fold in a stepwise manner, driven by interactions of the residues with water and with each other

Next, try to predict which portions will be at the surface and which will be buried in the center, based on hydrophobic/hydrophilic interactions and similarity with proteins of known structure

Other programs look for similarity with known motifs and domains.

Now we're ready to model 3-D structure!

2 General Approaches

  1. Ab initio: derive structure from models based on principles of protein folding
  2. Knowledge-based

Ab initio

basic problem: each residue has at least 3 distinct possible conformations w.r.t. secondary structure

  • side chains also have many possible conformations
  • modeling all possibilities takes too long
    • it took 2 months on 2 CRAYs to “fold” a 36-mer
  • Can speed up the process by limiting the search space, just as FASTA speeds up sequence alignments by limiting the search space
    • one way to limit the search space is by only looking at the conformations that amino acids are capable of adopting within a protein
      • peptide bonds are rigid, but rotation is permitted between the nitrogen and the alpha carbon, and between the alpha and carbonyl carbon atoms
      • the angle of rotation between the nitrogen and alpha-carbon is called φ (phi)
      • the angle of rotation between the alpha and carbonyl carbons is called ψ (psi)
      • only certain combinations of φ and ψ are permitted because of collisions between the R group of one amino acid and the backbone of the next

    [Figure: rotation about the backbone bonds]
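The two backbone rotation angles described above (conventionally written phi and psi) are dihedral angles, so each can be computed from four atom positions. Here is a minimal sketch using only the standard library. By the usual convention, phi is measured over the atoms C(i-1), N(i), CA(i), C(i) and psi over N(i), CA(i), C(i), N(i+1). The function name `dihedral` and the test coordinates are made up for illustration.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) for four 3-D points p0-p1-p2-p3."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p0, p1)          # points back from the central bond
    b1 = sub(p2, p1)          # the central bond (rotation axis)
    b2 = sub(p3, p2)          # points forward from the central bond
    length = math.sqrt(dot(b1, b1))
    b1 = tuple(x / length for x in b1)

    # Remove the components along the axis, leaving the two "arms"
    # as seen when looking down the central bond.
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))

    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))

# Made-up coordinates: the first and last points are on the same side (cis).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

For a real protein, the four points would come from the backbone atom coordinates in a structure file.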

  • Ramachandran plots describe the possible φ and ψ values for proteins
    • note that 75% of φ and ψ combinations are excluded because of collisions!
    • therefore, limiting the search to allowable combinations of φ and ψ dramatically accelerates calculations
    • Twist and turn the residues within allowed limits to find most likely conformation
    [Figure: Ramachandran plot]
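To get a feel for the numbers, here is a rough back-of-the-envelope sketch. It assumes every residue chooses its conformation independently, and the 30-degree grid is a hypothetical choice made only for illustration; the 3 states per residue, the 36-mer and the 75% figure come from the notes above.

```python
def count_conformations(n_residues, states_per_residue):
    """Number of chain conformations if every residue chooses independently."""
    return states_per_residue ** n_residues

N_RESIDUES = 36                          # the 36-mer mentioned above
coarse = count_conformations(N_RESIDUES, 3)

GRID_STEP = 30                           # hypothetical grid spacing in degrees
grid_states = (360 // GRID_STEP) ** 2    # 144 angle pairs per residue
allowed_states = grid_states // 4        # 36 pairs left if 75% are excluded

full = count_conformations(N_RESIDUES, grid_states)
restricted = count_conformations(N_RESIDUES, allowed_states)
reduction = full // restricted           # a factor of 4 per residue

print(f"3 states per residue: {coarse:.2e} conformations")
print(f"excluding 75% of angle pairs shrinks the search {reduction:.2e}-fold")
```

Even the restricted search is far too large to enumerate, which is why the approaches below sample conformations instead of testing them all.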

Three types of approaches are used for ab initio modeling

  1. Deterministic methods use quantum mechanics and diffusion equations to predict the most likely conformation
  2. Energy minimization tries to model the energetics of the interactions between the amino acids and the solvent in order to predict the structure with the lowest energy
  3. Stochastic searches test various folds and pick the most likely one
    • Monte Carlo simulated annealing sequentially moves each of the elements by a random step along a randomly chosen coordinate
      • Use a collection of different starting positions, e.g. compute 50 different trajectories of 1000 steps each
      • At each step, score whether the structure gets better or worse
        • continue if it gets better; usually go back if it gets worse (simulated annealing still accepts some worse moves, with a probability that shrinks as the "temperature" is lowered, so the search can escape local minima)
      • Take the average final structure
    • Genetic algorithms use simulated evolution by natural selection to find an optimal solution
      • Subject a population of potential solutions to natural selection, crossover and mutation
      • At each generation, each individual’s fitness is assessed and the “best” are mated
      • Apply crossover and mutation operators to generate children
      • Repeat until a terminal condition is reached (no further improvement)
    [Figure: flowchart]
    [Figure: children]
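The genetic algorithm loop above can be sketched in a few lines. This is a toy, not a real folding program: each individual is a string with one hypothetical state per residue (H, E or C), and the fitness function simply counts matches to a made-up target string, standing in for a real structure score. The names `evolve`, `toy_fitness` and `TARGET`, and all parameter values, are assumptions made for illustration.

```python
import random

STATES = "HEC"  # hypothetical per-residue states: helix, strand, coil

def evolve(fitness, length, pop_size=30, max_stall=20,
           mutation_rate=0.05, seed=0):
    """Evolve state strings until the best one stops improving."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice(STATES) for _ in range(length))
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    stall = 0
    while stall < max_stall:              # terminal condition: no improvement
        # Selection: the fitter half of the population become parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - 1:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)          # single-point crossover
            child = a[:cut] + b[cut:]
            child = "".join(                        # point mutations
                rng.choice(STATES) if rng.random() < mutation_rate else s
                for s in child)
            children.append(child)
        pop = [best] + children           # keep the best individual so far
        champion = max(pop, key=fitness)
        if fitness(champion) > fitness(best):
            best, stall = champion, 0
        else:
            stall += 1
    return best

# Made-up target, used only so the toy has something to score against.
TARGET = "HHHHEEEECCHH"

def toy_fitness(candidate):
    return sum(a == b for a, b in zip(candidate, TARGET))

best = evolve(toy_fitness, len(TARGET))
print(best, toy_fitness(best))
```

In a real application the fitness would be an energy or compatibility score for the structure each individual encodes, and the target would of course be unknown.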
[Figure: example]

Blood coagulation factor VIII after 0, 10, 20, 30, 40 and 50 generations of selection
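Going back to the Monte Carlo simulated annealing steps listed earlier, here is a minimal sketch of that procedure. The "structure" is just a short list of numbers and the energy is a made-up bowl with its minimum at 1.0 for every coordinate, so that the answer is known. The acceptance rule for worse moves is the Metropolis criterion, which is an assumption on my part about the details; the function names, the cooling schedule and the parameter values are likewise hypothetical.

```python
import math
import random

def anneal(energy, start, rng, steps=1000, step_size=0.5,
           t_start=1.0, t_end=0.01):
    """One annealing trajectory; returns the final coordinates."""
    current = list(start)
    e_current = energy(current)
    for i in range(steps):
        # Lower the temperature geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        trial = list(current)
        k = rng.randrange(len(trial))               # randomly chosen coordinate
        trial[k] += rng.uniform(-step_size, step_size)  # random step
        e_trial = energy(trial)
        # Always keep improvements; keep a worse move only occasionally.
        if e_trial <= e_current or \
                rng.random() < math.exp((e_current - e_trial) / t):
            current, e_current = trial, e_trial
    return current

def average_structure(energy, n_coords, n_traj=50, seed=0):
    """Run n_traj trajectories from random starts and average the results."""
    rng = random.Random(seed)
    finals = [anneal(energy,
                     [rng.uniform(-5, 5) for _ in range(n_coords)], rng)
              for _ in range(n_traj)]
    return [sum(f[k] for f in finals) / n_traj for k in range(n_coords)]

def toy_energy(coords):
    # Made-up energy surface with a single minimum at 1.0 per coordinate.
    return sum((x - 1.0) ** 2 for x in coords)

avg = average_structure(toy_energy, n_coords=3)
print([round(x, 2) for x in avg])
```

Averaging the final structures only makes sense when the trajectories end up in the same energy well, as they do in this toy.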

Modeling 3-D protein structures by energy minimization

  • compute the energy of each conformation using a specific force field
    • energy of covalent bonds
    • ionic interactions
    • H-bonds
    • van der Waals interactions
    • hydrophobic interactions
  • find the conformation with the lowest energy
[Figures: energy function; energy plot 1; energy plot 2]
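As a rough sketch of "compute the energy of each conformation, then find the lowest", the toy force field below treats each residue as a single bead. It includes only two of the terms listed above: a harmonic spring for the covalent bonds between neighboring beads and a Lennard-Jones term for van der Waals interactions between all other pairs. Every constant here is a made-up illustrative value, not a parameter from a real force field.

```python
import math
from itertools import combinations

BOND_LENGTH = 3.8   # assumed ideal distance between neighboring beads
BOND_K = 10.0       # made-up spring constant
LJ_EPSILON = 1.0    # made-up well depth
LJ_SIGMA = 4.0      # made-up contact distance

def energy(coords):
    """Toy energy of one conformation (a list of 3-D bead positions)."""
    total = 0.0
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        r = math.dist(a, b)
        if j == i + 1:
            # Covalent bond: penalize stretching or compression.
            total += BOND_K * (r - BOND_LENGTH) ** 2
        else:
            # van der Waals: strongly repulsive up close, weakly attractive.
            sr6 = (LJ_SIGMA / r) ** 6
            total += 4 * LJ_EPSILON * (sr6 ** 2 - sr6)
    return total

def lowest_energy(conformations):
    """Pick the conformation with the lowest toy energy."""
    return min(conformations, key=energy)

extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
clashing = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.5, 0.0, 0.0)]
print(energy(extended), energy(clashing))
```

A real minimizer would not compare a fixed list of conformations but would move the atoms step by step downhill on the energy surface.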

Modeling 3-D protein structures by knowledge-based approaches

Fold Recognition

Basic idea: there are a limited number of “core folds” that proteins can adopt

Basic approach:

  • Find patterns in sequences with known structures
  • Use alignment programs to identify these patterns in unknown sequences
  • Use rules to predict how these unknown sequences will fold

Have therefore generated libraries of "core folds"

Many programs compare unknown sequences with these libraries to see how well they match

"Threading" is an approach where the algorithm tests how well an unknown sequence fits each of these possible folds; the best-fitting fold can then serve as the template for model building

  • sometimes called "reverse protein folding" since you start by assuming a particular structure, then see how well the unknown sequence fits this mold
  • Takes unknown sequence and threads it through the coordinates of known structures one residue at a time (Sequence to structure threading)
  • Computes energy of the alignment -> lowest values = best alignments
  • measures the compatibility of the sequence with a structure
  • Danish CPHmodels site (the only one I found that returns results in the browser): http://www.cbs.dtu.dk/services/CPHmodels/
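The threading idea can be illustrated with a deliberately crude sketch. Each "fold" in the hypothetical library below is reduced to a string saying whether each position is buried (B) or exposed (E), and the "energy" rewards hydrophobic residues at buried positions and hydrophilic ones at exposed positions, using the Kyte-Doolittle hydropathy scale. Real threading programs use the 3-D coordinates, allow gaps and use far richer scoring; the fold names, the environment strings and the scoring rule are all assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
      "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
      "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
      "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

def thread_energy(seq, env):
    """Lower is better: hydrophobic residues buried, hydrophilic exposed."""
    return sum(-KD[aa] if e == "B" else KD[aa] for aa, e in zip(seq, env))

def best_fit(seq, folds):
    """Slide seq along every fold, one position at a time, without gaps.

    Returns (energy, fold name, offset) for the lowest-energy placement,
    or None if no fold is long enough to hold the sequence.
    """
    best = None
    for name, env in folds.items():
        for offset in range(len(env) - len(seq) + 1):
            e = thread_energy(seq, env[offset:offset + len(seq)])
            if best is None or e < best[0]:
                best = (e, name, offset)
    return best

# Hypothetical two-fold library: B = buried position, E = exposed position.
FOLDS = {"fold_a": "EEBBBBEE", "fold_b": "BBEEEEBB"}

result = best_fit("KDLIVLEK", FOLDS)
print(result)
```

The test sequence has a hydrophobic middle and charged ends, so it fits the fold with a buried core much better than the fold with buried ends.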

Homology modeling

  • Use multiple alignment to find conserved regions
  • Construct backbone based on the conserved regions
  • Fill in loops by finding similar sequences or ab initio
  • Complete and correct backbone
  • Correct and rebuild side chains
  • Check structure quality & packing
    • check against Ramachandran plots to make sure that all φ and ψ are "legal"
    • check for collisions
    • check that bond lengths and angles are "legal"
  • Refine using energy minimization
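Two of the quality checks above, bond lengths and collisions, can be sketched on a simplified model that keeps only the alpha carbons. Consecutive alpha carbons joined by a normal (trans) peptide bond sit about 3.8 Å apart, so large deviations suggest a broken backbone. The tolerance and the clash cutoff below are made-up values for illustration, and real validation tools check every atom, not just the alpha carbons.

```python
import math

def check_ca_trace(ca_coords, ideal=3.8, tol=0.2, clash_cutoff=3.0):
    """Return a list of problems found in a list of alpha-carbon positions.

    Each problem is (kind, i, j, distance) where kind is "bond" for a bad
    neighbor distance or "clash" for two non-neighbors that are too close.
    """
    problems = []
    n = len(ca_coords)
    # Neighboring alpha carbons should be close to the ideal spacing.
    for i in range(n - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(d - ideal) > tol:
            problems.append(("bond", i, i + 1, round(d, 2)))
    # Non-neighboring alpha carbons should not overlap.
    for i in range(n):
        for j in range(i + 2, n):
            d = math.dist(ca_coords[i], ca_coords[j])
            if d < clash_cutoff:
                problems.append(("clash", i, j, round(d, 2)))
    return problems

good = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bad = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(check_ca_trace(good))
print(check_ca_trace(bad))
```

Regions flagged this way are the ones to rebuild or correct before the final energy minimization.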

Swiss-Model provides automated homology modeling http://www.expasy.ch/swissmod/SWISS-MODEL.html

  • First checks whether there is a suitable template to go on (First Approach mode)
  • If a suitable template is found, progress to Optimise (project) mode
  • Evaluate model with Swiss-PdbViewer http://www.expasy.ch/spdbv/mainpage.html
    • allows you to edit structure and manually correct regions with problems

Other sites for homology modeling

LiveBench (http://BioInfo.PL/LiveBench/) and CASP (http://PredictionCenter.llnl.gov/) are 2 projects designed to test just how good the modeling programs are!




Last update: Sunday, March 16, 2003 at 11:01:01 PM.