Week 10 lecture
Visualizing 3-D molecular structures
The results of X-ray crystallography, NMR, computer modeling,
etc. are all recorded as files that list the sequence of the
molecule and the coordinates of each atom relative to an arbitrary origin near
the center of the molecule.
The sequence is very important,
as it allows us to decide which amino acid each atom belongs to, which neighboring
atoms it is bonded to and what sorts of bonds are formed.
Structural biochemists sometimes
call the sequence the “chemical
graph” of a molecule.
Two basic approaches are used to
record 3-D data:
One approach, such as that used
by PDB files, is to list the amino acid sequence using the three-letter
code, then give the coordinates for each atom. Some files (such as
the example below) also provide information about secondary structure (lines
9-13) and about special features such as disulfide bonds (lines 14-16)
- The amino acid sequence is called
the explicit sequence since it identifies the order of amino acids
- The list of atoms and their coordinates
is called the implicit sequence, since you can deduce which atoms are bonded
to which using "chemistry rules"
- PDB files do not store bonds; instead, they
use “chemistry rules” to infer bonds
- for example, 2 carbons 1.5 Å apart are joined by a single
bond
- programs which interpret PDB files to render
3-D images reconstruct bonds by consulting tables of bond lengths and types
for every conceivable pair of bonded atoms.
- Problems
- must record each exception to the rules
and deal with it on a case-by-case basis
- many structures are incomplete, i.e. they are
missing coordinates for some atoms.
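
The distance-based inference described above can be sketched as follows. This is a minimal illustration, not the method of any particular viewer: the table `TYPICAL_BOND_LENGTHS`, the `tolerance` cutoff and the function `infer_bonds` are assumptions made for this example, and the bond lengths are approximate textbook values.

```python
import math
from itertools import combinations

# Approximate single-bond lengths in angstroms (illustrative, far from complete).
TYPICAL_BOND_LENGTHS = {
    ("C", "C"): 1.54,
    ("C", "N"): 1.47,
    ("C", "O"): 1.43,
    ("C", "S"): 1.82,
    ("S", "S"): 2.05,
}

def infer_bonds(atoms, tolerance=0.2):
    """Return index pairs of atoms that are close enough to be bonded.

    atoms: list of (element, x, y, z) tuples.
    A pair counts as bonded when its distance is no more than the tabulated
    bond length plus an assumed tolerance. Pairs missing from the table are
    skipped, which is exactly the "exceptions" problem noted in the text.
    """
    bonds = []
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        key = tuple(sorted((a[0], b[0])))
        length = TYPICAL_BOND_LENGTHS.get(key)
        if length is None:
            continue
        if math.dist(a[1:], b[1:]) <= length + tolerance:
            bonds.append((i, j))
    return bonds

atoms = [
    ("C", 0.0, 0.0, 0.0),
    ("C", 1.5, 0.0, 0.0),  # 1.5 angstroms from the first carbon
    ("O", 5.0, 0.0, 0.0),  # too far from either carbon to be bonded
]
print(infer_bonds(atoms))  # prints [(0, 1)]
```

Note that an atom with missing coordinates simply never appears in `atoms`, so none of its bonds can be inferred.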
snippet from a PDB file (I edited out ~40 lines
of comments)
- Lines 1-4 identify the compound and the source and list the authors
- Next there were many lines of comments which
I have edited out
- The seqres lines (lines 5-8) give the primary
sequence of the protein
- Lines 9-13 provide information about secondary
structure
- Lines 14-16 provide information about disulfide
bonds.
- Lines 17-23 provide information about the
crystal, the coordinates used and the scale used to convert from the actual
coordinates provided by the crystallographer to fractional coordinates (ranging
from -1 to +1)
- The coordinates for each atom start on line
24
- column 1 lists the name of the record
- column 2 lists serial number of the atom
- column 3 lists the name of the atom
- column 4 lists the name of the amino acid
it is attached to
- column 5 lists the residue sequence number
- column 6 lists the X coordinate
- column 7 lists the Y coordinate
- column 8 lists the Z coordinate
- column 9 lists the occupancy of that position
- column 10 lists
the temperature factor, a measure of the confidence in the position of
the atom
- Additional columns may be used to designate
the charge on an atom, the elemental symbol, which chain of the molecule
this is (when dealing with proteins that have multiple subunits) and for
other information
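
The column list above can be turned into a small parser. One detail worth knowing is that the PDB format defines its fields by fixed character positions rather than by whitespace, so the sketch below slices each line instead of splitting it. The function name `parse_atom_line`, the dictionary keys and the sample record are illustrative choices for this example, not part of any library.

```python
# A sample ATOM record laid out in the standard fixed-width columns.
SAMPLE_LINE = (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N  "
)

def parse_atom_line(line):
    """Extract the main fields of one ATOM record.

    Python slices are 0-based and end-exclusive, so PDB columns 31-38
    (the X coordinate) become line[30:38], and so on.
    """
    return {
        "record": line[0:6].strip(),         # record name, e.g. ATOM
        "serial": int(line[6:11]),           # atom serial number
        "atom_name": line[12:16].strip(),    # atom name, e.g. N or CA
        "res_name": line[17:20].strip(),     # three-letter residue name
        "chain_id": line[21].strip(),        # chain identifier
        "res_seq": int(line[22:26]),         # residue sequence number
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "temp_factor": float(line[60:66]),
        "element": line[76:78].strip(),      # element symbol
    }

atom = parse_atom_line(SAMPLE_LINE)
print(atom["atom_name"], atom["res_name"], atom["x"], atom["y"], atom["z"])
```

Slicing by position matters in practice because neighboring fields can run together with no space between them, for example when a coordinate is large and negative.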
mmCIF (macromolecular Crystallographic Information File) files
store data in a similar format, but use a different and more complex set of
relational tables
- Software is available for converting PDB files
to mmCIF format, and for converting mmCIF to "pseudoPDB" format
- Many other file formats also use this general
approach of supplying a list of coordinates for each atom in a molecule
- These formats can generally be converted to PDB format
The Molecular Modeling Data Base
(MMDB) at NCBI uses a different approach to store 3-D data.
- MMDB uses a "standard residue dictionary,"
a record of all the atoms and bonds in proteins and nucleic acids as well
as variants found at the front and back ends of proteins and nucleic acids.
- Software that reads MMDB files uses this dictionary
to connect atoms
- This works much faster, since the software doesn't need
to work out bonds from the rules of chemistry
- It is also more consistent, since the software doesn't
need to interpret rules or deal with exceptions to the
rules (exceptions are included in the dictionary)
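
To make the contrast with distance-based inference concrete, here is a toy version of the dictionary approach. It is only a sketch: `RESIDUE_DICTIONARY`, its layout and the function `bonds_for_chain` are hypothetical and do not reproduce the real MMDB record format. It covers just two residues and ignores hydrogens and the chain-end variants mentioned above.

```python
# Hypothetical residue dictionary: every bond is stored explicitly,
# so no distances need to be measured to connect the atoms.
RESIDUE_DICTIONARY = {
    "GLY": {
        "atoms": ["N", "CA", "C", "O"],
        "bonds": [("N", "CA"), ("CA", "C"), ("C", "O")],
    },
    "ALA": {
        "atoms": ["N", "CA", "C", "O", "CB"],
        "bonds": [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB")],
    },
}

def bonds_for_chain(sequence):
    """List all bonds in a peptide chain by dictionary lookup.

    sequence: list of three-letter residue names.
    Each atom is identified as (residue position, atom name), with
    positions starting at 1.
    """
    bonds = []
    for pos, res in enumerate(sequence, start=1):
        # Bonds inside the residue come straight from the dictionary.
        for a, b in RESIDUE_DICTIONARY[res]["bonds"]:
            bonds.append(((pos, a), (pos, b)))
        # Peptide bond from this residue's C to the next residue's N.
        if pos < len(sequence):
            bonds.append(((pos, "C"), (pos + 1, "N")))
    return bonds

for bond in bonds_for_chain(["GLY", "ALA"]):
    print(bond)
```

The lookup does not touch the coordinates at all, which is why this approach is faster and gives the same answer for every copy of a residue.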
Software for rendering 3D images
from PDB files
- General problem: interpreting the molecular coordinates and creating
a 3-D image that can be rotated in all directions around the center point
- many programs also allow you to present
the image in various formats such as backbone, space-fill, cartoon, etc.
- many allow you to color the image according
to various criteria
- many allow you to highlight specific portions
of the molecule according to user-specified criteria
- Many programs have been written
for this purpose.
- One of the most widely used is RasMol
- RasMol reads molecular coordinates from a variety of file types
such as PDB, CHARMm, MOL and renders a 3-D image
- This image can be displayed in a variety
of modes, including backbone, ball-and-stick, spacefill and cartoon (for
illustrating secondary structure)
- This image can be displayed in a variety
of color schemes to highlight features such as secondary structure, temperature
(the confidence in the position of that atom), groups, etc
- The image can be rotated to view from
any angle
- The image can be printed or exported in
a variety of formats
- There are a variety of sophisticated commands
that allow you to look at specific parts of the molecule and display specific
features
- Unfortunately, these require that
you type in specific commands. Therefore, learning to take full advantage
of RasMol's capabilities takes quite a while
- RasMol can only deal with one image
at a time
- Several different programs have modified or
enhanced RasMol
- Chime is a web-browser plugin based on
RasMol that allows you to render molecular structures within your browser.
It also allows you to rotate them and change their appearance and color
scheme in the same way as RasMol.
- Protein Explorer is an improved version
based on Chime and RasMol that is more powerful and more user-friendly.
- Features for highlighting particular
regions or particular interactions are now available by pointing and
clicking.
- Animations can be played
- Evolution of a molecule can be followed
using the ConSurf server
- Two molecules can be viewed and compared
at the same time.
- Swiss-PDB viewer is another powerful program
for rendering PDB files
- It is designed to be used for visualizing
and editing the output from SWISS-MODEL, so it contains many powerful
features for evaluating portions of the structure and modifying it (and
then evaluating the modifications)
- You can interactively manipulate the Ramachandran plot
- You can perform energy minimization
on selected portions of the model
- Everything can be selected from lists
and menus. It is not as user-friendly as Protein Explorer, but easier than RasMol
- Only wire-frame mode is available on-screen,
but images can be exported to POV-Ray or QuickDraw 3D for high-quality images.
- Many Unix command-line programs are available
Cn3D uses MMDB files
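
All of the viewers in this section have to solve the general problem stated at its start: rotating the atomic coordinates around the center point before drawing them. A minimal sketch of that step, for rotation about the z axis only, is given below. The function names `centroid` and `rotate_about_z` are choices made for this example; real viewers use full 3-D rotation matrices and hardware-accelerated graphics.

```python
import math

def centroid(coords):
    """Return the mean x, y, z of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def rotate_about_z(coords, angle_deg):
    """Rotate coordinates by angle_deg around a z-parallel axis through the centroid."""
    cx, cy, _ = centroid(coords)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y, z in coords:
        dx, dy = x - cx, y - cy  # shift so the centroid is the origin
        rotated.append((
            cx + dx * cos_t - dy * sin_t,
            cy + dx * sin_t + dy * cos_t,
            z,                   # z is unchanged by a rotation about z
        ))
    return rotated

coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
for x, y, z in rotate_about_z(coords, 90):
    print(round(x, 3), round(y, 3), round(z, 3))
```

Because the rotation is taken around the centroid, the molecule spins in place instead of swinging around the arbitrary origin of the coordinate file.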
Modeling protein-ligand interactions
- General problem: trying to model
how proteins interact with other molecules, and
how they change shape upon binding.
- Ligand Docking algorithms model the interactions
between a protein of known structure and its ligand (which may be another
protein)
- This is conceptually similar to modeling
protein folding, and similar approaches are used
- Both approaches try to find the conformation
with the lowest free energy
- These procedures start by identifying the binding site within the
protein
- One way to do this is to obtain protein
crystals with a substrate or inhibitor bound in the active site.
- If such crystals are not available, algorithms
scan for likely binding sites on the protein
- Alternatively, infer the binding site
by comparing the structures of compounds that are known to interact with
the same protein
- The next step is to simulate the annealing process
1) Lock and key approaches assume a rigid
protein and flexible ligand
- hold protein shape constant, and alter
shape of ligand
- Use energy minimization, Monte Carlo simulations, genetic algorithms,
etc. to estimate changes in shape.
- Advantage: simpler programming
2) Induced-fit algorithms allow both the
protein and the ligand to change shape
- many start by holding protein shape
constant, and alter shape of ligand
- Then use energy minimization, etc to try to estimate change
in protein shape.
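
The search for a low-energy arrangement can be illustrated with a toy Monte Carlo run in which the protein is held rigid. This is a deliberately simplified sketch, not a real docking algorithm: the "ligand" is a single probe atom that only changes position (not shape), the energy is a bare Lennard-Jones term, and the names `PROTEIN_ATOMS`, `interaction_energy` and `monte_carlo_dock`, along with all parameter values, are assumptions made for this example.

```python
import math
import random

# Three "protein" atoms whose positions never change (rigid protein).
PROTEIN_ATOMS = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (2.0, 3.5, 0.0)]

def interaction_energy(probe, protein_atoms, r_min=3.5, epsilon=1.0):
    """Sum of Lennard-Jones 12-6 terms between the probe and each protein atom."""
    total = 0.0
    for atom in protein_atoms:
        r = max(math.dist(probe, atom), 1e-6)  # guard against division by zero
        ratio = r_min / r
        total += epsilon * (ratio ** 12 - 2.0 * ratio ** 6)
    return total

def monte_carlo_dock(start, protein_atoms, steps=2000, step_size=0.5, kT=0.6, seed=0):
    """Metropolis Monte Carlo search for a low-energy probe position."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    current = start
    current_e = interaction_energy(current, protein_atoms)
    best, best_e = current, current_e
    for _ in range(steps):
        # Propose a small random displacement of the probe.
        trial = tuple(c + rng.uniform(-step_size, step_size) for c in current)
        trial_e = interaction_energy(trial, protein_atoms)
        delta = trial_e - current_e
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e

start = (8.0, 8.0, 8.0)
pose, energy = monte_carlo_dock(start, PROTEIN_ATOMS)
print("start energy:", interaction_energy(start, PROTEIN_ATOMS))
print("best energy: ", energy)
```

An induced-fit method would add a second stage in which the protein coordinates are also allowed to move, which is where most of the extra programming effort goes.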
Rational Drug Design: applied protein-ligand
interactions
Drug design is a two-step process
- Drug discovery
- Drug testing
Bioinformatics can't do much (yet) to speed
up drug testing, but it can significantly accelerate drug discovery
- Accelerate target identification
- metabolic modeling allows us to identify
molecules that are essential for infection or proliferation of a pathogen
(we will come back to this topic later) and estimate the effect of inhibiting
this target. We can also use metabolic modeling to identify the best target
in a signalling or metabolic pathway and simulate the effect of inhibiting it.
- Rational drug design
- design drugs that will specifically
bind to the active site (or regulatory sites) within a target protein
General approach
- Identify suitable target molecule
- Obtain structure of this molecule
- Identify the binding site within this molecule
- Identify the pharmacophore: the atoms that
form the important 3D points of interaction between a drug and its target
needed to elicit a response.
- Design drugs that will match the features
of a pharmacophore and, therefore, fit into the binding site.
We will come back to the question of metabolic
simulation later in the course, and we have already dealt with ways to obtain
the structure and identify the binding site within the target.
- One way to identify the pharmacophore is to
dock inhibitors and obtain X-ray crystallography or NMR structures
- This requires very high resolution images!
- An alternative is to identify a series of
compounds acting via the same target and then see what structures they all
have in common
- for an example of this approach see
http://dtp.nci.nih.gov/docs/3d_database/pharms/pkcsearch.html
- used phorbol esters to identify the pharmacophore
for Protein Kinase C
- Many pharmaceutical companies are using
combinatorial chemistry to make libraries of chemicals
- Use high-throughput screening to identify
lead chemicals: compounds which show some effect on the target.
- Once leads have been identified use
simulations to identify potential pharmacophores and to eliminate
chemicals which overlap
http://www.netsci.org/Science/Combichem/feature05.html
- An alternative once the active site on
the target proteins has been identified is to screen databases of molecular
structures for potential ligands
- Many databases of structures for small
molecules are available
- Many programs have been developed
for rapidly screening these databases for potential ligands using
various heuristic approaches
- FlexX allows you to search online
(once you are registered).
- SLIDE is another algorithm that
allows you to rapidly screen large databases of potential ligands.
- Accelrys markets a suite of programs
for rational drug design
- Once the pharmacophore has been identified,
many algorithms have been developed to help design better drugs
- Goals
- designing drugs that will bind to
regions of target proteins that do not change in pathogens such as
viruses that evade the immune system by rapidly mutating
- designing drugs that bind with tailored
affinity constants
- sometimes reversible binding is
better than irreversible binding!
- Chem-X uses defined centers (hydrogen
bond acceptor, hydrogen bond donor, positive charge, aromatic, hydrophobic,
acid, base) and defined distance intervals to create a set of pharmacophores
- Many other companies (e.g., Accelrys) offer
programs that allow you to design superior drugs once you have identified
the binding site and pharmacophore
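
The idea of describing a pharmacophore as defined centers plus distance intervals can be sketched in a few lines. The example below is only an illustration of that idea and is not how Chem-X or any commercial package is implemented: the `PHARMACOPHORE` layout, the feature names, the distance ranges and the function `matches` are all hypothetical.

```python
import math
from itertools import permutations

# Hypothetical three-point pharmacophore: feature types plus the allowed
# distance range (in angstroms) between each pair of features.
PHARMACOPHORE = {
    "features": ["donor", "acceptor", "hydrophobic"],
    "distances": {
        (0, 1): (2.5, 3.5),  # donor to acceptor
        (0, 2): (4.0, 6.0),  # donor to hydrophobic center
        (1, 2): (4.0, 6.0),  # acceptor to hydrophobic center
    },
}

def matches(pharmacophore, molecule_features):
    """Return True if some set of molecule features fits the pharmacophore.

    molecule_features: list of (feature_type, (x, y, z)) tuples for one
    fixed conformation of the candidate molecule.
    """
    wanted = pharmacophore["features"]
    for combo in permutations(molecule_features, len(wanted)):
        # The feature types must line up with the pharmacophore's centers.
        if any(feat[0] != want for feat, want in zip(combo, wanted)):
            continue
        # Every pairwise distance must fall inside its allowed interval.
        if all(
            low <= math.dist(combo[i][1], combo[j][1]) <= high
            for (i, j), (low, high) in pharmacophore["distances"].items()
        ):
            return True
    return False

candidate = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("hydrophobic", (1.5, 4.5, 0.0)),
]
print(matches(PHARMACOPHORE, candidate))  # prints True
```

Screening a database then amounts to running a test like this over every stored structure, ideally over several conformations of each one.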
