bioinformatics

 

 
 

Week 13 lecture

Posted by William Terzaghi, 10/14/03; 6:50:32 PM

Proteomics & Metabolomics

Proteomics: studying all of the proteins present in a particular organism

  • A very hot topic in the post-genomic era!
    Now that we have the genome, what do we do with it?

The old-fashioned way was to run 2-dimensional gels of proteins extracted from the organisms or tissues being studied

  • first use isoelectric focusing (usually in a tube gel) to separate proteins by pI
  • Then use SDS-PAGE to separate by size
  • identify spots by excising each one, then determining its identity
    • If it is highly abundant, this can be done by sequencing the protein by Edman degradation
      • partial sequence data may be sufficient to confirm its identity
    • More usually, the protein in each spot is digested into peptides using enzymes (or other reagents) that cleave at specific sites, and the peptides are then ionized by MALDI (matrix-assisted laser desorption/ionization) or ESI (electrospray ionization) so they can be analyzed by mass spectrometry.
      • result is a peptide mass fingerprint: a set of peptide fragments of precisely measured mass
      • determine the identity by comparing the fingerprint with a library of theoretical mass spectra using software such as PeptIdent (http://us.expasy.org/tools/peptident.html) or ProFound (http://prowl.rockefeller.edu/cgi-bin/ProFound)
        • once a genome has been sequenced, can predict the sequence of each protein and the fragments that will be created when each one is cleaved with commonly-used reagents
  • Goal is to identify all the spots present!
    • problems include
      • post-translational modifications that affect mass such as glycosylation, phosphorylation, proteolytic processing, etc
      • limits of detection: many proteins are too rare to detect and identify by these means, yet are vitally important (e.g. the lac repressor in E. coli is only present at about 15 copies/cell)
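
The fingerprint-matching idea above can be sketched in a few lines of Python. This is only an illustration, not how PeptIdent or ProFound work internally: it assumes trypsin as the cleaving enzyme (cuts after K or R unless the next residue is P), no missed cleavages, no post-translational modifications, and neutral monoisotopic masses. The protein names and sequences in `toy_database` are made up.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cut after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.5):
    """Number of observed masses within `tolerance` Da of a predicted peptide."""
    predicted = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tolerance for t in predicted) for m in observed)


def best_match(observed, database, tolerance=0.5):
    """Name of the database protein that explains the most observed masses."""
    return max(database,
               key=lambda name: count_matches(observed, database[name], tolerance))


# Hypothetical two-protein "genome"; the sequences are invented for illustration.
toy_database = {"protA": "MAGKSLLRPEK", "protB": "GGWWKFFYYR"}
observed_masses = [peptide_mass(p) for p in tryptic_digest(toy_database["protB"])]
print(best_match(observed_masses, toy_database))
```

The problems listed above show up directly in a sketch like this: a phosphorylated or glycosylated peptide would no longer have the mass predicted from the genome sequence, so it would fail to match unless the modification were modeled explicitly.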

Another valuable use of 2-D gels is to superimpose gels from different tissues or treatments in order to identify spots that increased or decreased in intensity

  • databases of 2-D gels are stored online at many locations
  • software such as Melanie (http://us.expasy.org/melanie/) allows you to superimpose 2D gels to identify and quantitate spots, calculate mass and pI, measure differences between treatments, etc.
    • problem is similar to superimposing the 2 scans of a microarray, except also need to account for differences in the way the gel ran
    • do this using a series of markers and internal controls
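
As a rough illustration of how markers can be used to superimpose two gels, the sketch below fits a separate stretch-and-shift for each axis from a few landmark spots seen on both gels, then maps any other spot onto the reference gel's coordinates. This is an assumption-laden simplification, not a description of what Melanie does: real gels distort non-uniformly, and the function names and coordinates here are hypothetical.

```python
def fit_axis(source, target):
    """Least-squares fit of target = scale * source + offset along one axis."""
    n = len(source)
    mean_s = sum(source) / n
    mean_t = sum(target) / n
    sxx = sum((s - mean_s) ** 2 for s in source)
    sxy = sum((s - mean_s) * (t - mean_t) for s, t in zip(source, target))
    scale = sxy / sxx
    return scale, mean_t - scale * mean_s


def make_aligner(markers_gel, markers_ref):
    """Build a function mapping (x, y) on one gel to reference-gel coordinates.

    markers_gel and markers_ref list the same landmark spots, in the same
    order, as (x, y) positions on the two gels.
    """
    scale_x, offset_x = fit_axis([p[0] for p in markers_gel],
                                 [p[0] for p in markers_ref])
    scale_y, offset_y = fit_axis([p[1] for p in markers_gel],
                                 [p[1] for p in markers_ref])
    return lambda spot: (scale_x * spot[0] + offset_x,
                         scale_y * spot[1] + offset_y)


# Invented marker positions: the second gel ran 10% wider and 10% shorter.
markers_on_gel = [(10.0, 20.0), (50.0, 80.0), (90.0, 40.0)]
markers_on_ref = [(1.1 * x + 5.0, 0.9 * y - 3.0) for x, y in markers_on_gel]
align = make_aligner(markers_on_gel, markers_on_ref)
print(align((30.0, 30.0)))
```

Once both gels share one coordinate system, spots can be paired by position and their intensities compared between treatments.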

Recently (as we saw in our last homework) many workers have attempted to develop protein chips

  • attach probes to chips at fixed locations
    • antibodies (to detect specific kinds of proteins)
    • specific proteins (to detect protein:protein interactions)
    • specific DNA sequences (to detect transcription factors or other proteins that bind to specific DNA sequences)
    • specific RNA sequences (to detect RNA-binding proteins)
    • other kinds of chemicals
  • Protein extracts are then fluorescently labeled and washed over the chip
  • After washing off unbound targets the chip is scanned, then data is analyzed as for a microarray
  • Problems:
    • proteins must retain their 3-D structure and function for many (most?) of the procedures
      • designing protocols that will retain the activity of all the proteins in an extract is challenging!
    • proteins may interact with multiple targets
      • rules for protein/ligand interactions are more complex than for DNA:DNA or DNA:RNA interactions
    • binding kinetics and affinities differ between proteins, and are more difficult to standardize than hybridization times

Many other tools are being developed to study the proteome

  • Many groups (and companies) are developing techniques to study proteomes using liquid chromatography-based approaches, e.g. multiple stage HPLC
  • The two-hybrid system is being used to identify protein-protein interactions on a genome-wide scale
  • expression libraries are being developed to allow each protein to be purified in useful amounts

Making sense of the proteome

One level is trying to link it with metabolism

Many applications have been written for modeling enzyme kinetics, e.g. to calculate Km, Vmax, binding constants, etc.
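
As a small illustration of the kind of calculation such programs perform, the sketch below estimates Vmax and Km from substrate concentrations and initial rates using a Lineweaver-Burk (double-reciprocal) fit: 1/v = (Km/Vmax)(1/[S]) + 1/Vmax. It assumes simple Michaelis-Menten kinetics, and the data are synthetic. Dedicated kinetics software generally prefers nonlinear regression, because the double-reciprocal transform exaggerates errors at low substrate concentrations.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, rates):
    """Estimate (Vmax, Km) from a straight-line fit of 1/v against 1/[S]."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept   # intercept = 1 / Vmax
    km = slope * vmax        # slope = Km / Vmax
    return vmax, km


# Synthetic, noise-free data generated with Vmax = 10 and Km = 2 (arbitrary units).
concentrations = [0.5, 1.0, 2.0, 5.0, 10.0]
measured = [michaelis_menten(s, vmax=10.0, km=2.0) for s in concentrations]
print(lineweaver_burk_fit(concentrations, measured))
```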

Several sites have compiled databases of enzymes and everything known about them

Online metabolic pathways charts are available at many sites, e.g. http://www.expasy.ch/cgi-bin/search-biochem-index

Several sites have developed interactive metabolic pathways

  • these allow you to enter an enzyme and find out which pathways it is part of
  • alternatively, you can query by pathway, substrate, cofactor, etc.
  • EMP (enzymes and metabolic pathways database) allows you to query by enzyme, pathway, cellular location, or by any step in the pathway from initial substrate to final product. http://emp.mcs.anl.gov/
  • KEGG and WIT allow you to perform similar queries, but also allow you to compare pathways between organisms

KEGG (Kyoto encyclopedia of genes and genomes) is intended to allow you to go from a genome to understanding the metabolic pathways that it encodes.

  • You can query using raw DNA sequence, then find the gene, the protein it encodes and the pathway(s) it participates in. http://www.genome.ad.jp/kegg/

WIT (What is There?) is designed to go a step further, and allow you to perform interactive metabolic reconstruction on the web. http://wit.mcs.anl.gov/WIT2/

  • As with KEGG, you can go from a DNA sequence to the pathways in which the enzyme it encodes participates.
  • You can also query which organisms perform a particular transformation
  • You can also ask whether there are alternative ways to perform a particular transformation, and which organisms do it
  • You can search for clusters of orthologous genes in various organisms, and look for connected functions
    • perhaps a gene which doesn't make sense in organism X makes perfect sense in organism Y, because there it has been placed in a metabolic pathway

PathDB does this and a bit more. It allows you to discover pathways, i.e. explore all ways to get from metabolite A to metabolite B. http://www.ncgr.org/software/pathdb/
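
The idea of exploring every route from metabolite A to metabolite B can be pictured as a path search over a graph whose nodes are metabolites and whose edges are reactions. The sketch below is a generic depth-first enumeration of simple paths. It is not PathDB's actual algorithm, and the four-metabolite network is an abstract toy, not real biochemistry.

```python
def all_paths(graph, start, goal, path=None):
    """Return every route from start to goal that visits no metabolite twice.

    graph maps each metabolite to the metabolites reachable from it in one
    reaction step.
    """
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for product in graph.get(start, []):
        if product not in path:  # skip cycles
            found.extend(all_paths(graph, product, goal, path))
    return found


# Abstract toy network: A is converted to B or C, C to B or D, and B to D.
toy_network = {"A": ["B", "C"], "B": ["D"], "C": ["B", "D"], "D": []}
for route in all_paths(toy_network, "A", "D"):
    print(" -> ".join(route))
```

On a genome-scale network the number of simple paths grows very quickly, so a practical tool would need to limit path length or exclude ubiquitous metabolites such as water and ATP.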

Metabolomics

Metabolomics studies the metabolome: the sum of all small molecular weight metabolites inside a cell. http://gepasi.dbs.aber.ac.uk/dbk/metabol.htm

  • requires a combination of techniques, including HPLC, GC-mass spectrometry, ESI-mass spectrometry, etc.
  • use these techniques to inventory and quantitate all the metabolites inside a cell
    • allows you to study the regulation of entire pathways (or more)
  • the resulting data are analyzed using Metabolic Control Analysis
    • allows you to infer the effect of each enzyme on flux through the pathway: its control coefficient
    • allows you to predict the effect of increasing or decreasing its activity on the cell
  • Many groups have now jumped on this bandwagon and are performing their own metabolomic research
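
To make the control coefficient concrete: the flux control coefficient of an enzyme is the fractional change in steady-state flux J divided by the fractional change in that enzyme's activity E, i.e. d ln J / d ln E. The sketch below estimates it numerically for a made-up two-step pathway, S0 -> X -> P, with S0 held constant, a reversible first step and an irreversible second step. The rate laws and rate constants are assumptions chosen only so the steady state can be written down by hand.

```python
import math


def steady_state_flux(e1, e2, s0=1.0, k1=1.0, k1r=0.5, k2=2.0):
    """Steady-state flux through S0 -> X -> P.

    Assumed rate laws: v1 = e1 * (k1 * s0 - k1r * x) and v2 = e2 * k2 * x.
    Setting v1 = v2 gives the steady-state concentration of X.
    """
    x = e1 * k1 * s0 / (e1 * k1r + e2 * k2)
    return e2 * k2 * x


def control_coefficient(flux, enzymes, index, step=1e-4):
    """Estimate d ln J / d ln E for one enzyme by a central difference."""
    up, down = list(enzymes), list(enzymes)
    up[index] *= 1 + step
    down[index] *= 1 - step
    d_ln_flux = math.log(flux(*up)) - math.log(flux(*down))
    d_ln_enzyme = math.log(1 + step) - math.log(1 - step)
    return d_ln_flux / d_ln_enzyme


levels = [1.0, 1.0]  # relative activities of enzyme 1 and enzyme 2
c1 = control_coefficient(steady_state_flux, levels, 0)
c2 = control_coefficient(steady_state_flux, levels, 1)
print(c1, c2, c1 + c2)
```

In this toy pathway most of the control lies with the first enzyme, and the two coefficients add up to 1, as the summation theorem of Metabolic Control Analysis requires. Control is shared between steps, so there need not be a single "rate-limiting" enzyme.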

Gepasi (General Pathway Simulator) is a program that allows you to simulate effects of changing various conditions on flux through a pathway http://gepasi.dbs.aber.ac.uk/softw/gepasi.html
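
The flavor of such a simulation can be shown with a very small example. The sketch below is not Gepasi and does not use its file formats or methods: it simply integrates a hypothetical pathway S0 -> X -> P (S0 held constant) with fixed-step Euler integration, using made-up rate laws and constants, and reports the flux once the intermediate X has settled.

```python
def simulate_flux(e1, e2, s0=1.0, k1=1.0, k1r=0.5, k2=2.0,
                  dt=0.01, steps=2000):
    """Integrate dX/dt = v1 - v2 from X = 0 and return the final flux v2.

    Assumed rate laws: v1 = e1 * (k1 * s0 - k1r * x) and v2 = e2 * k2 * x.
    """
    x = 0.0
    for _ in range(steps):
        v1 = e1 * (k1 * s0 - k1r * x)  # reversible step producing X
        v2 = e2 * k2 * x               # irreversible step consuming X
        x += (v1 - v2) * dt            # Euler update
    return e2 * k2 * x


# Compare the flux before and after doubling the activity of enzyme 2.
baseline = simulate_flux(e1=1.0, e2=1.0)
doubled = simulate_flux(e1=1.0, e2=2.0)
print(baseline, doubled)
```

Doubling the second enzyme raises the flux only modestly in this toy model, because most of the control lies with the first step. A real simulator would use an adaptive ODE solver and let the user define arbitrary rate laws.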

Work is now underway to develop a successor to Gepasi called COPASI, but it isn't yet available to the public http://www.vbi.vt.edu/research/projects/resproj_mendes_copasi.htm

(The lead researcher is presently looking for graduate students to work on this project: https://www.vbi.vt.edu/article/articleview/104/1/49/)

 





Last update: Tuesday, October 14, 2003 at 6:50:32 PM.