Protein Tertiary Structure Prediction Structural Bioinformatics.

Date posted: 19-Dec-2015

Page 1: Protein Tertiary Structure Prediction Structural Bioinformatics.

Protein Tertiary Structure Prediction

Structural Bioinformatics

Page 2: Protein Tertiary Structure Prediction Structural Bioinformatics.

Primary: amino acid linear sequence.

Secondary: α-helices, β-sheets and loops.

Tertiary: the 3D shape of the fully folded polypeptide chain

The Different levels of Protein Structure

Page 3: Protein Tertiary Structure Prediction Structural Bioinformatics.

Predicting 3D Structure

An outstanding, difficult problem

– Comparative modeling (homology): based on sequence homology

– Fold recognition (threading): based on structural homology

Page 4: Protein Tertiary Structure Prediction Structural Bioinformatics.

Comparative Modeling

Similar sequences suggest similar structure

Page 5: Protein Tertiary Structure Prediction Structural Bioinformatics.

Sequence and structure alignments of two retinol-binding proteins

Page 6: Protein Tertiary Structure Prediction Structural Bioinformatics.

Structure Alignments

The outputs of a structural alignment are a superposition of the atomic coordinates and a minimal Root Mean Square Distance (RMSD) between the structures. The RMSD of two aligned structures indicates their divergence from one another.

Low values of RMSD mean similar structures

There are many different algorithms for structural alignment.
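As a minimal sketch of the RMSD calculation, the Python function below assumes the two structures have already been superposed and that atoms are paired one-to-one (for example, equivalent Cα atoms). Finding the optimal superposition, which a real structural alignment program also does, is not shown. The function name and the sample coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes both structures are already superposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Illustrative coordinates, not taken from a real structure.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]
print(rmsd(a, a))  # identical structures give 0.0
print(rmsd(a, b))  # one atom displaced by 3 units gives sqrt(3), about 1.732
```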

Page 7: Protein Tertiary Structure Prediction Structural Bioinformatics.

Comparative Modeling

Builds a protein structure model based on its alignment to one or more related protein structures in the database

Similar sequence suggests similar structure

Page 8: Protein Tertiary Structure Prediction Structural Bioinformatics.

Comparative Modeling

• Accuracy of the comparative model is related to the sequence identity on which it is based:

– >50% sequence identity = high accuracy

– 30%-50% sequence identity = 90% modeled

– <30% sequence identity = low accuracy (many errors)
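The identity bands above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and counting identity over aligned (non-gap) columns is an assumption, since the slide does not say which denominator is used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap.

    The choice of denominator (non-gap columns) is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def expected_accuracy(identity_pct):
    """Map sequence identity to the accuracy bands listed on the slide."""
    if identity_pct > 50:
        return "high accuracy"
    if identity_pct >= 30:
        return "90% modeled"
    return "low accuracy (many errors)"


# Illustrative alignment: 6 identical residues out of 8 gap-free columns.
identity = percent_identity("MKT-AYIAK", "MKTGAYLAR")
print(identity, expected_accuracy(identity))
```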

Page 9: Protein Tertiary Structure Prediction Structural Bioinformatics.

Homology Threshold for Different Alignment Lengths

[Figure: Homology Threshold (t), axis 0 to 90, plotted against Alignment length (L), axis 0 to 100]

A sequence alignment between two proteins is considered to imply structural homology if the sequence identity is equal to or above the homology threshold t in a sequence region of a given length L.

The threshold values t(L) are derived from the PDB.
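The threshold rule can be written as a one-line test. In the Python sketch below the curve t(L) is passed in as a function, because the transcript does not reproduce the PDB-derived values. The `toy_threshold` curve is a made-up placeholder for illustration, not the real t(L).

```python
def implies_structural_homology(identity_pct, length, threshold):
    """True if identity over a region of the given length reaches t(length)."""
    return identity_pct >= threshold(length)


def toy_threshold(length):
    """Placeholder curve, NOT the PDB-derived t(L).

    It only mimics the general idea that shorter alignments need higher
    identity before structural homology is inferred.
    """
    if length >= 80:
        return 25.0
    return 25.0 + (80 - length) * 0.5


print(implies_structural_homology(60.0, 20, toy_threshold))   # short but very similar
print(implies_structural_homology(40.0, 20, toy_threshold))   # short and only moderately similar
print(implies_structural_homology(40.0, 100, toy_threshold))  # same identity over a long region
```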

Page 10: Protein Tertiary Structure Prediction Structural Bioinformatics.

Comparative Modeling

• Similarity particularly high in core

– Alpha helices and beta sheets preserved

– Even near-identical sequences vary in loops

Page 11: Protein Tertiary Structure Prediction Structural Bioinformatics.

Comparative Modeling Methods

MODELLER (Sali, Rockefeller/UCSF)

SCWRL (Dunbrack, UCSF)

SWISS-MODEL http://swissmodel.expasy.org//SWISS-MODEL.html

Page 12: Protein Tertiary Structure Prediction Structural Bioinformatics.

Comparative Modeling

Modeling of a sequence based on known structures

Consists of four major steps:

1. Finding known structure(s) related to the sequence to be modeled (templates), using sequence comparison methods such as PSI-BLAST

2. Aligning sequence with the templates

3. Building a model

4. Assessing the model
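Steps 2 and 3 can be illustrated in their crudest form: given a target-template alignment, copy the template coordinates to every aligned target residue and record the residues that have no template counterpart. The Python sketch below does only that. Real modeling programs do far more (loop building, side-chain placement, refinement), and the function name, the alignment, and the coordinates are all hypothetical.

```python
def transfer_coordinates(target_aln, template_aln, template_coords, gap="-"):
    """Copy template coordinates onto aligned target residues.

    Returns (model, unmodeled):
      model      maps target residue index -> (x, y, z) taken from the template
      unmodeled  lists target residue indices aligned to a template gap
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    model, unmodeled = {}, []
    t_idx = 0  # position in the ungapped target sequence
    s_idx = 0  # position in the ungapped template sequence
    for t_char, s_char in zip(target_aln, template_aln):
        if t_char != gap and s_char != gap:
            model[t_idx] = template_coords[s_idx]
        elif t_char != gap:
            unmodeled.append(t_idx)  # insertion in the target, e.g. a loop
        if t_char != gap:
            t_idx += 1
        if s_char != gap:
            s_idx += 1
    return model, unmodeled


# Hypothetical template with six residues spaced along the x-axis.
template_coords = [(3.8 * i, 0.0, 0.0) for i in range(6)]
model, unmodeled = transfer_coordinates("MKT-AYG", "MRTLA-G", template_coords)
print(sorted(model))  # target residues that received coordinates
print(unmodeled)      # target residues still to be built
```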

Page 13: Protein Tertiary Structure Prediction Structural Bioinformatics.

Fold Recognition

Page 14: Protein Tertiary Structure Prediction Structural Bioinformatics.

[Figure: example folds, Hemoglobin and TIM]

Protein Folds: sequential and spatial arrangement of secondary structures

Page 15: Protein Tertiary Structure Prediction Structural Bioinformatics.

Similar folds usually mean similar function

Homeodomain transcription factors

Page 16: Protein Tertiary Structure Prediction Structural Bioinformatics.

The same fold can have multiple functions

Rossmann

TIM barrel

12 functions

31 functions

Page 17: Protein Tertiary Structure Prediction Structural Bioinformatics.

Fold Recognition

• Methods of protein fold recognition attempt to detect similarities between protein 3D structures that have no significant sequence similarity.

• Search for folds that are compatible with a particular sequence.

• They "turn the protein folding problem on its head": rather than predicting how a sequence will fold, they predict how well a fold will fit a sequence.

Page 18: Protein Tertiary Structure Prediction Structural Bioinformatics.

Basic steps in Fold Recognition :

Compare sequence against a Library of all known Protein Folds (finite number)

Query sequence

MTYGFRIPLNCERWGHKLSTVILKRP...

Goal: find the fold template that the sequence fits best

There are different ways to evaluate sequence-structure fit

Page 19: Protein Tertiary Structure Prediction Structural Bioinformatics.

[Figure: the query sequence MAHFPGFGQSLLFGYPVYVFGD... is compared with each potential fold 1) ... 56) ... n) in the library, giving scores -10 ... -123 ... 20.5]

There are different ways to evaluate sequence-structure fit
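One very simple way to score sequence-structure fit is sketched below in Python. It assumes each fold template is described only by a string of residue environments (B = buried, E = exposed), that hydrophobic residues prefer buried positions, and that lower scores mean a better fit. These are assumptions for illustration; the scoring rule, the residue classes, and the fold names are hypothetical, and real fold recognition programs use richer scoring functions and gapped alignments.

```python
HYDROPHOBIC = set("AVLIMFWC")  # rough residue classes, assumed for this sketch


def position_score(residue, environment):
    """-1 if the residue suits the environment (hydrophobic/buried or polar/exposed), else +1."""
    hydrophobic = residue in HYDROPHOBIC
    buried = environment == "B"
    return -1.0 if hydrophobic == buried else 1.0


def fit_score(sequence, environments):
    """Best (lowest) ungapped score over all offsets of the sequence on the template."""
    n, m = len(sequence), len(environments)
    if n > m:
        return float("inf")  # the template is too short to hold the sequence
    return min(
        sum(position_score(res, environments[offset + i]) for i, res in enumerate(sequence))
        for offset in range(m - n + 1)
    )


def best_fold(sequence, library):
    """Score the sequence against every fold in the library and return the best one."""
    scores = {name: fit_score(sequence, env) for name, env in library.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fold library: environment strings per template position.
library = {"fold_1": "BEBEBB", "fold_2": "EEEBBB", "fold_3": "EBEBEE"}
winner, scores = best_fold("LKVEAL", library)
print(winner, scores)
```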

Page 20: Protein Tertiary Structure Prediction Structural Bioinformatics.

Programs for fold recognition

• TOPITS (Rost 1995)

• GenTHREADER (Jones 1999)

• SAM-T02 (UCSC HMM)

• 3D-PSSM http://www.sbg.bio.ic.ac.uk/~3dpssm/

Page 21: Protein Tertiary Structure Prediction Structural Bioinformatics.

Ab Initio Modeling

• Compute molecular structure from the laws of physics and chemistry alone

Theoretically the ideal solution

Practically nearly impossible

Why?

– Exceptionally complex calculations

– Biophysics understanding incomplete
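A back-of-the-envelope count shows why the calculations are so demanding. The Python sketch below assumes each residue can adopt a small fixed number of backbone states and that conformations are checked at a fixed rate. Both numbers are illustrative assumptions, not values from the slides.

```python
def conformations(n_residues, states_per_residue=3):
    """Number of chain conformations if every residue independently has a few states."""
    return states_per_residue ** n_residues


def years_to_enumerate(n_residues, states_per_residue=3, rate_per_second=1e12):
    """Time to visit every conformation at an assumed sampling rate."""
    seconds = conformations(n_residues, states_per_residue) / rate_per_second
    return seconds / (3600 * 24 * 365)


# A 100-residue chain with 3 states per residue has 3**100 conformations.
print(conformations(100))
print(years_to_enumerate(100))
```

Under these assumptions exhaustive search is out of reach even for a small protein, which is why practical methods sample conformations rather than enumerate them.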

Page 22: Protein Tertiary Structure Prediction Structural Bioinformatics.

Ab Initio Methods

• Rosetta (Baker lab, Seattle)

• Undertaker (Karplus, UCSC)

Page 23: Protein Tertiary Structure Prediction Structural Bioinformatics.

CASP - Critical Assessment of Structure Prediction

• Competition among different groups for predicting the 3D structure of proteins that are about to be solved experimentally.

• Current state:

– Ab initio: the worst, but greatly improved in recent years.

– Modeling: performs very well when homologous sequences with known structures exist.

– Fold recognition: performs well.

Page 24: Protein Tertiary Structure Prediction Structural Bioinformatics.

What can you do?

FOLDIT

Solve Puzzles for Science

A computer game to fold proteins

http://fold.it/portal/puzzles

Page 25: Protein Tertiary Structure Prediction Structural Bioinformatics.

What’s Next

Predicting function from structure

Page 26: Protein Tertiary Structure Prediction Structural Bioinformatics.

Structural Genomics: a large-scale structure determination project designed to cover all representative protein structures

Zarembinski et al., Proc. Natl. Acad. Sci. USA 95:15189 (1998)

ATP binding domain of protein MJ0577

Page 27: Protein Tertiary Structure Prediction Structural Bioinformatics.

As a result of the Structural Genomics initiative, many structures of proteins with unknown function will be solved

Wanted! Automated methods to predict function from the protein structures resulting from the structural genomics project.

Page 28: Protein Tertiary Structure Prediction Structural Bioinformatics.

Approaches for predicting function from structure

ConSurf - Mapping evolutionary conservation onto the protein structure http://consurf.tau.ac.il/
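To give a feel for what a per-residue conservation score looks like, the Python sketch below scores each column of a multiple sequence alignment with a normalized Shannon entropy. This is a simple stand-in chosen for illustration and is not ConSurf's own procedure. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_conservation(column, gap="-"):
    """1 minus the Shannon entropy (log base 20) of the residues in one column.

    1.0 means fully conserved; lower values mean more variable.
    """
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log(n / total, 20) for n in counts.values())
    return 1.0 - entropy


def conservation_profile(alignment):
    """Conservation score for every column of an alignment given as equal-length strings."""
    return [column_conservation(col) for col in zip(*alignment)]


# Toy alignment of four sequences; each score would be mapped onto the matching residue.
profile = conservation_profile(["MKLA", "MRLA", "MKLG", "MQLA"])
print([round(score, 3) for score in profile])
```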

Page 29: Protein Tertiary Structure Prediction Structural Bioinformatics.

Approaches for predicting function from structure

PFPlus – Identifying positive electrostatic patches on the protein structure http://pfp.technion.ac.il/

Page 30: Protein Tertiary Structure Prediction Structural Bioinformatics.

A method to distinguish DNA from RNA-binding proteins

DNA binding interface RNA binding interface

Page 31: Protein Tertiary Structure Prediction Structural Bioinformatics.

DNA binding interface RNA binding interface

RNA and DNA binding interfaces tend to have different geometric features

Page 32: Protein Tertiary Structure Prediction Structural Bioinformatics.

Applying Differential Geometry to characterize DNA and RNA binding proteins

H = (k1 + k2)/2 (mean curvature)

K = k1 * k2 (Gaussian curvature)

k1: minimal principal curvature

k2: maximal principal curvature

Page 33: Protein Tertiary Structure Prediction Structural Bioinformatics.

Applying Differential Geometry to characterize DNA and RNA binding proteins

[Figure: surface types: Peak, Pit, Ridge, Valley, Flat, Minimal Surface, Saddle ridge, Saddle valley]
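The eight surface types can be assigned from the signs of the mean curvature H = (k1 + k2)/2 and the Gaussian curvature K = k1 * k2. The Python sketch below uses a common sign table for this kind of classification. The sign convention (negative H for peaks and ridges), the tolerance `eps`, and the function names are assumptions, since the slides do not state them.

```python
def mean_and_gaussian(k1, k2):
    """Mean curvature H and Gaussian curvature K from the principal curvatures."""
    return (k1 + k2) / 2.0, k1 * k2


def _sign(value, eps):
    """-1, 0 or +1, treating values within eps of zero as zero."""
    if abs(value) < eps:
        return 0
    return 1 if value > 0 else -1


# (sign of H, sign of K) -> surface type; the convention is assumed, see lead-in.
SURFACE_TYPES = {
    (-1, 1): "peak",
    (1, 1): "pit",
    (-1, 0): "ridge",
    (1, 0): "valley",
    (0, 0): "flat",
    (0, -1): "minimal surface",
    (-1, -1): "saddle ridge",
    (1, -1): "saddle valley",
}


def surface_type(k1, k2, eps=1e-6):
    """Classify a surface point from its minimal (k1) and maximal (k2) curvature."""
    h, k = mean_and_gaussian(k1, k2)
    return SURFACE_TYPES.get((_sign(h, eps), _sign(k, eps)), "undefined")


print(surface_type(-1.0, -0.5))  # both curvatures negative
print(surface_type(-1.0, 1.0))   # equal and opposite curvatures
```

Counting how often each type occurs over the points of a binding surface gives a frequency profile for that surface.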

Page 34: Protein Tertiary Structure Prediction Structural Bioinformatics.

Applying Differential Geometry for DNA and RNA function prediction

[Figure; y-axis: Frequency of points]

Page 35: Protein Tertiary Structure Prediction Structural Bioinformatics.

RNA binding surfaces are distinguished from DNA binding surfaces based on differential geometric features

78% DNA-binding, 76% RNA-binding

Page 36: Protein Tertiary Structure Prediction Structural Bioinformatics.

Differential Geometry can correctly determine whether a given binding domain binds RNA or DNA

[Figure: RNA pattern and DNA pattern; y-axis: Frequency of points]

Shazman et al, NAR 2011

Page 37: Protein Tertiary Structure Prediction Structural Bioinformatics.

How can we view the protein structure?

• Download the coordinates of the structure from the PDB http://www.rcsb.org/pdb/

• Launch a 3D viewer program. For example, we will use the program PyMOL, which can be downloaded freely from the PyMOL homepage http://pymol.org

• Load the coordinates into the viewer
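The coordinate file is plain text, so it can also be inspected without a viewer. The Python sketch below reads the Cα atoms from PDB-format ATOM records using the format's fixed column positions. It is a minimal illustration (no alternate locations, multiple models or error handling), and the function name and the file name in the usage comment are hypothetical.

```python
def parse_ca_atoms(pdb_text):
    """Extract C-alpha atoms from PDB-format text using fixed-width ATOM columns."""
    atoms = []
    for line in pdb_text.splitlines():
        # Columns 1-6 hold the record name, columns 13-16 the atom name.
        if line[0:6] == "ATOM  " and line[12:16].strip() == "CA":
            atoms.append(
                {
                    "res_name": line[17:20].strip(),  # columns 18-20
                    "chain": line[21],                # column 22
                    "res_seq": int(line[22:26]),      # columns 23-26
                    "xyz": (
                        float(line[30:38]),           # x, columns 31-38
                        float(line[38:46]),           # y, columns 39-46
                        float(line[46:54]),           # z, columns 47-54
                    ),
                }
            )
    return atoms


# Usage with a downloaded file (the file name is an assumption):
# with open("1aqb.pdb") as handle:
#     ca_atoms = parse_ca_atoms(handle.read())
#     print(len(ca_atoms), "C-alpha atoms")
```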

Page 38: Protein Tertiary Structure Prediction Structural Bioinformatics.

PyMOL example

• Launch PyMOL

• Open file “1aqb” (PDB coordinate file)

• Display sequence

• Hide everything

• Show main chain / hide main chain

• Show cartoon

• Color by ss

• Color red

• Color green, resi 1:40

Help : http://pymol.org
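The same session can be scripted. The Python sketch below builds a list of PyMOL command-line equivalents for the menu steps and, if PyMOL's Python module is installed, runs them with `cmd.do`. The local file name `1aqb.pdb`, the helper names, and the choice of commands for "main chain" and "color by ss" are assumptions, since the slide describes menu actions rather than typed commands.

```python
def pymol_commands(pdb_file="1aqb.pdb"):
    """PyMOL command-line equivalents of the steps in the example (assumed mapping)."""
    return [
        f"load {pdb_file}",           # open the coordinate file
        "set seq_view, 1",            # display sequence
        "hide everything",
        "show lines, name N+CA+C+O",  # show main chain
        "hide lines",                 # hide main chain
        "show cartoon",
        "color red, ss h",            # color by secondary structure: helices
        "color yellow, ss s",         # strands
        "color green, ss l+''",       # loops and unassigned residues
        "color red",                  # color everything red
        "color green, resi 1-40",     # color residues 1 to 40 green
    ]


def run_in_pymol(commands):
    """Execute the commands inside PyMOL; requires the pymol Python module."""
    from pymol import cmd  # imported lazily so the list can be built without PyMOL

    for command in commands:
        cmd.do(command)


for command in pymol_commands():
    print(command)
```

The printed lines can also be typed one by one at the PyMOL prompt.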