Page 1: Alpha-helical Topology and Tertiary Structure Prediction ...

Alpha-helical Topology and Tertiary Structure Prediction of Globular Proteins

Scott R. McAllister
Christodoulos A. Floudas

Princeton University

• Department of Chemical Engineering
• Program of Applied and Computational Mathematics
• Department of Operations Research and Financial Engineering
• Center for Quantitative Biology

Page 2: Alpha-helical Topology and Tertiary Structure Prediction ...

Outline
• Protein structure prediction overview
• Predicting α-helical contacts
  - Probability development
  - Model
  - Results
• Predicting α-helical contacts in α/β proteins
  - Distance bounding
  - Model
  - Results
• Structure prediction of α-helical proteins
  - Framework
  - Results

Page 3: Alpha-helical Topology and Tertiary Structure Prediction ...

Protein Structure Prediction

Problem
• Given an amino acid sequence, identify the three-dimensional protein structure

Approaches
• Homology modeling
• Fold recognition/threading
• Fragment assembly
• First Principles - Optimization
  - Statistical potentials
  - Physics-based potentials

[Figure: amino acid sequence …TLQAETDQLEDEKSALQ… mapped to an unknown three-dimensional structure]

Floudas, et al. Chemical Engineering Science. 2006, 61:966-988.
Floudas. AIChE Journal. 2005, 51:1872-1884.
Floudas. Biotechnology & Bioengineering. 2007.

Page 4: Alpha-helical Topology and Tertiary Structure Prediction ...

ASTRO-FOLD

Derivation of Restraints (Reduced Search Space)
- Dihedral angle restrictions
- Cα distance constraints

Helix Prediction (Free Energy Calculations)
- Detailed atomistic modeling
- Simulations of local interactions

Overall 3D Structure Prediction (Global Optimization and Molecular Dynamics)
- Structural data from previous stages
- Prediction via novel solution approach

Loop Structure Prediction (Novel Clustering Methodology)
- Dihedral angle sampling
- Discard conformers by clustering

β-sheet Prediction (Combinatorial Optimization, MILP)
- Novel hydrophobic modeling
- Predict list of optimal topologies

Klepeis, JL and Floudas, CA. Biophys J. (2003)

Page 5: Alpha-helical Topology and Tertiary Structure Prediction ...

Outline
• Protein structure prediction overview
• Predicting α-helical contacts
  - Probability development
  - Model
  - Results
• Predicting α-helical contacts in α/β proteins
  - Distance bounding
  - Model
  - Results
• Structure prediction of α-helical proteins
  - Framework
  - Results

Page 6: Alpha-helical Topology and Tertiary Structure Prediction ...

Overview

Problem
• Topology prediction of globular α-helical proteins

Approach
• Thesis: topology is based on certain inter-helical hydrophobic-to-hydrophobic contacts
• Create a dataset of helical proteins
• Develop inter-helical contact probabilities
• Apply two novel mixed-integer linear programming (MILP) models
  - Level 1: PRIMARY contacts
  - Level 2: WHEEL contacts

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 7: Alpha-helical Topology and Tertiary Structure Prediction ...

Dataset Selection

Protein Sources
• 229 from the PDBSelect25 database [1]
• 62 from the CATH database [2]
• 20 from Zhang et al. [3]
• 7 from Huang et al. [4]

Restrictions
• No β-sheets, at least 2 α-helices
• No highly similar sequences

Dataset
• 318 proteins in the database set

[1] Hobohm, U. and C. Sander. Prot Sci 3 (1994) 522.
[2] Orengo, C.A. et al. Structure 5 (1997) 1093.
[3] Zhang, C. et al. PNAS 99 (2002) 3581.
[4] Huang, E.S. et al. J Mol Biol 290 (1999) 267.

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 8: Alpha-helical Topology and Tertiary Structure Prediction ...

Probability Development

Contact Types
• PRIMARY contact: minimum-distance hydrophobic contact between 4.0 Å and 10.0 Å
• WHEEL contact: only WHEEL-position hydrophobic contacts between 4.0 Å and 12.0 Å
• Contacts are classified as parallel or antiparallel

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.
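The two distance windows on this slide can be written down directly. The following is a minimal sketch, not the authors' code: the function name `contact_kind`, the `wheel_position` flag, and the choice of inclusive endpoints are assumptions made for illustration; only the 4.0-10.0 Å and 4.0-12.0 Å windows come from the slide.

```python
# Distance windows in Å, as quoted on the slide.
PRIMARY_WINDOW = (4.0, 10.0)
WHEEL_WINDOW = (4.0, 12.0)


def contact_kind(distance, wheel_position=False):
    """Label a hydrophobic residue pair by the window its distance falls in.

    `wheel_position` is a hypothetical flag meaning the pair occupies
    WHEEL positions; how those positions are chosen is not covered here.
    Endpoints are treated as inclusive, which is an assumption.
    """
    low, high = WHEEL_WINDOW if wheel_position else PRIMARY_WINDOW
    if low <= distance <= high:
        return "WHEEL" if wheel_position else "PRIMARY"
    return None  # outside the window: not counted as a contact


print(contact_kind(6.0))                        # PRIMARY
print(contact_kind(11.0))                       # None
print(contact_kind(11.0, wheel_position=True))  # WHEEL
```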

Page 9: Alpha-helical Topology and Tertiary Structure Prediction ...

Model Overview

Formulation: maximize inter-helical residue-residue contact probabilities
• Binary variable y (indexed by helix pair m,n; superscript A) indicates an antiparallel helical contact
• Binary variable w (indexed by residue pair i,j in helix pair m,n) indicates a residue contact

Goal: produce a rank-ordered list of the most likely helical contacts
• Contacts are used to restrict the conformational space explored during protein tertiary structure prediction

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 10: Alpha-helical Topology and Tertiary Structure Prediction ...

Pairwise Model Objective

Level 1 Objective
• Maximize the probability of pairwise residue-residue contacts

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 11: Alpha-helical Topology and Tertiary Structure Prediction ...

Pairwise Model Constraints

Level 1 Constraints
• At most one contact per position
• Helix-helix interaction direction
• Linking interaction variables

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.
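To make the objective and the "at most one contact per position" constraint concrete, here is a toy sketch. It is a brute-force enumeration, not the MILP from the talk, and it ignores interaction direction and the linking variables. The function name `best_contacts`, the cap `max_contacts`, the residue numbers, and the probabilities are all made up for illustration.

```python
from itertools import combinations


def best_contacts(prob, max_contacts):
    """Enumerate contact sets and keep the one with the largest probability sum.

    prob: dict mapping a residue pair (i, j) to a contact probability.
    A residue position may appear in at most one selected contact.
    """
    pairs = list(prob)
    best, best_score = (), 0.0
    for size in range(1, max_contacts + 1):
        for subset in combinations(pairs, size):
            used = [residue for pair in subset for residue in pair]
            if len(used) != len(set(used)):
                continue  # some position would carry two contacts
            score = sum(prob[pair] for pair in subset)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score


# Hypothetical probabilities for residues 3 and 7 (one helix)
# against residues 12 and 16 (another helix).
toy = {(3, 12): 0.30, (3, 16): 0.25, (7, 12): 0.20, (7, 16): 0.10}
contacts, score = best_contacts(toy, max_contacts=2)
print(contacts, round(score, 2))
```

In this toy instance the single most probable pair, (3, 12), is not part of the best set: pairing it forces (7, 16), and 0.30 + 0.10 is smaller than 0.25 + 0.20. That coupling between contacts is why the problem is posed as an optimization rather than a per-pair ranking.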

Page 12: Alpha-helical Topology and Tertiary Structure Prediction ...

Pairwise Model Constraints

Level 1 Constraints
• Restrict the number of contacts between a given helix pair (MAX_CONTACT)
• Vary the number of helix-helix interactions (SUBTRACT)

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 13: Alpha-helical Topology and Tertiary Structure Prediction ...

Pairwise Model Constraints

Level 1 Constraints
• Allow for and limit helical kinks

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 14: Alpha-helical Topology and Tertiary Structure Prediction ...

Pairwise Model Constraints

Level 1 Constraints
• Consistent numbering

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

[Figure: residues k, i, j, and l labeled]

Page 15: Alpha-helical Topology and Tertiary Structure Prediction ...

Pairwise Model Constraints

Feasible topologies

[Figure: helices m, n, and p]

Page 16: Alpha-helical Topology and Tertiary Structure Prediction ...

Pairwise Model Objective

Level 2 Objective
• Maximize the sum of predicted wheel probabilities

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 17: Alpha-helical Topology and Tertiary Structure Prediction ...

Pairwise Model Constraints

Level 2 Constraints
• Require at most one wheel contact for a specified primary contact

Level 2 Aim
• Distinguish between equally likely Level 1 predictions
• Increase the total number of contact predictions

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 18: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – 2-3 helix bundles

[Figure: PDB:1mbh and PDB:1nre, rendered in PyMOL]

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 19: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – 1nre Contact Predictions

PRIMARY Contact | PRIMARY Distance (Å) | WHEEL Contact | WHEEL Distance (Å) | Helix-Helix Interaction
25L-49L         | 6.0                  | 28L-45L       | 9.1                | 1-2 A
28L-83V         | 12.7                 | -             | -                  | 1-3 P
45L-85L         | 9.3                  | 49L-81L       | 8.1                | 2-3 A
51I-77L         | 9.3                  | -             | -                  | 2-3 A

A = antiparallel, P = parallel

Parameters: SUBTRACT = 0, MAX_CONTACT = 2

Page 20: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – 1hta Contact Predictions

PRIMARY Contact | PRIMARY Distance (Å) | Helix-Helix Interaction
5I-28L          | 9.1                  | 1-2 A
46L-62L         | 8.4                  | 2-3 A

A = antiparallel

Parameters: SUBTRACT = 0, MAX_CONTACT = 1

Page 21: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – Contact Prediction Summary

McAllister, Mickus, Klepeis, Floudas. Proteins. 2006, 65:930-952.

Page 22: Alpha-helical Topology and Tertiary Structure Prediction ...

Summary

• Thesis: topology of α-helical globular proteins is based on inter-helical hydrophobic-to-hydrophobic contacts
• Validated on α-helical globular proteins

Page 23: Alpha-helical Topology and Tertiary Structure Prediction ...

Outline
• Protein structure prediction overview
• Predicting α-helical contacts
  - Probability development
  - Model
  - Results
• Predicting α-helical contacts in α/β proteins
  - Distance bounding
  - Model
  - Results
• Structure prediction of α-helical proteins
  - Framework
  - Results

Page 24: Alpha-helical Topology and Tertiary Structure Prediction ...

Overview

Problem
• α-helical topology prediction of globular α/β proteins

Approach
• Predict/determine the secondary structure and β-sheet topology
• Establish bounds on inter-residue distances
• Apply a novel optimization model (MILP) to maximize the hydrophobicity of interhelical interactions

McAllister and Floudas. 2008, In preparation.

Helix Prediction (Free Energy Calculations)
- Detailed atomistic modeling
- Simulations of local interactions

β-sheet Prediction (Combinatorial Optimization)
- Novel hydrophobic modeling
- Predict list of optimal topologies

Page 25: Alpha-helical Topology and Tertiary Structure Prediction ...

Establishing Distance Bounds

Approach
• Use secondary structure location and β-sheet topology
• Develop local and non-local bounds (PDBSelect25)
• Local bounds are based on residue separation and secondary structure

Page 26: Alpha-helical Topology and Tertiary Structure Prediction ...

Establishing Distance Bounds

Non-local
• Extended β-contacts
• “Cross” β-contacts

Page 27: Alpha-helical Topology and Tertiary Structure Prediction ...

Tightening Distance Bounds
• Use of triangle inequality relationships
• Model is iteratively applied to determine the tightest distance bounds
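The slide does not show the tightening model itself. As a hedged illustration of how triangle inequalities tighten bounds, the sketch below uses classic iterative bound smoothing, which is a swapped-in technique and not necessarily the authors' optimization model. The function name `tighten_bounds` and the three-residue example values are assumptions. The rules applied are u_ij ≤ u_ik + u_kj and l_ij ≥ l_ik − u_kj.

```python
def tighten_bounds(lower, upper, tol=1e-12):
    """Tighten symmetric lower/upper distance-bound matrices in place.

    Repeats passes over all triples (i, j, k) until no bound changes.
    """
    n = len(upper)
    changed = True
    while changed:
        changed = False
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    if i == j or i == k or j == k:
                        continue
                    # Upper bound: the path i -> k -> j cannot be shorter than d_ij.
                    via_k = upper[i][k] + upper[k][j]
                    if via_k < upper[i][j] - tol:
                        upper[i][j] = upper[j][i] = via_k
                        changed = True
                    # Lower bound: d_ij >= d_ik - d_kj >= l_ik - u_kj.
                    gap = lower[i][k] - upper[k][j]
                    if gap > lower[i][j] + tol:
                        lower[i][j] = lower[j][i] = gap
                        changed = True
    return lower, upper


# Made-up bounds (Å) for three residues: d01 is fixed at 10, d12 lies in [3, 4],
# and d02 starts out essentially unconstrained.
lower = [[0.0, 10.0, 0.0], [10.0, 0.0, 3.0], [0.0, 3.0, 0.0]]
upper = [[0.0, 10.0, 100.0], [10.0, 0.0, 4.0], [100.0, 4.0, 0.0]]
tighten_bounds(lower, upper)
print(lower[0][2], upper[0][2])  # 6.0 14.0
```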

Page 28: Alpha-helical Topology and Tertiary Structure Prediction ...

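Objective Function
• Maximize the number of hydrophobic interactions between α-helices and their hydrophobicity
• The objective has two weighted terms, labeled "Number" and "Hydrophobicity"; the α values are weight factors
• Hydrophobicity is taken from the PRIFT scale*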

*Cornette et al. J Mol Biol. 1987, 195:659-85.

Page 29: Alpha-helical Topology and Tertiary Structure Prediction ...

Constraints

Residue contact constraints
• Residue i forms at most one contact with a residue in helix n
• Residue i forms at most two contacts
• Additional constraints limit the size of allowed helix kinks (similar to the constraints for α-helical topology prediction of α-helical proteins)

Page 30: Alpha-helical Topology and Tertiary Structure Prediction ...

Constraints

Residue contact constraints
• Disallow (i,i+2), (i,i+5), and (i,i+6) residue pairs from both having contacts with helix n
• These residues exist on opposite faces of a helix

[Figure: positions i, i+2, i+5, and i+6 labeled on a helix]
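The choice of offsets 2, 5, and 6 can be checked with a helical-wheel calculation. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100° of rotation per residue); the 120° cutoff for "opposite face" and the names `angular_separation` and `opposite` are assumptions for illustration, not values from the talk.

```python
DEGREES_PER_RESIDUE = 100.0  # ideal α-helix: 360° / 3.6 residues per turn


def angular_separation(offset):
    """Angle (0-180°) between residues i and i+offset viewed down the helix axis."""
    angle = (offset * DEGREES_PER_RESIDUE) % 360.0
    return min(angle, 360.0 - angle)


separations = {k: angular_separation(k) for k in range(1, 8)}
# Assumed cutoff: treat a separation of at least 120° as "opposite faces".
opposite = [k for k, angle in separations.items() if angle >= 120.0]

for k, angle in separations.items():
    print(f"i+{k}: {angle:.0f} degrees")
print(opposite)  # [2, 5, 6]
```

Offsets 3, 4, and 7 land within 60° of residue i, which is consistent with the usual picture of i, i+3, i+4, and i+7 sharing one face of the helix.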

Page 31: Alpha-helical Topology and Tertiary Structure Prediction ...

Constraints

Helix contact constraints
• Maximum of 2 helix-helix contacts for a helix
• Only 1 helix-helix interaction direction
• Ensure feasible topologies (similar to the constraints for α-helical topology prediction of α-helical proteins)

Page 32: Alpha-helical Topology and Tertiary Structure Prediction ...

Constraints

Relating residue contacts to helix contacts
• Ensure consistent numbering

[Figure: residues k, i, j, and l labeled]

Page 33: Alpha-helical Topology and Tertiary Structure Prediction ...

Constraints

Relating distances to residue contacts
• If residue pair (i,j) forms an inter-helical contact, then d_ij falls within the contact distance
• If residue pair (i,j) does not form an inter-helical contact, then d_ij falls beyond the contact upper bound
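One common way to express this either/or logic with linear constraints is to let a binary contact variable switch the bounds on d_ij. The sketch below shows that idea under stated assumptions: the two inequalities in the docstring are a standard linearization and may differ from the authors' exact constraints, and the names `distance_window`, `contact_low`, `contact_high`, and `global_high`, as well as the numeric values, are hypothetical.

```python
def distance_window(w_ij, contact_low, contact_high, global_high):
    """Interval allowed for d_ij by two linear constraints in the binary w_ij.

        d_ij >= contact_low  * w_ij + contact_high * (1 - w_ij)
        d_ij <= contact_high * w_ij + global_high  * (1 - w_ij)

    w_ij = 1 (contact):    contact_low  <= d_ij <= contact_high
    w_ij = 0 (no contact): contact_high <= d_ij <= global_high
    """
    low = contact_low * w_ij + contact_high * (1 - w_ij)
    high = contact_high * w_ij + global_high * (1 - w_ij)
    return low, high


# Assumed values in Å: a 4.0-10.0 contact window and a 50.0 overall upper bound.
print(distance_window(1, 4.0, 10.0, 50.0))  # (4.0, 10.0)
print(distance_window(0, 4.0, 10.0, 50.0))  # (10.0, 50.0)
```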

Page 34: Alpha-helical Topology and Tertiary Structure Prediction ...

Constraints

Distance constraints

Satisfies initial bounds

Satisfies triangle inequality constraints

Page 35: Alpha-helical Topology and Tertiary Structure Prediction ...

Constraints

Distance constraints
• Restrict distances based on a right-angle interaction assumption
• If residue pair (i,k) is an interhelical contact, line segment (i,k) is perpendicular to line segment (i,j)
• This relationship is used to bound the distance d_jk
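With a right angle at residue i, the Pythagorean relation d_jk² = d_ij² + d_ik² turns bounds on d_ij and d_ik into bounds on d_jk. The following is a minimal sketch of that arithmetic only; how the relation enters the MILP is not shown on the slide, and the function name `right_angle_bounds` and the example numbers are assumptions.

```python
import math


def right_angle_bounds(ij_bounds, ik_bounds):
    """Bounds on d_jk when segment (i,k) is perpendicular to segment (i,j).

    Each argument is a (lower, upper) pair in Å. Because d_jk grows with
    both legs, the lower bounds combine into the lower bound on d_jk and
    the upper bounds combine into the upper bound.
    """
    (l_ij, u_ij), (l_ik, u_ik) = ij_bounds, ik_bounds
    return math.hypot(l_ij, l_ik), math.hypot(u_ij, u_ik)


# Made-up example: d_ij fixed at 3 Å, d_ik between 4 Å and 10 Å.
low, high = right_angle_bounds((3.0, 3.0), (4.0, 10.0))
print(round(low, 2), round(high, 2))
```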

Page 36: Alpha-helical Topology and Tertiary Structure Prediction ...

Results

[Figure: 1dcjA and 1o2fB]

Page 37: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – 1bm8

Page 38: Alpha-helical Topology and Tertiary Structure Prediction ...

Results - Summary
• Best average contact distance for 11 of 12 proteins in the test set was less than 11.0 Å

Page 39: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – CASP7 T350
• Prediction with the optimal topology is shown

Page 40: Alpha-helical Topology and Tertiary Structure Prediction ...

Outline
• Protein structure prediction overview
• Predicting α-helical contacts
  - Probability development
  - Model
  - Results
• Predicting α-helical contacts in α/β proteins
  - Distance bounding
  - Model
  - Results
• Structure prediction of α-helical proteins
  - Framework
  - Results

Page 41: Alpha-helical Topology and Tertiary Structure Prediction ...

ASTRO-FOLD for α-helical Bundles

Overall 3D Structure Prediction (Global Optimization and Molecular Dynamics)
- Structural data from previous stages
- Prediction via novel solution approach

Derivation of Restraints (Reduced Search Space)
- Dihedral angle restrictions
- Cα distance constraints

Helix Prediction (Free Energy Calculations)
- Detailed atomistic modeling
- Simulations of local interactions

Loop Structure Prediction (Novel Clustering Methodology)
- Dihedral angle sampling
- Discard conformers by clustering

Interhelical Contacts (MILP Optimization Model)
- Maximize common residue pairs
- Rank-order list of topologies

McAllister, Floudas. Proceedings, BIOMAT 2005.

Page 42: Alpha-helical Topology and Tertiary Structure Prediction ...

Derivation of Restraints

Dihedral angle restraints
• For residues with α-helix or β-sheet classification
• For loop residues, using the best identified conformer from loop modeling efforts

Distance restraints
• Helical hydrogen bond network (i,i+4)
• α-helical topology predictions
• β-sheet topology predictions

Klepeis, JL and Floudas, CA. Journal of Global Optimization. (2003)

Page 43: Alpha-helical Topology and Tertiary Structure Prediction ...

Constrained Optimization

Problem definition

Atomistic level force field (ECEPP/3)

Distance constraints

Page 44: Alpha-helical Topology and Tertiary Structure Prediction ...

Tertiary Structure Prediction

Hybrid global optimization approach
• αBB deterministic global optimization
• Conformational Space Annealing (CSA)

Modifications/Enhancements
• Improved initial point selection using a torsion angle dynamics based annealing procedure from CYANA
• Inclusion of a rotamer optimization stage for quick energetic improvements
• Streamlined parallel implementation

Page 45: Alpha-helical Topology and Tertiary Structure Prediction ...

αBB Global Optimization
• Based on a branch-and-bound framework
• Upper bound on the global solution is obtained by solving the full nonconvex problem to local optimality
• Lower bound is determined by solving a valid convex underestimation of the original problem
• Convergence is obtained by successive subdivision of the region at each level in the branch-and-bound tree
• Guaranteed ε-convergence for C² NLPs

Adjiman, CS, et al. Computers and Chemical Engineering. (1998a,b)
Floudas, CA and co-workers, 1995-2007
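The convex underestimation step can be illustrated in one dimension. In the αBB approach, a twice-differentiable f on [x_L, x_U] is underestimated by L(x) = f(x) + α (x_L − x)(x_U − x), which is convex when α ≥ max(0, −½ min f''). The sketch below applies this to sin(x) on [0, 2π], where f'' ≥ −1 and so α = 0.5 suffices. The test function, the grid, and the name `alpha_bb_underestimator` are illustrative choices, not taken from the talk.

```python
import math


def alpha_bb_underestimator(f, alpha, x_low, x_high):
    """Return L(x) = f(x) + alpha * (x_low - x) * (x_high - x).

    The added quadratic is non-positive on [x_low, x_high], so L <= f there,
    and L matches f at both endpoints.
    """
    def under(x):
        return f(x) + alpha * (x_low - x) * (x_high - x)
    return under


x_low, x_high = 0.0, 2.0 * math.pi
under = alpha_bb_underestimator(math.sin, 0.5, x_low, x_high)

# Crude grid search standing in for the convex minimization that would
# supply the lower bound in a branch-and-bound node.
grid = [x_low + (x_high - x_low) * t / 200 for t in range(201)]
node_lower_bound = min(under(x) for x in grid)
true_minimum = min(math.sin(x) for x in grid)
print(node_lower_bound <= true_minimum)  # True
```

Splitting the interval shrinks the quadratic term, so the underestimator moves closer to f on each subregion; that is the mechanism behind the convergence by successive subdivision listed on the slide.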

Page 46: Alpha-helical Topology and Tertiary Structure Prediction ...

Conformational Space Annealing
• Induce variations
  - Mutations
  - Crossovers
• Subject to local energy minimization
• Anneal through the gradual reduction of space

Lee, JH, et al. Journal of Computational Chemistry. (1997)
Scheraga and co-workers, 1997-2007.

Page 47: Alpha-helical Topology and Tertiary Structure Prediction ...

Rotamer Side Chain Optimization
• Side chain packing is crucial to the stability and specificity of the native state
• Rotamer optimization is a quick way to alleviate steric clashes
• Better starting point for constrained nonlinear minimization

Page 48: Alpha-helical Topology and Tertiary Structure Prediction ...

Torsion Angle Dynamics
• Why? Difficult to identify low energy feasible points
• Fast evaluation of steric based force field
• Unconstrained formulation with penalty functions
• Implemented by solving equations of motion as preprocessing for each constrained minimization

Guntert, P, et al. Journal of Molecular Biology. (1997)
Klepeis, JL, et al. Journal of Computational Chemistry. (1999)
Klepeis, JL and Floudas, CA. Computers and Chemical Engineering. (2000)

Page 49: Alpha-helical Topology and Tertiary Structure Prediction ...

Hybrid Global Optimization Algorithm
• All secondary nodes begin performing αBB iterations
• Once the CSA bank is full, CSA takes control of a subset of secondary nodes

Primary processor

αBB control
• Maintains list of lower bounding subregions
• Tracks overall upper and lower bounds
• Defines branching directions
• Sends and receives work to and from αBB work nodes

CSA control
• Maintains CSA bank
• Maintains queue of αBB minima for bank increases
• Handles bank updates
• Sends and receives work to and from CSA work nodes

Idle work
• Performs shear movements and perturbations on CSA structures
• Only executed during idle time of primary processor

Secondary processors

αBB work
• Torsion angle dynamics
• Rotamer optimization
• Minimization of lower bounding function
• Minimization of upper bounding function

CSA work
• Rotamer optimization
• Minimization of CSA trial conformation

McAllister and Floudas. 2007, Submitted for publication.

Page 50: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – Tertiary Structure Prediction

PDB: 1nre

Predicted structures of 1nre (color) versus native 1nre (gray):

Structure                         | Energy   | RMSD
Lowest energy predicted structure | -1395.48 | 6.63
Lowest RMSD predicted structure   | -1340.45 | 3.52

Page 51: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – Tertiary Structure Prediction

PDB: 1hta

Predicted structures of 1hta (color) versus native 1hta (gray):

Structure                         | Energy  | RMSD
Lowest energy predicted structure | -941.02 | 6.70
Lowest RMSD predicted structure   | -915.57 | 2.58

Page 52: Alpha-helical Topology and Tertiary Structure Prediction ...

Results – Blind Tertiary Structure Prediction (Collaboration with Michael Hecht)

S836

Predicted structures of S836 (color) versus native S836 (gray):

Structure                         | Energy   | RMSD
Lowest energy predicted structure | -1740.11 | 2.84
Lowest RMSD predicted structure   | -1697.88 | 2.39

Page 53: Alpha-helical Topology and Tertiary Structure Prediction ...

Conclusions
• Two novel mixed-integer linear programming models were developed for α-helical topology prediction in α-helical proteins
  - PRIMARY and WHEEL contacts
• For all 26 test α-helical proteins, best average contact distance predictions fell well below 11.0 Å
• A novel mixed-integer linear programming model was also developed for α-helical topology prediction in α/β proteins
• For 11 of 12 test α/β proteins, best average contact distance predictions fell below 11.0 Å
• Topology predictions were useful for restraining the tertiary structures during global optimization and for obtaining near-native predictions in a blind study

Page 54: Alpha-helical Topology and Tertiary Structure Prediction ...

Acknowledgements

Funding sources
• National Institutes of Health (R01 GM52032)
• US EPA (GAD R 832721-010)*

*Disclaimer: This work has not been reviewed by and does not represent the opinions of the funding agency.

Page 55: Alpha-helical Topology and Tertiary Structure Prediction ...

Questions