Enhanced sampling techniques and their limitations

Travis Hughes, Associate Professor, Department of Biomedical and Pharmaceutical Sciences, University of Montana


Proteins undergo thermal motion; distinct conformations (the conformational ensemble) arise from thermal energy.

Nature 450:964-972 2007

Boltzmann distribution

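$$\frac{N_i}{N} = \frac{e^{-\varepsilon_i / k_B T}}{\sum_j e^{-\varepsilon_j / k_B T}}$$

• Ni = number of molecules in conformation i
• N = total number of molecules
• εi = energy of conformation i
• kB = Boltzmann constant; T = absolute temperature
• The bottom part (denominator) is the partition function (sum over all conformations).
• Ni/N = fraction of molecules in conformation i, which has energy εi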
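As a rough illustration of the Boltzmann distribution, the sketch below turns a list of conformation energies into population fractions. The three energies and the temperature of 300 K are hypothetical values chosen for the example, not data from the slides.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def boltzmann_fractions(energies, temperature=300.0):
    """Return N_i/N for each conformation energy (in kcal/mol)."""
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)  # shifting all energies leaves the fractions unchanged
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    partition = sum(weights)  # partition function (sum over all conformations)
    return [w / partition for w in weights]

energies = [0.0, 1.0, 3.0]  # hypothetical conformation energies, kcal/mol
fractions = boltzmann_fractions(energies)
for e, f in zip(energies, fractions):
    print(f"energy {e:.1f} kcal/mol -> fraction {f:.3f}")
```

Lower energy conformations are always more populated, but higher energy ones are never empty at finite temperature.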

http://www.nature.com/articles/s41467-019-13768-0

Why enhanced sampling?

• The best computers/clusters can simulate ~1 μs/day of a moderate-size protein in explicit solvent.

Why enhanced sampling?

• You could get a decent idea of the structural ensemble of small, non-dynamic proteins (20,000-50,000 atoms) with a 1 ms simulation (see https://www.pnas.org/content/109/49/20006).
• This will take about three years on the best GPUs (1 μs/day).
• If you have ANTON 2 (https://www.psc.edu/anton2/), it will take you about three weeks (50 μs/day). See also https://www.psc.edu/resources/anton/anton-2-covid/
• For larger or more dynamic proteins you need much more sampling (seconds?), and almost no one can access ANTON, so what do you do? Answer: enhanced sampling.
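The wall-clock estimates above follow from simple division. A minimal sketch, using only the throughput figures quoted on the slide (1 μs/day and 50 μs/day) and a 1 ms target:

```python
def days_needed(target_us, rate_us_per_day):
    """Wall-clock days to reach a target simulation length."""
    return target_us / rate_us_per_day

TARGET_US = 1000.0  # 1 ms expressed in microseconds

for label, rate in [("GPU", 1.0), ("ANTON 2", 50.0)]:
    days = days_needed(TARGET_US, rate)
    print(f"{label}: {days:.0f} days (about {days / 365:.1f} years)")
```

The GPU case comes out at 1000 days, a little under three years, and the ANTON 2 case at 20 days, or roughly three weeks.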

Metadynamics: Sampling free energy along a defined reaction coordinate using collective variables

• Adaptive biasing force and umbrella sampling are similar in spirit to metadynamics.
• Free energy wells are filled with computational sand (https://people.sissa.it/~laio/Research/Res_metadynamics.php):
• Gaussian potentials are dropped where sampling has already been carried out, to discourage the simulation from going back to the same place and to help it move out of free energy wells.
• The size and shape of the added potentials can be adjusted to sample the free energy along the reaction coordinate more coarsely or finely.
• Collective variables must have different values in the low energy states and the transition states.
• An example of a collective variable would be the distance between the centers of mass of two groups of atoms.
• The hard part is figuring out the collective variable that distinguishes the important states/transition states.
• For more information see http://www.nature.com/articles/s42254-020-0153-0 and https://www.ks.uiuc.edu/Research/namd/2.9/ug/node53.html
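The idea of filling a well with Gaussian potentials can be sketched on a toy problem. The example below is not a production metadynamics code: it assumes a one-dimensional double-well potential, overdamped Langevin dynamics, energies in units of kT, and hypothetical hill parameters, with the coordinate x itself playing the role of the collective variable.

```python
import math
import random

BARRIER = 12.0  # barrier height in units of kT (hypothetical)

def well_force(x):
    """Force from the double well U(x) = BARRIER * (x^2 - 1)^2."""
    return -4.0 * BARRIER * x * (x * x - 1.0)

def bias_force(x, centers, height, width):
    """Force from the deposited Gaussian hills; it pushes x away from them."""
    total = 0.0
    for c in centers:
        d = x - c
        total += height * d / width**2 * math.exp(-d * d / (2.0 * width**2))
    return total

def simulate(n_steps, deposit_every=0, height=0.5, width=0.1,
             dt=1e-3, kT=1.0, seed=0):
    """Overdamped Langevin dynamics; returns (largest x visited, hill count)."""
    rng = random.Random(seed)
    noise = math.sqrt(2.0 * kT * dt)
    x, max_x, centers = -1.0, -1.0, []  # start in the left well
    for step in range(1, n_steps + 1):
        force = well_force(x) + bias_force(x, centers, height, width)
        x += force * dt + noise * rng.gauss(0.0, 1.0)
        max_x = max(max_x, x)
        if deposit_every and step % deposit_every == 0:
            centers.append(x)  # drop a hill where the system currently is
    return max_x, len(centers)

plain_max, _ = simulate(10_000)                         # no bias
meta_max, n_hills = simulate(10_000, deposit_every=25)  # with hills
print(f"unbiased run: largest x visited = {plain_max:.2f}")
print(f"biased run:   largest x visited = {meta_max:.2f} using {n_hills} hills")
```

With these settings the unbiased run is expected to stay in the left well (x < 0), while the biased run fills that well and reaches the right one. Larger or wider hills fill the well faster but resolve the free energy more coarsely.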

Metadynamics limitations

• It can be difficult to determine which parts of the molecule (or other aspects of the molecule) are important to include in the collective variable.
• You need to know what you are looking for, so the method is best suited to testing specific hypotheses.
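The center-of-mass distance mentioned earlier as an example collective variable is simple to compute once the two atom groups are chosen; choosing them is the hard part. A minimal sketch with made-up coordinates and masses (the function names are illustrative, not from any MD package):

```python
import math

def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * xyz[k] for xyz, m in zip(coords, masses)) / total
        for k in range(3)
    )

def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Collective variable: distance between two groups' centers of mass."""
    return math.dist(center_of_mass(coords_a, masses_a),
                     center_of_mass(coords_b, masses_b))

# Hypothetical groups: two equal-mass atoms, and a single heavier atom.
group_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
group_b = [(1.0, 3.0, 4.0)]
cv = com_distance(group_a, [1.0, 1.0], group_b, [12.0])
print(f"collective variable value: {cv:.2f}")
```

A useful collective variable would take clearly different values in each low energy state and in the transition state between them.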

Enhanced Sampling Methods (that do not use a reaction coordinate):

• Accelerated MD
• Flattens the potential energy landscape to allow quicker movement around it (kinetic barriers are overcome quickly). https://pubs.acs.org/doi/10.1021/jp062845o

• Gaussian accelerated molecular dynamics (GaMD) adds a harmonic boost potential to the native potential surface, chosen so that the boost values follow a near-Gaussian distribution. This may allow more accurate recovery of the non-boosted (“real”) potential energy surface. http://pubs.acs.org/doi/10.1021/acs.jctc.5b00436

• There are many other techniques (e.g., Self-Guided Langevin dynamics, Nudged elastic band calculations)

https://www.ks.uiuc.edu/Research/namd/2.9/ug/node63.html

Accelerated MD: the peaks of the potential energy landscape are not changed.

Important: the boost potential added depends on the depth of the energy well.
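Both statements can be checked against the boost function commonly used for accelerated MD in the literature, ΔV = (E − V)² / (α + E − V) for V < E and zero otherwise. The slides do not give this formula, so treat it as an assumption here; the threshold E and the parameter α below are hypothetical values.

```python
def amd_boost(v, threshold, alpha):
    """Boost potential added at potential energy v (zero above the threshold)."""
    if v >= threshold:
        return 0.0  # peaks above the threshold are left unchanged
    gap = threshold - v
    return gap * gap / (alpha + gap)

THRESHOLD = 0.0  # hypothetical threshold energy E
ALPHA = 4.0      # hypothetical tuning parameter; smaller means flatter

for v in (5.0, -4.0, -10.0):
    dv = amd_boost(v, THRESHOLD, ALPHA)
    print(f"V = {v:6.1f}  boost = {dv:5.2f}  modified V = {v + dv:6.2f}")
```

The deeper the well, the larger the boost, but the deepest well remains the lowest point of the modified surface, so the shape of the landscape is flattened rather than inverted.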

Limitations of Enhanced sampling

• Many techniques lose kinetic information (i.e., you get the structures, but not good information on how fast the molecule moves between those structures).
• There are techniques to estimate kinetics, but large assumptions are made and some people are not convinced they are reliable. See:
• https://pubs.acs.org/doi/10.1021/ct1005399
• https://pubs.acs.org/doi/10.1021/acs.jpcb.6b02654
• You can recover the native potential energy surface by “reweighting” the simulation. Reweighting is not trivial and often involves many assumptions.

Reweighting accelerated MD

• The potential energy surface can be estimated from the time (fraction of total frames) that the molecule spends in each conformation in a conventional MD simulation, using the Boltzmann relation.
• The same approach can be used to construct the potential energy surface from accelerated MD, but each frame needs to be reweighted based on the boost potential applied at that point on the surface.

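Boltzmann relation:

$$\frac{N_i}{N} = \frac{e^{-\varepsilon_i / k_B T}}{\sum_j e^{-\varepsilon_j / k_B T}}$$

Here Ni is the number of molecules in conformation i, N is the total number of molecules, εi is the energy of conformation i, kB is the Boltzmann constant, T is the absolute temperature, and the denominator is the partition function (sum over all conformations). Ni/N is the fraction of molecules (or frames) in conformation i.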
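A minimal sketch of this reweighting, assuming the simplest (exponential) scheme: each frame is counted with weight exp(ΔV/kT), where ΔV is the boost applied to that frame, and the energy of each bin is then −kT ln(weighted fraction). The bin assignments and boost values are made up for illustration.

```python
import math

KT = 0.596  # kT in kcal/mol near 300 K

def energy_profile(bin_indices, boosts, n_bins, kT=KT):
    """Relative energy of each bin from frame counts weighted by exp(boost/kT)."""
    weights = [0.0] * n_bins
    for b, dv in zip(bin_indices, boosts):
        weights[b] += math.exp(dv / kT)
    total = sum(weights)
    profile = [-kT * math.log(w / total) if w > 0 else float("inf")
               for w in weights]
    lowest = min(profile)
    return [f - lowest for f in profile]  # lowest bin set to zero

# Conventional MD: no boost, 3 frames in bin 0 and 1 frame in bin 1.
conventional = energy_profile([0, 0, 0, 1], [0.0, 0.0, 0.0, 0.0], n_bins=2)

# Accelerated MD: one frame per bin, but bin 0 was boosted by kT*ln(3).
accelerated = energy_profile([0, 1], [KT * math.log(3.0), 0.0], n_bins=2)

print("conventional:", [round(f, 3) for f in conventional])
print("reweighted:  ", [round(f, 3) for f in accelerated])
```

Both runs give the same profile, which is the point of reweighting. In practice the exponential weights are dominated by a few frames with large boosts, which is one reason reweighting is described as non-trivial.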

Replica Exchange: enhanced sampling without the need to reweight

• Carry out many simultaneous simulations (an ensemble of simulations) that differ in ways that make some of them explore more conformational space. These differences can include:
• Temperature
• Boosting in accelerated MD
• Altered force field (for example, decreased/scaled dihedral barrier height, e.g., V/8)

Replica exchange involves an ensemble of simulations

• One simulation in the ensemble is “native”, meaning it is at physiologic temperature, has no boosting, or uses the original force field.
• Another simulation has the highest temperature, the most boosting, or the largest modification to the force field.
• Intermediate replicas have modifications/temperatures somewhere between these two extremes (native and highest).
• Flow of structures between replicas is necessary to get enhanced sampling.

Figure labels: native replica, dihedral force constants scaled by a factor of 1; intermediate replica, dihedral force constants scaled by a factor of 0.5; highest replica, all dihedral force constants set to 0.

http://dx.doi.org/10.1021/ct400862k

Overlap between neighboring ensemble members is necessary to allow flow of structures between replicas when using the Metropolis criterion.

Hamiltonian Replica Exchange

• Altering the temperature or the force field constants alters the Hamiltonian.
• The Hamiltonian is the total energy of the system (kinetic + potential).
• Kinetic energy ∝ temperature; potential energy results from the atomic coordinates (conformation) in the context of the force field.
• Hamiltonian replica exchange refers to altering/scaling the force field constants.
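For two replicas at the same temperature whose force fields differ only in a dihedral scaling factor, the standard exchange test reduces to a short expression. The sketch below assumes each replica's potential is U = U_rest + scale × U_dihedral, so only the dihedral energies of the two structures enter; the function name and the numerical values are hypothetical.

```python
import math

KT = 0.596  # kT in kcal/mol near 300 K

def hrex_swap_probability(dih_i, dih_j, scale_i, scale_j, kT=KT):
    """Probability of swapping structures between replicas i and j.

    dih_i, dih_j: unscaled dihedral energies of the structures currently
    held by replicas i and j; scale_i, scale_j: their scaling factors.
    """
    delta = (scale_i - scale_j) * (dih_j - dih_i) / kT
    return 1.0 if delta <= 0.0 else math.exp(-delta)

# Native replica (scale 1.0) and a replica with dihedrals scaled by 0.5.
downhill = hrex_swap_probability(10.0, 8.0, 1.0, 0.5)  # scaled replica found a better structure
uphill = hrex_swap_probability(8.0, 10.0, 1.0, 0.5)
print(f"downhill swap: {downhill:.3f}, uphill swap: {uphill:.3f}")
```

A structure with lower dihedral energy found by the scaled replica is always passed to the native replica, while the reverse move is accepted only some of the time.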

Why do we need to use the Metropolis criterion?

• Use of the Metropolis criterion is necessary so that we can consider all replicas as a single system obeying equilibrium thermodynamics, allowing us to use the “native” replica to form the potential energy surface without reweighting.

http://www.mmtsb.org/workshops/Lectures/REMD.pdf

Metropolis criterion

• Coordinates (the molecular structure) are “moved” between replicas.
• If the energy of the molecule is reasonable (according to the Metropolis criterion) in the next replica down or up, then it “moves” there.
• This allows low energy conformations from the highest replicas to filter down to the native replica, enhancing sampling.
• “Move” is in quotes above because in reality the structure stays where it is: the temperature, dihedral constant scaling, etc. of its simulation is adjusted up or down, by scaling the momenta (velocities) of the atoms or by adjusting the dihedral scaling.
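For temperature replica exchange, the Metropolis test for swapping two neighboring replicas is usually written as min(1, exp[(βi − βj)(Ei − Ej)]), with β = 1/(kB T) and E the potential energy of each replica's current structure. The slides do not spell this out, so the sketch below is an assumption based on that standard form; the energies and temperatures are hypothetical.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def swap_probability(e_i, t_i, e_j, t_j):
    """Metropolis probability of exchanging structures between replicas i and j."""
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    exponent = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

def attempt_swap(e_i, t_i, e_j, t_j, rng=random):
    """Return True if the exchange is accepted."""
    return rng.random() < swap_probability(e_i, t_i, e_j, t_j)

# Hot replica (320 K) holds the lower energy structure: always accepted.
p_down = swap_probability(-100.0, 300.0, -105.0, 320.0)
# Cold replica (300 K) holds the lower energy structure: sometimes accepted.
p_up = swap_probability(-105.0, 300.0, -100.0, 320.0)
print(f"low energy structure moving down: {p_down:.3f}, moving up: {p_up:.3f}")
```

The acceptance probability falls off exponentially with the energy gap and the temperature gap, which is why neighboring replicas need overlapping energies for structures to flow between them.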

Ergodicity and Convergence

• Molecular dynamics simulations are considered ergodic systems.
• The average behavior observed for many molecules in a test tube is expected to be well represented by the behavior of any one molecule viewed over a long enough time period.
• Convergence: are we getting complete and repeated sampling of all the accessible states of the molecule?
• Accessible states depend on the temperature and other conditions used in the simulation (Boltzmann distribution).
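The ergodic assumption can be illustrated with a toy two-state molecule. In this sketch the per-step switching probabilities are hypothetical; it compares the fraction of time one long trajectory spends in state 1 with the fraction of many independent molecules found in state 1 after a short run.

```python
import random

P_01 = 0.1  # hypothetical per-step probability of switching 0 -> 1
P_10 = 0.3  # hypothetical per-step probability of switching 1 -> 0

def step(state, rng):
    """Advance the two-state molecule by one step."""
    p_switch = P_01 if state == 0 else P_10
    return 1 - state if rng.random() < p_switch else state

def time_average(n_steps, seed=1):
    """Fraction of time a single long trajectory spends in state 1."""
    rng = random.Random(seed)
    state, count = 0, 0
    for _ in range(n_steps):
        state = step(state, rng)
        count += state
    return count / n_steps

def ensemble_average(n_molecules, n_steps, seed=2):
    """Fraction of many independent molecules in state 1 after n_steps."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_molecules):
        state = 0
        for _ in range(n_steps):
            state = step(state, rng)
        count += state
    return count / n_molecules

expected = P_01 / (P_01 + P_10)  # equilibrium population of state 1
one_molecule = time_average(200_000)
many_molecules = ensemble_average(2_000, 50)
print(f"expected {expected:.3f}, one long trajectory {one_molecule:.3f}, "
      f"many molecules {many_molecules:.3f}")
```

If the trajectory were too short to visit both states repeatedly, the time average would not match the ensemble value; that mismatch is what a lack of convergence looks like.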

https://www.youtube.com/watch?v=C3ptGf22Xiw

Why run multiple replicas starting from different initial states?

• Reproducibility: if we start from different initial states (different crystal structures, different initial atomic velocities, or using temperature and dihedral constant scaling replica exchange), do we get similar answers?

Avoiding False Positive Conclusions in Molecular Simulation: The Importance of Replicas. Bernhard Knapp, Luis Ospina, and Charlotte M. Deane. Journal of Chemical Theory and Computation 2018, 14 (12), 6127-6138. DOI: 10.1021/acs.jctc.8b00391

http://dx.doi.org/10.1021/ct400862k

Multidimensional replica exchange.

• Allows increased sampling from high temperatures and scaled dihedrals (for instance).
• Accelerated MD and temperature have also been used together in multidimensional replica exchange.
• See http://dx.doi.org/10.1021/jp4125099

Limitations of Replica Exchange

• You need large computational resources.
• 10-300 GPUs used simultaneously in published replica exchange