Compute the alanine dipeptide [\(\Phi,\Psi\)] transition using OpenMM and OPES
In this brief tutorial, we calculate the Ramachandran [\(\Phi\),\(\Psi\)] plot of alanine dipeptide in vacuum using the OPES method.
Alanine dipeptide is a popular test system for enhanced sampling algorithms because it is a minimal example of sampling challenges that are posed by many biological systems. The slow motions of the molecule are largely governed by the \(\Psi\) (backbone N-C-C-N) and \(\Phi\) (backbone C-N-C-C) dihedrals.
An interactive visualization of the alanine dipeptide molecule can be shown with nglview. Atoms involved in the [\(\Phi\),\(\Psi\)] collective variable (CV) are shown in ball and stick representation, while other atoms are transparent.
[2]:
import pytraj as pt
import nglview as ngl
traj = pt.load("../data/alanine-dipeptide.pdb", top='../data/alanine-dipeptide.prmtop')
view = ngl.show_pytraj(traj)
view.clear_representations()
view.add_ball_and_stick('@6,8,14,16', opacity=1.0) # CV atoms of phi torsion
view.add_ball_and_stick('@4,6,8,14', opacity=1.0) # CV atoms of psi torsion
#view.add_ball_and_stick('@1,4,6,8', opacity=1.0) # CV atoms of theta torsion
view.add_licorice(opacity=0.5)
# uncomment for interactive visualization in a notebook
#view
While the interactive visualization is not possible on the web page, this is what you should see in a notebook:
Importance sampling of the [\(\Phi,\Psi\)] plane with OPES
Below, the OPES sampling algorithm is applied to enhance sampling in the [\(\Phi,\Psi\)] plane. For this purpose, an OpenMM simulation object is created using AMBER parameter and coordinate files and the OpenMM interface to the adaptive sampling package.
[3]:
from sys import stdout
import numpy as np
from openmm import *
from openmm.app import *
from openmm.unit import *
from adaptive_sampling.interface.interface_openmm import AdaptiveSamplingOpenMM
# ------------------------------------------------------------------------------------
# define collective variables
cv_atoms_psi = [6, 8, 14, 16] # backbone N-C-C-N torsion
cv_atoms_phi = [4, 6, 8, 14] # backbone C-N-C-C torsion
minimum = -180.0 # minimum of the CV
maximum = 180.0 # maximum of the CV
bin_width = 5.0 # bin width along the CV
periodicity = [ # define periodicity of CVs (needs to be given in radians as units are not converted)
[-np.pi, np.pi],
[-np.pi, np.pi],
]
collective_var = [
["torsion", cv_atoms_psi, minimum, maximum, bin_width],
["torsion", cv_atoms_phi, minimum, maximum, bin_width],
]
# ------------------------------------------------------------------------------------
# Setup OpenMM
prmtop = AmberPrmtopFile(f"../data/alanine-dipeptide.prmtop")
crd = AmberInpcrdFile(f"../data/alanine-dipeptide.crd")
system = prmtop.createSystem(
nonbondedMethod=NoCutoff,
constraints=HBonds,
)
# Initialize the `AdaptiveSamplingOpenMM` interface to couple the OpenMM simulation to a bias potential
# The OpenMM `simulation` object is set up internally, but can still be modified by calling `the_md.simulation` or `the_md.integrator`
the_md = AdaptiveSamplingOpenMM(
crd.positions,
prmtop.topology,
system,
dt=2.0, # timestep in fs
equil_temp=300.0, # temperature of simulation
langevin_damping=1.0, # langevin damping in 1/ps
cv_atoms=np.unique(cv_atoms_phi+cv_atoms_psi) # specifying CV atoms significantly speeds up simulation of large systems, as the bias force will only be calculated for those
)
the_md.integrator.setConstraintTolerance(0.00001)
# Append OpenMM reporters to simulation for output
the_md.simulation.reporters.append(DCDReporter('alanine-dipeptide.dcd', 100))
the_md.simulation.reporters.append(StateDataReporter(
stdout,
100,
step=True,
time=True,
potentialEnergy=True,
kineticEnergy=True,
totalEnergy=True,
temperature=True,
speed=False,
separator='\t')
)
OpenMM then is provided with an instance of the OPES
class as the bias. Although the OPES implementation let the user toggle between different variants of various features, the defaults for the corresponding settings have turned out to be sufficient for most cases. This minimizes the user input to a small set of parameters. Below is an example of an OPES instance that relies on this implementation’s defaults for efficient computation but provides the possibility to take a deeper look into
OPES’ behaviour for the parameters, that can directly affect the simulations quality.
Compare the results when providing OPES with a small initial
opes_kernel_std
in comparison to automatically estimating it to fit the starting basin.Compare the effects of different barriere heights
opes_barrier
and how limiting the OPES bias to smaller barriers than the system’s one.
[4]:
from adaptive_sampling.sampling_tools.opes import OPES
# --------------------------------------------------------------------------------------
# Setup the sampling algorithm
opes_kernel_std = None # kernel standard deviation
opes_frequency = 500 # frequency of kernel creation in MD steps
opes_barrier = 50.0 # Barrier parameter in kJ/mol
the_bias = OPES(
the_md,
collective_var, # collective variable
# OPES parameters
kernel_std = opes_kernel_std,
update_freq = opes_frequency,
energy_barr = opes_barrier,
# general parameters
output_freq = 1000, # frequency of writing outputs
f_conf = 0.0, # confinement force of CV at boundaries
equil_temp = 300.0, # equilibrium temperature of simulation
periodicity = periodicity, # periodicity of CVs
verbose = True, # print verbose output
)
the_md.set_sampling_algorithm(the_bias) # to take effect the sampling algorithm has to be set in the MD interface
>>> INFO: Initialize torsion as collective variable:
Minimum0: -180.0 Degree
Maximum0: 180.0 Degree
Bin width0: 5.0 Degree
>>> INFO: Initialize torsion as collective variable:
Minimum1: -180.0 Degree
Maximum1: 180.0 Degree
Bin width1: 5.0 Degree
----------------------------------------------
Total number of bins: 5184
>>> INFO: OPES Parameters:
---------------------------------------------
Kernel_std: estimate from 5000 steps
Rescaling: True
Adaptive: False (5000 steps)
Normalize: True (approximated: True)
Explore: False
Barrier: 11.9503 kcal/mol
Bias factor: 20.045403709021212
Read force: True
Kernel merge: True (threshold: 1.0)
---------------------------------------------
>>> adaptive-sampling: Module 'ase' not found, will not import 'FENEB'
/home/robert/Bachelor_Thesis/Code/adaptive_sampling/adaptive_sampling/colvars/colvars.py:215: UserWarning: Using torch.cross without specifying the dim arg is deprecated.
Please either pass the dim explicitly or simply use torch.linalg.cross.
The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at /opt/conda/conda-bld/pytorch_1720538438750/work/aten/src/ATen/native/Cross.cpp:62.)
self.cv = torch.atan2(torch.dot(torch.cross(q23_u, n1), n2), torch.dot(n1, n2))
[8]:
# WARNING: For long simulations, this can become expensive, and it is recommended to perform the computation on an HPC cluster.
if os.path.isfile('CV_traj.dat'):
os.system("rm CV_traj.dat opeseabf.out")
the_md.run(nsteps=1000) # 500000 * 2 fs = 1 ns
1100 2.1999999999999793 -77.484027451166 70.72776178943762 -6.756265661728378 333.5919402869253
1200 2.3999999999999573 -63.594587427905225 51.409461789394186 -12.185125638511039 242.4759623878226
1300 2.5999999999999353 -70.42487310290143 53.21313727800854 -17.21173582489289 250.9831113583515
1400 2.7999999999999132 -52.20514107819585 50.758430490201945 -1.4467105879939055 239.40533228741566
1500 2.999999999999891 -52.521335364644656 55.41707830529658 2.895742940651921 261.3781379358894
1600 3.199999999999869 -49.08489702289795 62.4755462465003 13.390649223602352 294.66984626067983
1700 3.399999999999847 -46.31697749196262 59.84464743883469 13.527669946872074 282.26104643805064
1800 3.599999999999825 -55.343389270210565 71.54269934628653 16.19931007607596 337.43564456829716
1900 3.799999999999803 -59.52420207082332 75.13378924961816 15.60958717879484 354.37324613080114
2000 3.999999999999781 -60.86357611569707 48.45052394606737 -12.413052169629701 228.51994580578491
Analysis of Results
Visualize the trajectory with NGlView
The following will create an interactive visualization of the trajectory:
[10]:
path = '.'
traj = pt.iterload(f"{path}/alanine-dipeptide.dcd", top="../data/alanine-dipeptide.pdb")
view = ngl.show_pytraj(traj)
view.clear_representations()
view.add_ball_and_stick('@6,8,14,16', opacity=1.0) # CV atoms of phi torsion
view.add_ball_and_stick('@4,6,8,14', opacity=1.0) # CV atoms of psi torsion
#view.add_ball_and_stick('@1,4,6,8', opacity=1.0) # CV atoms of theta torsion
view.add_licorice(opacity=0.5)
#view
Trajectory of CVs
All important information for post/processing of the trajectory is stored in CV_traj.dat
.
[11]:
cv_traj = np.loadtxt(f'{path}/CV_traj.dat', skiprows=1)
cv_phi = cv_traj[:,1] # Phi trajectory (first CV)
cv_psi = cv_traj[:,2] # Psi trajectory (second CV))
[ ]: