Compute the alanine dipeptide [\(\Phi,\Psi\)] transition using OpenMM and OPES

In this brief tutorial, we calculate the Ramachandran [\(\Phi\),\(\Psi\)] plot of alanine dipeptide in vacuum using the OPES method.

Alanine dipeptide is a popular test system for enhanced sampling algorithms because it is a minimal example of sampling challenges that are posed by many biological systems. The slow motions of the molecule are largely governed by the \(\Psi\) (backbone N-C-C-N) and \(\Phi\) (backbone C-N-C-C) dihedrals.

An interactive visualization of the alanine dipeptide molecule can be shown with nglview. Atoms involved in the [\(\Phi\),\(\Psi\)] collective variable (CV) are shown in ball and stick representation, while other atoms are transparent.

[2]:
import pytraj as pt
import nglview as ngl

traj = pt.load("../data/alanine-dipeptide.pdb", top='../data/alanine-dipeptide.prmtop')

view = ngl.show_pytraj(traj)
view.clear_representations()
view.add_ball_and_stick('@6,8,14,16', opacity=1.0) # CV atoms of phi torsion
view.add_ball_and_stick('@4,6,8,14', opacity=1.0)  # CV atoms of psi torsion
#view.add_ball_and_stick('@1,4,6,8', opacity=1.0)   # CV atoms of theta torsion

view.add_licorice(opacity=0.5)

# uncomment for interactive visualization in a notebook
#view

While the interactive visualization is not possible on the web page, this is what you should see in a notebook:

image1

Importance sampling of the [\(\Phi,\Psi\)] plane with OPES

Below, the OPES sampling algorithm is applied to enhance sampling in the [\(\Phi,\Psi\)] plane. For this purpose, an OpenMM simulation object is created using AMBER parameter and coordinate files and the OpenMM interface to the adaptive sampling package.

[3]:
from sys import stdout
import numpy as np

from openmm import *
from openmm.app import *
from openmm.unit import *

from adaptive_sampling.interface.interface_openmm import AdaptiveSamplingOpenMM

# ------------------------------------------------------------------------------------
# define collective variables
cv_atoms_psi    = [6, 8, 14, 16]  # backbone N-C-C-N torsion
cv_atoms_phi    = [4, 6, 8, 14]   # backbone C-N-C-C torsion
minimum         = -180.0          # minimum of the CV
maximum         = 180.0           # maximum of the CV
bin_width       = 5.0             # bin width along the CV
periodicity     = [               # define periodicity of CVs (needs to be given in radians as units are not converted)
    [-np.pi, np.pi],
    [-np.pi, np.pi],
]

collective_var = [
    ["torsion", cv_atoms_psi, minimum, maximum, bin_width],
    ["torsion", cv_atoms_phi, minimum, maximum, bin_width],
]

# ------------------------------------------------------------------------------------
# Setup OpenMM
prmtop = AmberPrmtopFile(f"../data/alanine-dipeptide.prmtop")
crd = AmberInpcrdFile(f"../data/alanine-dipeptide.crd")
system = prmtop.createSystem(
    nonbondedMethod=NoCutoff,
    constraints=HBonds,
)

# Initialize the `AdaptiveSamplingOpenMM` interface to couple the OpenMM simulation to a bias potential
# The OpenMM `simulation` object is set up internally, but can still be modified by calling `the_md.simulation` or `the_md.integrator`
the_md = AdaptiveSamplingOpenMM(
    crd.positions,
    prmtop.topology,
    system,
    dt=2.0,                                       # timestep in fs
    equil_temp=300.0,                             # temperature of simulation
    langevin_damping=1.0,                         # langevin damping in 1/ps
    cv_atoms=np.unique(cv_atoms_phi+cv_atoms_psi) # specifying CV atoms significantly speeds up simulation of large systems, as the bias force will only be calculated for those
)
the_md.integrator.setConstraintTolerance(0.00001)

# Append OpenMM reporters to simulation for output
the_md.simulation.reporters.append(DCDReporter('alanine-dipeptide.dcd', 100))
the_md.simulation.reporters.append(StateDataReporter(
    stdout,
    100,
    step=True,
    time=True,
    potentialEnergy=True,
    kineticEnergy=True,
    totalEnergy=True,
    temperature=True,
    speed=False,
    separator='\t')
)

OpenMM then is provided with an instance of the OPES class as the bias. Although the OPES implementation let the user toggle between different variants of various features, the defaults for the corresponding settings have turned out to be sufficient for most cases. This minimizes the user input to a small set of parameters. Below is an example of an OPES instance that relies on this implementation’s defaults for efficient computation but provides the possibility to take a deeper look into OPES’ behaviour for the parameters, that can directly affect the simulations quality.

  • Compare the results when providing OPES with a small initial opes_kernel_std in comparison to automatically estimating it to fit the starting basin.

  • Compare the effects of different barriere heights opes_barrier and how limiting the OPES bias to smaller barriers than the system’s one.

[4]:
from adaptive_sampling.sampling_tools.opes import OPES
# --------------------------------------------------------------------------------------
# Setup the sampling algorithm

opes_kernel_std   = None          # kernel standard deviation
opes_frequency    = 500           # frequency of kernel creation in MD steps
opes_barrier      = 50.0          # Barrier parameter in kJ/mol

the_bias = OPES(
    the_md,
    collective_var,               # collective variable
    # OPES parameters
    kernel_std   = opes_kernel_std,
    update_freq  = opes_frequency,
    energy_barr  = opes_barrier,

    # general parameters
    output_freq  = 1000,             # frequency of writing outputs
    f_conf       = 0.0,               # confinement force of CV at boundaries
    equil_temp   = 300.0,             # equilibrium temperature of simulation
    periodicity  = periodicity,      # periodicity of CVs
    verbose      = True,                 # print verbose output
)
the_md.set_sampling_algorithm(the_bias) # to take effect the sampling algorithm has to be set in the MD interface

 >>> INFO: Initialize torsion as collective variable:
         Minimum0:      -180.0 Degree
         Maximum0:      180.0 Degree
         Bin width0:    5.0 Degree

 >>> INFO: Initialize torsion as collective variable:
         Minimum1:      -180.0 Degree
         Maximum1:      180.0 Degree
         Bin width1:    5.0 Degree
        ----------------------------------------------
         Total number of bins:          5184

 >>> INFO: OPES Parameters:
         ---------------------------------------------
         Kernel_std:    estimate from 5000 steps
         Rescaling:     True
         Adaptive:      False   (5000 steps)
         Normalize:     True    (approximated: True)
         Explore:       False
         Barrier:       11.9503 kcal/mol
         Bias factor:   20.045403709021212
         Read force:    True
         Kernel merge:  True    (threshold: 1.0)
         ---------------------------------------------
>>> adaptive-sampling: Module 'ase' not found, will not import 'FENEB'
/home/robert/Bachelor_Thesis/Code/adaptive_sampling/adaptive_sampling/colvars/colvars.py:215: UserWarning: Using torch.cross without specifying the dim arg is deprecated.
Please either pass the dim explicitly or simply use torch.linalg.cross.
The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at /opt/conda/conda-bld/pytorch_1720538438750/work/aten/src/ATen/native/Cross.cpp:62.)
  self.cv = torch.atan2(torch.dot(torch.cross(q23_u, n1), n2), torch.dot(n1, n2))
[8]:
# WARNING: For long simulations, this can become expensive, and it is recommended to perform the computation on an HPC cluster.
if os.path.isfile('CV_traj.dat'):
    os.system("rm CV_traj.dat opeseabf.out")
the_md.run(nsteps=1000) # 500000 * 2 fs = 1 ns
1100    2.1999999999999793      -77.484027451166        70.72776178943762       -6.756265661728378      333.5919402869253
1200    2.3999999999999573      -63.594587427905225     51.409461789394186      -12.185125638511039     242.4759623878226
1300    2.5999999999999353      -70.42487310290143      53.21313727800854       -17.21173582489289      250.9831113583515
1400    2.7999999999999132      -52.20514107819585      50.758430490201945      -1.4467105879939055     239.40533228741566
1500    2.999999999999891       -52.521335364644656     55.41707830529658       2.895742940651921       261.3781379358894
1600    3.199999999999869       -49.08489702289795      62.4755462465003        13.390649223602352      294.66984626067983
1700    3.399999999999847       -46.31697749196262      59.84464743883469       13.527669946872074      282.26104643805064
1800    3.599999999999825       -55.343389270210565     71.54269934628653       16.19931007607596       337.43564456829716
1900    3.799999999999803       -59.52420207082332      75.13378924961816       15.60958717879484       354.37324613080114
2000    3.999999999999781       -60.86357611569707      48.45052394606737       -12.413052169629701     228.51994580578491

Analysis of Results

Visualize the trajectory with NGlView

The following will create an interactive visualization of the trajectory:

[10]:
path = '.'
traj = pt.iterload(f"{path}/alanine-dipeptide.dcd", top="../data/alanine-dipeptide.pdb")

view = ngl.show_pytraj(traj)
view.clear_representations()
view.add_ball_and_stick('@6,8,14,16', opacity=1.0) # CV atoms of phi torsion
view.add_ball_and_stick('@4,6,8,14', opacity=1.0)  # CV atoms of psi torsion
#view.add_ball_and_stick('@1,4,6,8', opacity=1.0)   # CV atoms of theta torsion

view.add_licorice(opacity=0.5)
#view

Trajectory of CVs

All important information for post/processing of the trajectory is stored in CV_traj.dat.

[11]:
cv_traj = np.loadtxt(f'{path}/CV_traj.dat', skiprows=1)
cv_phi = cv_traj[:,1] # Phi trajectory (first CV)
cv_psi = cv_traj[:,2] # Psi trajectory (second CV))
[ ]: