OpenMM

DEPRECATION WARNING

Use of pymatgen.io.openmm is no longer recommended. It has been superceded by workflows in atomate2. Once the tutorials are complete, someone ought to update this page :) Points of contact for MD questions: 1. Orion (leaving August 2024) 2. Nessa 3. Rohith

OpenMM

Our group has a robust infrastructure for setting up and running molecular MD simulations. The core of this infrastructure is in the Pymatgen OpenMM IO add-on. Once you have installed that package and its dependencies, you should be able to use the following code. For a quick install of the dependencies, execute these commands:

>>> conda install -c conda-forge parmed openmm openff-toolkit packmol ambertools=21 compilers openbabel openmmforcefields rdkit

First, you should import all of the functions and objects that we'll use below.

from pymatgen.io.openmm.generators import OpenMMSolutionGen
from pymatgen.io.openmm.sets import OpenMMSet
from pymatgen.io.openmm.simulations import equilibrate_pressure, anneal

from openmm.app import StateDataReporter, DCDReporter, PDBReporter

Generators

A pymatgen InputSet contains all of the files needed to set up and run a simulation. Writing those files by hand would be incredibly tedious, so we use a Generator to create the InputSet. The Generator is instantiated with simulation properties as input, like temperature and force field. Once it is created, it can generated many InputSets with different sets of molecules.

openmm_generator = OpenMMSolutionGen(
    force_field="Sage",
    temperature=298,
    step_size=0.001,
)

Input Set

An InputSet is created by calling generator.get_input_set() and providing the molecules (as smiles) and the density of the solution. When this happens, a ton of code is run behind the scenes that sets up the simulation.

molecules = {"O": 200, "CCO": 1}

input_set = openmm_generator.get_input_set(molecules, density=1) # get that input set!

We might not want to run our simulation immediately! For example, we might want to upload it to a supercomputer. We can easily do this by saving all of the files with input_set.write_input.

input_set.write_input(f"input_set_files") # write that input!

It is equally easy to load an InputSet from a directory.

input_set = OpenMMSet.from_directory("input_set_files")

Simulations

An InputSet can create a dynamic OpenMM Simulation object. This simulation object can be run and modified using the OpenMM software package.

simulation = input_set.get_simulation()

Once we have a simulation, we will first need to minimize the energy, otherwise the simulation box will explode. Once that's done, we can start the simulation!

simulation.minimizeEnergy() # see what happens if you comment out this line with a fresh simulation :)
simulation.step(1000)

Congratulations! You ran your first OpenMM simulation!

Reporters

By default, an OpenMM Simulation will not output any information. If we want data, we will need to add OpenMM Reporters to write out the trajectory and simulation state. In the next cell, I attach the most useful reporters, StateData, which writes state information, DCDReporter, which saves the trajectory, and PDBReporter, which saves the topology.

simulation = input_set.get_simulation() # first we will refresh our simulation

!mkdir output # first lets make a directory to save our files

# we actually only use the PDB reporter once, we will need the PDB later when we use MDAnalysis
pdb_reporter = PDBReporter("output/topology.pdb", 1)
pdb_reporter.report(simulation, simulation.context.getState(getPositions=True))

# the StateDataReporter and DCDReporter will write out every n_timesteps
n_timesteps = 200
simulation.reporters.append(StateDataReporter("output/state.txt", n_timesteps, step=True, potentialEnergy=True, temperature=True, volume=True, density=True))
simulation.reporters.append(DCDReporter(f"output/trajectory.dcd", n_timesteps))

200 is an extremely short time interval, in reality, you would want to use something like 1,000 for the StateDataReporter and 10,000 for the DCD reporter. Next we can run the simulation and see that it ran. We are using a short time interval here because this is just a demonstration.

Running a real simulation

A real molecular dynamics simulation is slightly more complex than the toy examples we've covered here. We first need to equilibrate the pressure of the simulation, which will change the volume until the simulation contents are at atmospheric pressure. Then we need to anneal the simulation by heating it to 400K to make sure we find a true equilibrium. Fortunately, the Pymatgen code has very convenient functions for doing both of these things. A real simulation would look something like below (though in reality we would run it for many more timesteps).

simulation.minimizeEnergy()
equilibrate_pressure(simulation, 10000)
anneal(simulation, 400, [2000, 2000, 2000])
simulation.step(10000)

Last updated