Pose-Analysis Molecular Dynamics and Non-Aqueous pKa
what to do after docking/co-folding; Rowan’s approach to short MD simulations; what’s next for SBDD and MD; new ML microscopic pKa models
Pose-Analysis Molecular Dynamics (MD)
We’re launching a new workflow for Rowan subscribers today, pose-analysis molecular dynamics.
What Comes After Docking / Co-Folding?
Last week, we wrote about our efforts to make “top-of-funnel” structure-based drug design (SBDD) methods like docking and co-folding more rigorous and dependable. (You can read our newsletter here if you missed it.)
Unfortunately, no matter how good we make docking or co-folding, these techniques are only one step in a mature virtual-screening pipeline. Most state-of-the-art virtual-screening pipelines start with quick methods like docking (or Boltz-2 affinity prediction), do one or more intermediate filtering steps, and then conclude with a “gold standard” method like free-energy perturbation.
There are a lot of choices for the intermediate filtering steps: some scientists recommend MD-based methods like MM/GBSA, while others recommend semiempirical quantum mechanics or machine-learned scoring functions. Unfortunately, a recent paper by François Sindt and co-workers showed that across a set of ten ultralarge virtual-screening datasets, none of these methods robustly worked to distinguish binders from non-binders.
We’ve been thinking about this area for a little while here at Rowan: our summer intern Ishaan worked with Corin to try and develop an NNP-based scoring method this summer, without notable success. After this initial disappointment, we returned to the literature. We were interested to see this paper from Hugo Guterres and Wonpil Im showing that simply analyzing the pose stability from MD could serve as an effective post-docking filter in many cases.
This approach has a few advantages. It’s simple and requires no binding-affinity-specific training data, reducing the problems of overfitting and data leakage seen with deep-learning-based binding-affinity-prediction methods. It’s also pretty fast—
Rowan’s Pose-Analysis MD Workflow
Rowan’s pose-analysis MD workflow uses OpenMM to run simulations of protein–ligand complexes. More specifically, here’s what we do (for people who care about MD methodology):
Starting from a refined pose, explicit waters are added to solvate the system (with an ionic strength of 0.10 M). The OpenForceField Sage 2.2.1 forcefield is used to parameterize the ligand, in conjunction with the Amber ff14SB forcefield for the protein and the TIP3P water model.
Protein backbone constraints are added for residues more than 7.0 Å away to prevent any unfolding. Hydrogen mass is increased to 1.5 Daltons and bonds to hydrogen are constrained, allowing OpenMM to take 2.0 fs timesteps. The particle-mesh Ewald algorithm is used to handle long-range interactions with a 8.0 Å cutoff distance.
Calculations are run in the NPT ensemble using a Langevin thermostat (1 ps timescale) and a Monte Carlo barostat. The temperature and pressure are 300 K and 1 atm by default.
After minimization and gradually heating the system to the desired temperature, a 1 ns equilibration simulation is run followed by 10 ns of the “production” simulation. 4 independent simulations are run by default.
All of these parameters are tunable through Rowan’s API, allowing scientists to fine-tune MD parameters for their system.
Rowan’s workflow returns:
A plot of ligand RMSD over time for each simulation.
A list of the conserved protein–ligand contacts, which can be useful in evaluating the strength of particular intermolecular interactions.
The ability to plot specified interatomic distances over time.
And, of course, the full MD trajectories.
How To Use Pose-Analysis Molecular Dynamics
We recommend using pose-analysis MD in one of two ways:
As discussed above, pose-analysis MD can be a useful filtering step after docking or co-folding. In this use case, all that matters is running a bunch of MD simulations and monitoring the RMSD over time.
Pose-analysis MD can also be a useful physics-based way to gain insight about individual interactions in an inhibitor (or substrate or intermediate, for an enzyme). Here, users will want to visualize the resulting contacts & trajectory and try to understand the binding mode in more detail.
We don’t recommend using pose-analysis MD to study protein conformational motion, as the timescales are not long enough (and the backbone is constrained).
What’s Next for SBDD in Rowan?
Longtime readers of the blog might notice that we’ve devoted a lot of time and attention to SBDD over the past few months. Without getting into details, we’re not finished yet—we’re working on integrating additional capabilities into Rowan to improve the virtual-screening funnel shown above, as well as new features to make it easier to orchestrate & deploy large numbers of jobs like what you’d want for production-scale virtual screening. We’re also planning to conduct end-to-end benchmarking to demonstrate which combinations of methods lead to effective binder enrichment on real datasets. Stay tuned for more updates!
Molecular Dynamics, More Broadly?
Since we launched, we’ve had scientists asking us if Rowan would offer molecular dynamics in the future. One of the reasons we’ve stayed away from MD until now is that “molecular dynamics” can encompass a lot of tasks, from the simple protein–ligand simulations described above to quasiclassical reactive ab initio MD and ion-diffusivity simulations in electrolytes.
Rowan’s general philosophy is to have workflows be specific, easy to test & benchmark, and modular. With pose analysis, we’ve taken a small but focused step into the big world of molecular dynamics. We’re interested in building out additional MD capabilities for our customers and would love to hear from anyone who’s interested.
Machine-Learned Microscopic pKa Models
Predicting pKa has been a special focus of Rowan’s for some time now—our first preprint (and workflow) described our use of AIMNet2 to predict microscopic pKa values in water, and several months ago we reported that a retrained version of Uni-pKa could be used to predict macroscopic pKa ensembles. Both of these methods only predict aqueous pKa values, though, and in response to user requests we’ve been contemplating adding a third pKa-prediction method to Rowan.
We’ve been following Jonathan Zheng’s work on pKa prediction with interest for some time, and so we were delighted to see Zheng collaborate with Thomas Nevolianis and a variety of other authors to create a set of high-quality microscopic-pKa-prediction models. These models work not only for oft-studied solvents like water and DMSO but also for more obscure pKa solvents like DMF, acetonitrile, and methanol.
We’ve now integrated these models into Rowan’s existing microscopic-pKa workflow for all users. Users can now select between “Rowan pKa” (our existing AIMNet2-based method) and “Chemprop” when submitting the workflow, and can additionally select the desired solvent if “Chemprop” is selected as the method.
Predictions complete in just seconds, and the output pKa values and microstates can be visualized in 2D form after the workflow is complete, as shown below. (The large nitrogen pKa values are typical for DMSO.) All existing validation and sanity checks for pKa predictions remain active when using this new method.
We expect that Chemprop-based pKa prediction will be best for “normal” drug-like organic molecules well-represented by the training data, while AIMNet2-based pKa prediction will be best for exotic chemical species (but is significantly slower). If you want to read more about the strengths and weaknesses of different pKa-prediction strategies, check out our recent blog post in which we explain the five main strategies that scientists use to predict pKa.
As always, we love hearing from scientists looking to apply the latest computational techniques to their work (whether in structure-based drug design, property optimization, molecular modeling, or elsewhere); please get in touch! (You can reach us at contact@rowansci.com.)












