Ion Mobility, Batch Docking, Strain, Flow-Matching Conformer Generation, and MSA
a diverse litany of new features: ion-mobility mass spectrometry; high-throughput docking with QVina; a standalone strain workflow; Lyrebird, a new conformer-generation model; and standalone MSAs
This week, we’re excited to launch a variety of new features on Rowan:
A workflow to predict collision cross sections for ion-mobility mass spectrometry (for subscribers)
A high-throughput docking workflow (for organizations)
A standalone strain workflow (for all users)
Lyrebird, a new flow-matching model for conformer generation (for all users, also released under an MIT license on GitHub)
And a standalone MSA workflow (for subscribers)
We know this is a lot, and we’ve tried to keep the newsletter relatively succinct. Most of these features have separate blog posts with more detail, linked in their respective sections, and we encourage interested readers to take a look.
Ion-Mobility Mass Spectrometry
Ion-mobility mass spectrometry is a type of mass spectrometry (MS) that’s often used in drug discovery and metabolomics—in addition to measuring mass-to-charge ratio, ion-mobility MS also measures the collision cross section (CCS) of different ions, making it useful for distinguishing various structural or stereochemical isomers in complex mixtures.
Unfortunately, predicting the CCS value of unseen molecules remains a challenging computational task—conventional physics-based methods require costly conformer optimization and scoring with DFT followed by trajectory simulations to determine the nitrogen cross section of a given molecular conformation. We’ve been thinking about how to address this problem for over a year, thanks to some early industry interest in CCS prediction, and we’re excited to be finally releasing our ion-mobility MS simulation to all subscribing Rowan users.
Our full approach is a bit too long to document in this newsletter, so we’re releasing an accompanying technical blog post detailing what we’ve done. Briefly, we use CREST and GFN2-xTB to generate and optimize various conformers, g-xTB to score the output conformations, and a modified version of the open-source CoSIMS trajectory-simulation code to compute per-conformer CCS values. Our final workflow can go from an input neutral analyte structure to a final predicted CCS value in only a few minutes, significantly faster than the tens of hours of simulation time required by state-of-the-art methods.
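The paragraph above doesn't say how per-conformer CCS values are combined into the final prediction. One common choice, assumed here purely for illustration, is a Boltzmann-weighted average over the conformer energies. The function name, default temperature, and units below are hypothetical and should not be read as Rowan's implementation.

```python
import math

# Gas constant in kcal/(mol*K), used to convert energies into Boltzmann weights.
R_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weighted_ccs(energies_kcal, ccs_values, temperature=298.15):
    """Combine per-conformer CCS values using Boltzmann weights.

    energies_kcal: conformer energies in kcal/mol (relative or absolute).
    ccs_values: per-conformer CCS values, in the same order as energies_kcal.
    temperature: temperature in kelvin (assumed value, not from the source).
    """
    if len(energies_kcal) != len(ccs_values) or len(energies_kcal) == 0:
        raise ValueError("need equal-length, non-empty energy and CCS lists")
    e_min = min(energies_kcal)  # shift so the largest weight is exactly 1
    rt = R_KCAL_PER_MOL_K * temperature
    weights = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, ccs_values)) / total
```

With this weighting, conformers a few kcal/mol above the minimum contribute very little, so the prediction is dominated by the low-energy ensemble.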
We’ve been fortunate to be developing this method in close collaboration with the Gair Group at Michigan State University. In real-world testing, MSU researchers have found that Rowan’s CCS predictions can be useful in quickly assigning isomeric mixtures that would otherwise require extensive isolation and characterization. Professor Joe Gair says:
Rowan has opened new research directions for our group. Comparing collisional cross sections calculated in Rowan versus those measured by ion-mobility mass spectrometry, we can assign structures to mixtures of diastereomers or regioisomers in an MS experiment that takes seconds.
While full details about how Professor Gair and co-workers are using Rowan’s CCS predictions to accelerate method development will have to wait until the publication of their paper, we’re happy to be able to share some early real-world validation. This is just one use case—if you’re interested in using Rowan’s CCS predictions to accelerate your chemical analysis workflows, please reach out! We’d love to do a pilot study to understand the value that Rowan can bring to your scientific area.
Batch Docking
As we’ve written before, docking, while flawed, is a workhorse for computer-assisted drug discovery. Rowan’s existing docking workflow is optimized for accuracy in hit-to-lead scenarios and takes between 20 seconds and 3 minutes per compound. We’ve previously published a blog post about how different docking settings impact docking runtimes. By default, Rowan’s docking workflow includes a pose-refinement optimization (which adds hydrogen atoms and optimizes their positions) and a strain correction; these increase the physical validity of docking but also require additional computational work.
While we’ve been quite happy with our docking workflow so far, it’s too slow for the large docking runs that are standard in early-stage virtual screening (a million compounds or more). To address this, we’re releasing a new batch-docking workflow that uses QVina2 (a faster implementation of AutoDock Vina) in combination with a lightweight process pool to maximize CPU utilization. For fragments or smaller lead molecules, Rowan’s batch-docking workflow takes about 2 seconds to go from SMILES string to final docking score on a 16-CPU machine, which translates to screening 10,000 compounds in about 5.6 hours (on a single machine). Using multiple machines, it’s possible to run an entire million-compound virtual screen overnight.
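As a back-of-the-envelope check on throughput figures like these, the sketch below converts a per-compound docking time into wall-clock hours and into the number of machines needed to meet a deadline. It assumes perfect scaling across machines and treats "overnight" as 12 hours; both are assumptions for illustration, and the function names are hypothetical rather than part of Rowan's API.

```python
import math


def screen_hours(n_compounds, seconds_per_compound=2.0, n_machines=1):
    """Wall-clock hours to dock n_compounds, assuming perfect scaling across machines."""
    return n_compounds * seconds_per_compound / n_machines / 3600.0


def machines_needed(n_compounds, deadline_hours, seconds_per_compound=2.0):
    """Smallest number of machines that finishes the screen within deadline_hours."""
    total_seconds = n_compounds * seconds_per_compound
    return math.ceil(total_seconds / (deadline_hours * 3600.0))


if __name__ == "__main__":
    # 10,000 compounds on a single machine at ~2 s per compound.
    print(f"{screen_hours(10_000):.1f} hours on one machine")
    # Machines required to finish a million-compound screen in an assumed 12-hour window.
    print(f"{machines_needed(1_000_000, 12.0)} machines for a 12-hour run")
```

Real screens will deviate from this estimate because larger ligands dock more slowly and job scheduling adds overhead.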
For now, we’re deploying this batch-docking workflow to subscribing organizations only. If you’re interested in running high-throughput virtual screens through Rowan, please reach out!
Standalone Strain
When we first released our docking code, we also released an AIMNet2-based protocol to quantify the strain in the docked poses. Here’s what we wrote at the time:
Since neural network potentials (NNPs) have recently emerged as an appealing low-cost alternative to DFT for ranking conformer energies (and benchmark well at this task), we envisioned that we could use NNPs to generate strain energies for the entire ligand. Following Butler and co-workers, we optimize each pose with harmonic constraints tethering each atom to the original structure to prevent bizarrely high strain energies that result from minor differences in bond lengths or angles at different levels of theory. (OpenEye’s Freeform does something similar.) We then compute single-point energies at the AIMNet2/CPCMX(water) level of theory, and compare these to the energy of the lowest-energy conformer generated by our initial conformer search.
A number of users have expressed interest in predicting the strain of arbitrary conformations or crystallographic poses, so we’ve added a separate workflow to calculate the strain of an input pose using this same approach.
To compute the strain of a given conformation, select the “Strain Calculation” workflow and input the desired conformer.
Behind the scenes, Rowan will run a conformer search and compare the lowest-energy conformer to the harmonically constrained input pose to obtain a strain value.
(Note: this workflow does not compute the strain of a molecule relative to a hypothetical unstrained isomer, but rather computes the strain of a conformation relative to that molecule’s least strained conformer. This workflow should not be used to compute ring-strain energies, for instance; refer to Chapter 3 of Bachrach’s Computational Organic Chemistry or similar texts for explanations of how to compute those values.)
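The final comparison in this protocol is simple arithmetic: the strain is the energy of the constraint-optimized pose minus the energy of the lowest-energy conformer, with both computed at the same level of theory. The sketch below illustrates only that last step. The function name and the choice of hartree inputs are assumptions, and the energies themselves would come from whatever engine ran the calculations.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509474  # standard energy conversion factor


def strain_energy_kcal(pose_energy_hartree, conformer_energies_hartree):
    """Strain of a pose relative to the lowest-energy conformer, in kcal/mol.

    pose_energy_hartree: single-point energy of the constraint-optimized input pose.
    conformer_energies_hartree: single-point energies from the conformer search,
        computed at the same level of theory as the pose energy.
    """
    if len(conformer_energies_hartree) == 0:
        raise ValueError("conformer search returned no energies")
    reference = min(conformer_energies_hartree)  # least-strained conformer found
    return (pose_energy_hartree - reference) * HARTREE_TO_KCAL_PER_MOL
```

A negative result would mean the input pose is lower in energy than every conformer the search found, which usually indicates an incomplete conformer search rather than negative strain.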
Lyrebird
We’re also releasing Lyrebird, a flow-matching conformer-generation method based on the ET-Flow architecture. Lyrebird is our first foray into ML-based conformer generation; we supplemented the standard GEOM-DRUGS training set with additional data to ensure that the final model would maintain good performance across diverse regions of chemical space, including macrocycles and macrocyclic peptides (which are generally quite challenging for conformer-generation methods). In our hands, Lyrebird outperforms the standard ETKDG method from RDKit on a variety of benchmarks, although it’s not well-suited for all scenarios and still struggles where data is sparse. Check out our blog post for more details and data.
All Rowan users can run Lyrebird through the standard conformer-search workflow. Simply select mode “Manual” and choose “Lyrebird” as the conformer-generation method.
We advise users to treat Lyrebird as a research preview; while we’re happy with its performance thus far, we haven’t exhaustively benchmarked it, and we recommend that users conduct their own sanity checks and benchmarks before relying on Lyrebird-predicted results.
We’re also releasing the Lyrebird weights on GitHub under an MIT license, making it simple for other groups to run their own benchmarks or incorporate Lyrebird into their local workflows. We’re open-sourcing our new MPCONF196GEN benchmark set as well, also under an MIT license.
Co-Folding Updates
We’ve added support for cyclic peptides to our implementation of Boltz-2.
With the above inputs, Boltz-2 generates the below peptide–synthase complex.
You can now also predict the structures of DNA- and RNA-containing complexes through our co-folding workflow using Boltz-2 and Chai-1.
Astute observers might note that these visualizations (1) don’t display cyclic peptides as cyclic and (2) don’t yet show nucleobases in their customary form. Better cyclic peptide and nucleic acid visualization options will follow shortly!
Standalone MSA
Co-folding models are here to stay. At Rowan, we’ve seen users run thousands of co-folding jobs with no signs of slowing down. One core component of AlphaFold-style models is their use of multiple sequence alignment (MSA) data. MSAs are generated by running compute-heavy queries against databases of known protein sequences. This process is resource-intensive, and many researchers rely on public servers for this step of co-folding.
To make sure our users’ data never leaves our environment, we host our own MSA server. And while most Rowan users run their entire co-folding process end-to-end through Rowan’s interface, we’ve had a number of users ask us for the ability to use Rowan-generated MSAs in their own custom workflows.
Today we’re launching an MSA workflow for Rowan subscribers that runs MSA queries and formats the resulting data for subsequent use with different models. By default, the workflow generates a .a3m file, and we’ve added support for the formats used by Chai-1, Boltz-1, and Boltz-2. We’ve also written a variety of example scripts showing how Rowan-generated MSAs can be used with co-folding models, available here.
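For readers consuming the .a3m output in their own scripts, here is a minimal sketch of reading A3M text in plain Python. In the A3M format, lowercase letters mark insertions relative to the query and '-' marks deletions, so removing lowercase characters yields rows aligned to the query columns. The function names are hypothetical, and this is not one of Rowan's example scripts.

```python
def parse_a3m(text):
    """Parse A3M text into a list of (header, raw_sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#' comment lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may wrap across several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def strip_insertions(sequence):
    """Drop lowercase insertion states so the row aligns to the query columns."""
    return "".join(ch for ch in sequence if not ch.islower())


def aligned_rows(text):
    """Return (header, aligned_sequence) pairs, checking that all rows share one length."""
    rows = [(h, strip_insertions(s)) for h, s in parse_a3m(text)]
    lengths = {len(s) for _, s in rows}
    if len(lengths) > 1:
        raise ValueError(f"inconsistent aligned lengths: {sorted(lengths)}")
    return rows


# Tiny made-up alignment: hit2 carries one lowercase insertion ('g').
EXAMPLE_A3M = ">query\nACDE\n>hit1\nA-DE\n>hit2\nACgDE\n"
```

Calling `aligned_rows(EXAMPLE_A3M)` returns three rows of length four, with the insertion in `hit2` removed and the deletion in `hit1` kept as a gap.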
As always, we love to hear from users working on interesting scientific problems or companies looking for computational support—so don’t hesitate to reach out. Until next time, happy computing!