Starling: Macroscopic pKa, logD, and Blood–Brain-Barrier Permeability
microscopic vs. macroscopic pKa; Uni-pKa and Starling; microstate ensembles; logD and Kp,uu predictions

Today, we’re excited to launch a macroscopic-pKa-prediction workflow on Rowan for our subscribing users! This workflow uses physics-informed machine learning to quickly predict pKa values, logD, and blood–brain-barrier penetrance for drug-like small molecules. In parallel, we’re releasing a preprint describing our methodology (link, ChemRxiv), which we’ll briefly describe in this newsletter.
Predicting pKa is something we’ve been thinking about for a while here at Rowan. Here’s what we wrote 14 months ago when we launched our AIMNet2-based pKa-prediction workflow, which was also Rowan’s very first workflow:
Understanding a molecule’s pKa is incredibly important: pKa values dictate whether a molecule will be ionized or neutral at a given pH, which can be used to predict solubility, membrane permeability, blood–brain-barrier penetration, hERG toxicity, phospholipidosis risk, and much more. Accurate pKa predictions help medicinal chemists design compounds with the desired physicochemical and pharmacological parameters, making pKa calculation a problem of “immense interest” in computational chemistry.
We’re happy with our previous pKa-prediction method, which differs substantially from, and (we feel) complements, the vast majority of DFT- or ML-based methods out there. It’s proven robust enough to be useful in a variety of contexts, like this excellent skeletal-editing work from the Levin group that we recently highlighted on X.
But this method isn’t right for every use case. It’s a microscopic pKa-prediction method, which makes comparison to experiment difficult because measurements typically report macroscopic pKa values that reflect every contributing microstate, and it’s also too slow to be convenient for large drug-like molecules. Recent work from Jonathan Zheng and co-workers also pointed out the danger of relying on microstate-only pKa-prediction methods, and prompted us to think about how we could provide our users with an alternative.
In their paper, Zheng and co-workers highlight the Uni-pKa model from DP Technology as a potential solution to the problems they describe (emphasis added):
Uni-pKa, published in 2024… accounts for tautomerism, capturing the microscopic pKa of both the uncharged and zwitterionic tautomers. To our knowledge, this is the only recently released ML model that correctly distinguishes between those microstates.
Uni-pKa works by enumerating all relevant microstates, predicting each microstate’s aqueous free energy, and generating pKa values from the differences between these free energies. The architecture and dataset are open-source, but the weights aren’t freely available. So we retrained our own lightweight Uni-pKa model, which we’re calling “Starling” to differentiate it from the original.
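To make the free-energy-to-pKa step concrete, here’s a minimal Python sketch. It is not Starling’s or Uni-pKa’s actual code: the function names, the (charge, free energy) input format, the kcal/mol units, and the convention that free energies are referenced to pH 0 are all assumptions made for illustration. Under those assumptions, microstates with the same net charge are pooled into one Boltzmann-weighted ensemble free energy, and each macroscopic pKa is the gap between adjacent ensembles divided by RT ln 10.

```python
import math
from collections import defaultdict

# Assumed units: kcal/mol at 298.15 K.
RT_KCAL = 0.0019872041 * 298.15
LN10 = math.log(10.0)


def ensemble_free_energy(energies):
    """Return -RT ln(sum_i exp(-G_i/RT)) for one protonation level.

    The minimum energy is factored out first so the exponentials
    stay well-conditioned.
    """
    g_min = min(energies)
    z = sum(math.exp(-(g - g_min) / RT_KCAL) for g in energies)
    return g_min - RT_KCAL * math.log(z)


def macroscopic_pkas(microstates):
    """Compute macroscopic pKa values from microstate free energies.

    microstates: iterable of (net_charge, free_energy) pairs, with free
    energies referenced to pH 0 (an assumed convention).
    Returns {(q, q - 1): pKa} for each pair of adjacent charge states,
    where (q, q - 1) denotes losing one proton from charge q.
    """
    by_charge = defaultdict(list)
    for charge, g in microstates:
        by_charge[charge].append(g)

    g_eff = {q: ensemble_free_energy(gs) for q, gs in by_charge.items()}

    pkas = {}
    for q in sorted(g_eff, reverse=True):
        if q - 1 in g_eff:
            pkas[(q, q - 1)] = (g_eff[q - 1] - g_eff[q]) / (RT_KCAL * LN10)
    return pkas


# Hypothetical acid: one neutral form and two equivalent anionic
# tautomers, each with a microscopic pKa of 5.0 (illustrative numbers).
g_anion = 5.0 * RT_KCAL * LN10
toy_acid = [(0, 0.0), (-1, g_anion), (-1, g_anion)]
toy_pkas = macroscopic_pkas(toy_acid)
```

In this toy case the macroscopic pKa comes out lower than either microscopic value by log10(2), because two equivalent deprotonated forms make deprotonation statistically more favorable. That gap is exactly what a microstate-only method would miss.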
Starling excels at pKa prediction, just like the original Uni-pKa model (see our preprint for precise values). But we’re able to do a lot more with the output free-energy values than just predict pKa. We can predict isoelectric points and generate microstate populations as a function of pH, as shown here for glycine:
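For readers who want to see how populations and isoelectric points can fall out of the same free energies, here’s a small Python sketch. As before, this is an illustration and not our production code: the function names, the (charge, free energy) input format, and the pH 0 reference are assumptions, and the example numbers describe a hypothetical ampholyte, not glycine. Each microstate’s weight is its Boltzmann factor scaled by 10^(-q·pH), and the isoelectric point is found by bisecting on the population-weighted mean charge.

```python
import math

# RT ln(10) in kcal/mol at 298.15 K (assumed units).
RT_LN10 = 0.0019872041 * 298.15 * math.log(10.0)


def populations(microstates, ph):
    """Fractional population of each microstate at a given pH.

    microstates: list of (net_charge, free_energy) pairs, with free
    energies referenced to pH 0 (an assumed convention).
    """
    # Work in log10 space; raising the pH penalizes higher charge,
    # i.e. states carrying more protons.
    log_w = [-(g / RT_LN10) - q * ph for q, g in microstates]
    top = max(log_w)
    weights = [10.0 ** (x - top) for x in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def mean_charge(microstates, ph):
    """Population-weighted average net charge at a given pH."""
    pops = populations(microstates, ph)
    return sum(p * q for p, (q, _) in zip(pops, microstates))


def isoelectric_point(microstates, lo=0.0, hi=14.0, tol=1e-6):
    """Bisect for the pH where the mean charge crosses zero.

    Returns None if the mean charge does not change sign on [lo, hi].
    """
    if mean_charge(microstates, lo) < 0 or mean_charge(microstates, hi) > 0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Mean charge falls monotonically as pH rises.
        if mean_charge(microstates, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical ampholyte with macroscopic pKa values of 2 and 10.
toy = [(1, 0.0), (0, 2.0 * RT_LN10), (-1, 12.0 * RT_LN10)]
toy_pi = isoelectric_point(toy)
```

Sweeping `populations` over a pH grid gives the kind of speciation curve described above, and for a simple ampholyte like this one the bisection recovers the familiar average of the two pKa values.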
We can also generate pH-dependent logD predictions by combining per-microstate logP predictions with our Starling microstate populations; in some cases these match experimental data well. Here’s what this looks like for pentachlorophenol, a case where we have experimental data to compare to: at low pH, the phenol is protonated and lipophilic, but as the pH increases the anion predominates and prefers the aqueous phase.
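Here’s a rough Python sketch of how populations and per-microstate logP values can be combined into a logD curve. The combination rule used here, a population-weighted sum of partition coefficients, is the textbook expression and an assumption about the details, not a description of the exact workflow. The function names and the pKa and logP numbers are hypothetical and are not pentachlorophenol data.

```python
import math


def log_d(populations, log_ps):
    """logD from aqueous microstate populations and per-microstate logP.

    D = sum_i p_i * 10**logP_i, where p_i is the aqueous-phase
    population of microstate i at the pH of interest.
    """
    if len(populations) != len(log_ps):
        raise ValueError("need one logP per microstate")
    d = sum(p * 10.0 ** lp for p, lp in zip(populations, log_ps))
    return math.log10(d)


def monoprotic_acid_populations(pka, ph):
    """(neutral, anion) fractions from the Henderson-Hasselbalch relation."""
    neutral = 1.0 / (1.0 + 10.0 ** (ph - pka))
    return [neutral, 1.0 - neutral]


# Hypothetical acid: pKa 5.0, neutral logP 4.0, anion logP 0.5.
PKA, LOGP_NEUTRAL, LOGP_ANION = 5.0, 4.0, 0.5


def toy_log_d(ph):
    pops = monoprotic_acid_populations(PKA, ph)
    return log_d(pops, [LOGP_NEUTRAL, LOGP_ANION])


curve = [(ph, toy_log_d(ph)) for ph in range(0, 15)]
```

The curve has the same qualitative shape as the behavior described above: flat near the neutral logP at low pH, falling once the pH passes the pKa, and levelling off near the anion’s logP at high pH.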
But we’re most excited about using Starling to predict blood–brain-barrier permeability. Previous work from Morgan Lawrenz and co-workers at Schrödinger showed that DFT-computed solvation energies were surprisingly predictive of Kp,uu, the unbound brain-to-plasma partition coefficient (and a “game-changing” metric for CNS therapeutics), but actually running all the DFT calculations can take days of high-performance computing time.
We’ve long hoped to create a fast version of this Kp,uu-prediction workflow using neural network potentials, but we’ve been stymied by the need for a fast and accurate macroscopic pKa predictor. Lawrenz and co-workers use a mix of experimental and DFT-computed pKa values to estimate the free-energy cost of neutralization at pH 7.4; now we’re able to use Starling to get the same correction, allowing us to build an accurate workflow that runs in minutes, not days. Here’s an ROC/AUC analysis for the task of predicting whether a compound will have Kp,uu above or below 0.3 (the same cutoff Lawrenz and co-workers use)—we get useful accuracy without any compound-specific fine-tuning.
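To show what a “free-energy cost of neutralization” looks like numerically, here’s a minimal Python sketch. It uses the generic thermodynamic expression ΔG = −RT ln(f_neutral), where f_neutral is the fraction of the molecule that is neutral at pH 7.4. Treating this as the correction term is an assumption for illustration, not a statement of the exact formula used by Lawrenz and co-workers or in our workflow. The function names and the example pKa are hypothetical.

```python
import math

# RT in kcal/mol at 298.15 K (assumed units).
RT_KCAL = 0.0019872041 * 298.15


def neutral_fraction(populations, charges):
    """Sum the populations of all microstates with net charge zero.

    Note: net charge zero also includes zwitterions; whether those
    should count as "neutral" is a modelling choice, not settled here.
    """
    return sum(p for p, q in zip(populations, charges) if q == 0)


def neutralization_penalty(f_neutral):
    """Free-energy cost (kcal/mol) of restricting to the neutral form."""
    if not 0.0 < f_neutral <= 1.0:
        raise ValueError("neutral fraction must be in (0, 1]")
    return -RT_KCAL * math.log(f_neutral)


def monoprotic_base_neutral_fraction(pka, ph=7.4):
    """Neutral fraction of a simple base B / BH+ at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Hypothetical amine whose conjugate acid has pKa 9.4.
f_toy = monoprotic_base_neutral_fraction(9.4)
penalty_toy = neutralization_penalty(f_toy)
```

Each pKa unit between the conjugate acid’s pKa and physiological pH costs roughly RT ln 10, or about 1.4 kcal/mol at room temperature, which is why an accurate macroscopic pKa matters so much for this correction.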
We’re releasing our “macroscopic pKa” workflow today for all Rowan subscribers. This workflow predicts all of the properties described above—macroscopic pKa values, microstate populations, isoelectric points, logD values, and Kp,uu values—through a single interface that makes complex chemical phenomena extremely intuitive. Here’s an overview of adenine’s microstates, for instance:
If you’re interested in bringing these powerful capabilities to your drug-discovery organization, please reach out (contact@rowansci.com) and we’ll be happy to talk!