RDKIT Pickling Roundtrip Error (but only on MacOS?) #750

@apayne97

Description

Thanks to @ijpulidos and @chrisiacovella for finding and troubleshooting this.

2026.02.20_test_failure.ipynb

The bug manifests as the loss of component names on the edges of a LigandNetwork (but apparently not on the nodes), such that edge.componentA.name and edge.componentB.name are both ''.

This warning appears: "UserWarning: RDKit does not preserve Mol properties when pickled by default, which may drop e.g. atom charges; consider setting Chem.SetDefaultPickleProperties(Chem.PropertyPickleOptions.AllProps)"

When we set this property in a few different places in our code, it fixes the bug, but only on Ubuntu/Linux. On macOS, the warning and the error still appear.

The drugforge error is here:
https://github.com/choderalab/drugforge/actions/runs/22386574178/job/64798431670?pr=132

And you can see it passes on Ubuntu here:
https://github.com/choderalab/drugforge/actions/runs/22386574178/job/64798431676?pr=132#step:8:138

On the other hand, if you run the following standalone code on macOS, it works just fine:

from rdkit import Chem
from rdkit.Chem import Descriptors
import pickle
Chem.SetDefaultPickleProperties(Chem.PropertyPickleOptions.AllProps)

smiles = "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"
mol = Chem.MolFromSmiles(smiles)

mol.SetProp("_Name", "Caffeine")
mol.SetProp("Source", "Coffee Beans")
mol.SetProp("MolecularWeight", f"{Descriptors.MolWt(mol):.2f}")

# Pickle the molecule
pickled_mol = pickle.dumps(mol)

print(f"Molecule '{mol.GetProp('_Name')}' has been pickled.")
print(f"Pickle size: {len(pickled_mol)} bytes")

# Unpickle to demonstrate data persistence
new_mol = pickle.loads(pickled_mol)
print(f"\nRecovered Name: {new_mol.GetProp('_Name')}")
print(f"Recovered MW: {new_mol.GetProp('MolecularWeight')}")

This suggests that the setting applied by Chem.SetDefaultPickleProperties(Chem.PropertyPickleOptions.AllProps) is somehow not persisting through our code path, but only on macOS.
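One way to narrow this down would be to record the pickle flags that are actually active at the point where the edge gets pickled. The sketch below is a hypothetical diagnostic helper (the name pickle_settings_report is made up for illustration and is not part of drugforge, openfe, or gufe). It also reports the multiprocessing start method, because one possible explanation is that the failing path runs in worker processes. That is an assumption and is not confirmed anywhere in this issue. Python defaults to "spawn" on macOS and to "fork" on Linux through Python 3.13, and a spawned worker does not inherit a setting that was only applied at runtime in the parent process.

```python
import multiprocessing
import os
import sys


def pickle_settings_report():
    """Collect facts that might explain a platform-dependent pickling difference.

    Hypothetical diagnostic helper; call it right before the pickling step
    (and inside any worker process) and compare the output across platforms.
    """
    report = {
        "platform": sys.platform,
        "pid": os.getpid(),
        # allow_none=True avoids fixing the start method as a side effect;
        # None means the platform default has not been locked in yet.
        "start_method": multiprocessing.get_start_method(allow_none=True),
        "rdkit_pickle_flags": None,
        "all_props_enabled": None,
    }
    try:
        # Imported lazily so the report still works where RDKit is missing.
        from rdkit import Chem
    except ImportError:
        return report

    flags = int(Chem.GetDefaultPickleProperties())
    all_props = int(Chem.PropertyPickleOptions.AllProps)
    report["rdkit_pickle_flags"] = flags
    # AllProps is a bit mask, so check that every bit of it is set.
    report["all_props_enabled"] = (flags & all_props) == all_props
    return report


if __name__ == "__main__":
    print(pickle_settings_report())
```

If all_props_enabled is True in the main process but False (with a different pid) where the edge is pickled, that would point at the setting not reaching that process rather than at RDKit itself.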

It's not clear to us exactly where the fix needs to be made (drugforge, openfe, or gufe). I've attached a Jupyter notebook that reproduces the error from our test that fails on macOS, plus a YAML file of the environment.
2026.02.20_test_failure.ipynb

20260225_macos.yaml

Proposed solution

It seems like this property needs to be set within gufe. Would it be better to change this to set the property instead of just emitting a warning?

def __getstate__(self):
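A rough sketch of what "set the property instead of warning" could look like is below. This is not gufe's actual implementation: temporary_setting and dumps_with_all_props are hypothetical names, and the sketch assumes it is acceptable to switch the process-wide RDKit pickle option for the duration of one pickling call and then restore whatever was set before.

```python
import pickle
from contextlib import contextmanager


@contextmanager
def temporary_setting(getter, setter, value):
    """Set a process-wide option to `value` and restore the old value on exit."""
    previous = getter()
    setter(value)
    try:
        yield
    finally:
        # Restore even if pickling raises, so the caller's setting survives.
        setter(previous)


def dumps_with_all_props(mol):
    """Pickle an RDKit Mol with AllProps forced on for this call only.

    Hypothetical helper for illustration; a __getstate__ could wrap its
    pickling step in the same context manager.
    """
    # Imported lazily so temporary_setting stays usable without RDKit.
    from rdkit import Chem

    with temporary_setting(
        Chem.GetDefaultPickleProperties,
        Chem.SetDefaultPickleProperties,
        Chem.PropertyPickleOptions.AllProps,
    ):
        return pickle.dumps(mol)
```

Because the option is global to the process, this approach is not thread-safe. It does have the advantage that the flag is applied in whichever process performs the pickling, rather than relying on it having been set earlier somewhere else.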
