Make create_basic_dataset reuse the CMILES from an OptimizationDataset (if present) to avoid issues where cheminformatics toolkit behavior has changed
#310
There's no guarantee that running to_smiles will give you the same CMILES for the same molecule across different OpenEye/RDKit versions, especially if you have one toolkit installed but another was used to generate the source datasets. It would be slightly more robust to use the exact CMILES stored in the dataset result.
Basically, the new version of create_basic_dataset (and likely other parts of qcsubmit) reconstructs a CMILES for each Molecule instead of reusing the CMILES on the OptimizationRecord. I don't think this causes any (known) functional issues, but it could lead to situations where two "identical" datasets have different CMILES.
This call to to_smiles is the root of the issue in this case, and could be replaced by some kind of dict lookup mapping record_id to cmiles extracted from self.entries earlier in the function.
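A minimal sketch of the dict-lookup idea, for illustration only. The Entry class, build_cmiles_lookup, and cmiles_for_record are hypothetical names, not part of qcsubmit, and the regenerate callback stands in for whatever to_smiles call the real function makes today. The assumption is that each entry in self.entries exposes a record_id and a (possibly missing) cmiles.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a result entry; NOT the qcsubmit class."""
    record_id: int
    cmiles: Optional[str]


def build_cmiles_lookup(entries: Iterable[Entry]) -> Dict[int, str]:
    """Map record_id -> stored CMILES, skipping entries that have none."""
    return {e.record_id: e.cmiles for e in entries if e.cmiles}


def cmiles_for_record(
    record_id: int,
    lookup: Dict[int, str],
    regenerate: Callable[[], str],
) -> str:
    """Prefer the CMILES stored with the dataset; regenerate only as a fallback.

    `regenerate` is a placeholder for the toolkit-dependent to_smiles call.
    """
    stored = lookup.get(record_id)
    if stored is not None:
        return stored
    return regenerate()


# Example: record 1 has a stored CMILES, record 2 does not.
entries = [
    Entry(record_id=1, cmiles="[C:1]([H:2])([H:3])([H:4])[H:5]"),
    Entry(record_id=2, cmiles=None),
]
lookup = build_cmiles_lookup(entries)
reused = cmiles_for_record(1, lookup, regenerate=lambda: "REGENERATED")
fallback = cmiles_for_record(2, lookup, regenerate=lambda: "REGENERATED")
```

Building the lookup once, before looping over records, means the toolkit is only consulted for records that have no stored CMILES, so the output no longer depends on which toolkit version happens to be installed.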
j-wags changed the title from "Possible CMILES changes between toolkit/backend versions" to "Make create_basic_dataset reuse the CMILES from an OptimizationDataset (if present) to avoid issues where cheminformatics toolkit behavior has changed" on Jan 9, 2025
Lily's comment is quoted above. The to_smiles call in question is at:
openff-qcsubmit/openff/qcsubmit/results/results.py
Lines 706 to 715 in d4e6b69