While playing around with your MMPA nodes I ran into an issue with reverse-direction transformations showing up despite the fact that I have deselected this option on the Output Settings tab of the Fragments to MMPs node.
I have attached a workflow with 1000 hERG ligands from ChEMBLdb as input, so that you can check my settings.
With the default settings, i.e. with 'Show reverse-direction transforms' ticked, everything looks fine: I get equal numbers of transformations and their reverses, with the same statistics. The top-ranked transformation is F-A to A-H with 16 counts. However, when I untick this option I get unexpected results. The top-ranked transformation is still F-A to A-H (or rather, the reverse transformation), now with 12 counts, but as you can see further down, the reverse transformation occurs 4 times, together totalling 16. The same is the case for other transformations, e.g. A-H to A-Me: with 'Show reverse-direction transforms' ticked we get 14 counts of this transformation both ways, but with the option unticked I get 10 and 4 counts for the two directions.
Interestingly, I have reported similar behaviour with the Automated Matched Pairs node from the Erlwood community contributions: https://www.knime.com/forum/cheminformatics/automated-matched-pairs-nod…
Could you try to reproduce my results? Is this the expected behaviour? Am I overlooking something?
Sorry, I've only just found this post, and will not be able to look at it for another week or so. I will investigate and get back to you.
OK, I'm pretty sure I now understand what is happening here. It's not a bug as such, and I think I would be inclined to always 'show reverse transforms', at which point you don't see the issue for hopefully obvious reasons. However, I would say that it's not entirely desirable either - I would certainly expect the transforms to be canonical, so that *-H <--> *-F for example always occurs the same way round in the 'forwards' direction. (Indeed, I think the old, slow version of the Fragments to MMPs node did do this, but the feature was lost as part of the major rewrite in the current version.)
To explain why this happens, a little background on how the node works is required. Unless the 'keys are sorted' option is checked, the node presorts the incoming table by the key column, and then processes each chunk of rows with the same key together. Every pair of rows in that chunk forms a matched molecular pair between their two 'values'. What is currently happening is that the pair is represented in the 'forwards' direction in whatever order the two values occur - sometimes that is *-H before *-F, and sometimes that is *-F before *-H. As such, I don't think it should be too difficult to fix - if the pre-sorting phase also sorts the values column then that should resolve the situation.
Indeed, to check this situation, I filtered the original fragmented molecules table by the keys used to generate those *-H/*-F interconversions in your example workflow, and sorted the resulting table by key - sure enough, there are 12 keys where *-F precedes *-H in the table, and 4 where *-H precedes *-F.
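To make the mechanism above concrete, here is a minimal Python sketch. It is not the node's actual code; the function name `count_transforms`, the `sort_values` flag and the toy rows are all made up for illustration. It only assumes what is described above: rows are sorted by key, and each pair within a key chunk is emitted in the order in which the two values happen to occur.

```python
from collections import Counter
from itertools import combinations, groupby


def count_transforms(rows, sort_values=False):
    """Count 'forwards' transforms from (key, value) fragment rows.

    Hypothetical illustration only, not the KNIME node's implementation.
    """
    # Presort by key alone (stable, so values keep their incoming order),
    # or by key and then value when sort_values is True (the proposed fix).
    if sort_values:
        ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    else:
        ordered = sorted(rows, key=lambda r: r[0])

    counts = Counter()
    for _, chunk in groupby(ordered, key=lambda r: r[0]):
        values = [value for _, value in chunk]
        # Every pair of rows sharing a key forms one matched pair,
        # written in whatever order the two values occur in the chunk.
        for left, right in combinations(values, 2):
            if left != right:
                counts[(left, right)] += 1
    return counts


# Toy data: three keys, one of which lists *-H before *-F.
rows = [
    ("k1", "*-F"), ("k1", "*-H"),
    ("k2", "*-H"), ("k2", "*-F"),
    ("k3", "*-F"), ("k3", "*-H"),
]

unsorted_counts = count_transforms(rows)
sorted_counts = count_transforms(rows, sort_values=True)
print(unsorted_counts)  # the same transform is split across two directions
print(sorted_counts)    # all pairs collapse into one canonical direction
```

With the value column included in the sort, every pair for a given transform comes out the same way round, so the counts no longer split between the two directions.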
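The check described above can be sketched roughly as follows. This is a hypothetical helper (`count_orderings` is not part of any plugin) run on made-up toy rows rather than the real workflow data: for each key it looks at which of the two values appears first in the table and tallies the two cases.

```python
def count_orderings(rows, first, second):
    """Return (keys where `first` precedes `second`, keys where it follows).

    Hypothetical helper for illustration; rows are (key, value) tuples
    in table order.
    """
    first_seen = {}
    for index, (key, value) in enumerate(rows):
        # Remember the first row index at which each value appears per key.
        first_seen.setdefault(key, {}).setdefault(value, index)

    forward = backward = 0
    for seen in first_seen.values():
        if first in seen and second in seen:
            if seen[first] < seen[second]:
                forward += 1
            else:
                backward += 1
    return forward, backward


# Toy table: k5 has no *-H partner, so it contributes to neither count.
rows = [
    ("k1", "*-F"), ("k1", "*-H"),
    ("k2", "*-F"), ("k2", "*-H"),
    ("k3", "*-H"), ("k3", "*-F"),
    ("k4", "*-F"), ("k4", "*-H"),
    ("k5", "*-F"),
]
orderings = count_orderings(rows, "*-F", "*-H")
print(orderings)  # (3, 1)
```

The two numbers correspond to the split counts seen for a transform and its reverse when 'Show reverse-direction transforms' is unticked.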
I will look into getting this sorted in the next update - thanks for the report and the example - that really helps to track things like this down!
Just to confirm, we have now fixed this in version 1.12.3 of our plugin. Unfortunately, we are currently waiting on a fix to the RDKit plugin, on which we are dependent, which is preventing us from releasing this update and a number of other minor bug fixes.
As a temporary workaround, if you insert a sorter node between the fragmentation and pair generation, and select the first column as the fragment 'key' column, and the second as the fragment 'value' column, and then select the 'Keys are sorted' option in the Fragments to MMPs node, that should give you the results you expect.
I can confirm that the workaround gives the desired results.
Many thanks for your pointers,
Just to confirm, this is now fixed in our 3.5 stable version (the nightly build will follow in the next week or two).
You need version 1.12.7 of our plugins.