
Chemical library overlap

Member for

3 years 1 month pv_srinivasan

Hi everyone

Newbie here! I have two SDF files (37K and 193K unique entries, respectively). I would like to work out the molecular overlap between the two libraries so that I can come up with a common chemical set that covers the underlying molecular diversity of both.

Could someone help/guide?

Best, Sri

Mon, 11/13/2017 - 01:14



Here's one way to do it - you will need the Chemistry Addons and RDKit Community nodes installed for this solution.

Take two SDF Reader nodes and read one of the SD files you want to compare into each of them.

Hook up an RDKit CANON Smiles node to the output of each SDF Reader node (two CANON Smiles nodes in total) and run them with default settings.

Connect the output of one RDKit CANON Smiles node to the first input of a single Reference Row Filter node and the output of the other to its second input, then configure the filter to use the Canonical (Molecule) columns from both tables for filtering. This should leave you with a table containing only the molecules present in both SD files.

You may need to put the RDKit Salt Stripper node before each RDKit CANON Smiles node, in case your SD files contain counter ions.
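For anyone who wants to see the filtering logic outside KNIME, here is a rough Python sketch of what the Reference Row Filter step does once both tables carry a canonical SMILES column. It is not the node's actual implementation. The function name `reference_row_filter`, the column name `canonical_smiles`, and the tiny example tables are all made up for illustration, and the strings are assumed to be already canonicalised by the same tool (for example, by the RDKit CANON Smiles node).

```python
def reference_row_filter(rows, reference_rows, key="canonical_smiles"):
    """Keep the rows whose key value also occurs in reference_rows.

    Hypothetical helper that mimics an 'include matching rows' filter.
    Assumes every row is a dict and that the key column already holds
    canonical SMILES produced by the same canonicaliser for both tables.
    """
    # A set gives constant-time membership checks, which matters
    # when the reference table has hundreds of thousands of rows.
    reference_keys = {row[key] for row in reference_rows}
    return [row for row in rows if row[key] in reference_keys]


# Made-up stand-ins for the two libraries after canonicalisation.
library_a = [
    {"id": "A1", "canonical_smiles": "CCO"},
    {"id": "A2", "canonical_smiles": "c1ccccc1"},
    {"id": "A3", "canonical_smiles": "CC(=O)O"},
]
library_b = [
    {"id": "B1", "canonical_smiles": "c1ccccc1"},
    {"id": "B2", "canonical_smiles": "CCN"},
    {"id": "B3", "canonical_smiles": "CCO"},
]

# Rows of library A that also appear in library B.
overlap = reference_row_filter(library_a, library_b)
print([row["id"] for row in overlap])  # ['A1', 'A2']
```

The result keeps the rows (and IDs) of the first table only, which is why it matters which table goes into which input. If you swap the two arguments you get the same molecules, but with the IDs from the other library.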


Wed, 12/06/2017 - 03:48



Hi Evert

I did as you mentioned, any comments on this?



Tue, 12/12/2017 - 01:53



Hey jayadeepa,
I think what Evert meant was to connect the first CANON node to the first input of the Reference Row Filter, and the second CANON node to the second input, not using two Reference Row Filters. ;-)