I should start by saying "great job" with the newest release - thanks!
Now onto a question about the Substructure Match Counter node. I have been successfully using this to count, e.g., Ns and Os with at least one H [#7&!H0, #8&!H0], but wanted to run some more 'Lipinski-like' bond counting. So what I opted for was to use the Hydrogen Adder node to make sure all of the implicit Hs were explicitly connected in the molecule graphs, and then used [#7,#8]-[#1] as my query molecule.
However, the Substructure Match Counter only seems to count once, not twice, in the case of NH2 groups...
I am guessing this isn't the expected behaviour (but am also guessing it is down to how hydrogens get treated somewhere down the line!)?
For our substructure search algorithm it makes no difference whether hydrogens are folded or unfolded, because implicit and explicit hydrogens are treated identically. This is why we don't support the 'h' notation in SMARTS, and support only the 'H' notation.
In substructure search, all pure hydrogens in the query molecule are ignored, because they are meaningless there. For counting the number of matches, however, they do matter, and it is a bug in Indigo that we overlooked this. This will be fixed. For now you can trick our substructure search algorithm by masking your hydrogen as a more complex expression. For example, you can specify [#7,#8]-[#1,#112] (if you don't have copernicium in your molecules). In this case the number of matches should be correct.
Also, you do not need to unfold hydrogens. There should be no difference in the number of matches for a structure with folded or unfolded hydrogens, because unfolding is done automatically internally when necessary. (It would be good if you could verify this on your dataset.)
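To make the expected numbers concrete, here is a rough standalone sketch in plain Python. It is not Indigo code and does not reflect how Indigo works internally; the `Atom`, `Mol`, `unfold_hydrogens` and `count_xh_bonds` names are made up for this illustration. It only shows the count one would expect for [#7,#8]-[#1] when implicit and explicit hydrogens are treated alike: an NH2 group contributes two matches, and the total is the same for the folded and unfolded forms.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    element: str         # element symbol, e.g. "N", "O", "C", "H"
    implicit_h: int = 0  # number of hydrogens folded into this atom


@dataclass
class Mol:
    atoms: list
    bonds: list = field(default_factory=list)  # (i, j) atom index pairs


def unfold_hydrogens(mol):
    """Return a copy in which every implicit H is an explicit atom plus bond."""
    atoms = [Atom(a.element, 0) for a in mol.atoms]
    bonds = list(mol.bonds)
    for i, a in enumerate(mol.atoms):
        for _ in range(a.implicit_h):
            atoms.append(Atom("H"))
            bonds.append((i, len(atoms) - 1))
    return Mol(atoms, bonds)


def count_xh_bonds(mol):
    """Count N-H and O-H bonds, whether the H is implicit or explicit."""
    hetero = {"N", "O"}
    # Folded hydrogens: one match per implicit H on an N or O atom.
    total = sum(a.implicit_h for a in mol.atoms if a.element in hetero)
    # Unfolded hydrogens: one match per explicit bond between N/O and H.
    for i, j in mol.bonds:
        ei, ej = mol.atoms[i].element, mol.atoms[j].element
        if (ei in hetero and ej == "H") or (ej in hetero and ei == "H"):
            total += 1
    return total


# 2-aminoethanol (SMILES: NCCO), written by hand with folded hydrogens.
folded = Mol(
    atoms=[Atom("N", 2), Atom("C", 2), Atom("C", 2), Atom("O", 1)],
    bonds=[(0, 1), (1, 2), (2, 3)],
)
unfolded = unfold_hydrogens(folded)

print(count_xh_bonds(folded), count_xh_bonds(unfolded))  # prints: 3 3
```

For this molecule the NH2 group gives two matches and the OH group one, so a correct match count would be 3 in both cases, rather than the 2 you would get if each NH2 were only counted once.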
So, we will fix it soon, and as a temporary solution you can use [#7,#8]-[#1,#112] as a query.
This bug has been fixed in the nightly builds.
I understand that it took a very long time to fix this bug, but the problem may still be relevant.