I am new to the forum and would like to ask for your help with a problem.
I am looking for a method to match structures from different files:
- A SMILES list of reactants
- A SMILES list of products
The reactant and product structures are very similar: the epoxides were transformed into alcohols. However, there are fewer reactants than products, because one epoxide almost always gives 2 products.
I would like to match them and join each pair into the same row, so that I can later create an RXN file.
It should be fairly straightforward to achieve this. Assuming that your lists of SMILES are contained in CSV files, you can use the CSV reader to read them in. Then use the molecule typecast to convert the string column to a SMILES column.
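If you want to check the input outside the workflow, the reading step could be sketched in plain Python roughly like this. It assumes each CSV has a header row and a column called `smiles`; that column name and the `read_smiles` helper are made up for illustration, so adjust them to your files.

```python
import csv
import io

def read_smiles(handle, column="smiles"):
    """Return the non-empty SMILES strings found in one CSV column.

    `column` is an assumed header name, not something taken from your files.
    """
    reader = csv.DictReader(handle)
    # Skip rows where the column is missing or blank.
    return [
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    ]

# In practice you would pass open("reactants.csv", newline="") instead.
demo = io.StringIO("smiles\nC1OC1C\nC1OC1CC\n")
reactants = read_smiles(demo)
```

At this point the values are still plain strings; interpreting them as structures is what the typecast step does.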
You will need a criterion that defines when a reactant is related to a product. I guess that you could use a fingerprint to compare the similarity (e.g. Tanimoto) of each reactant with each of the products. You would then be able to join the rows of the most similar pairs.
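To make the idea concrete, here is a minimal sketch of the matching logic in plain Python. It is not a real chemical fingerprint: `smiles_ngrams` is a hypothetical stand-in that just collects substrings of the SMILES text, and in practice you would swap in fingerprints from your cheminformatics toolkit. The function names and the `threshold` parameter are also only illustrative.

```python
def smiles_ngrams(smiles, n=3):
    """Crude fingerprint stand-in (assumption): the set of length-n substrings."""
    if len(smiles) < n:
        return {smiles}
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}

def tanimoto(a, b):
    """Tanimoto coefficient of two sets: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def match_products(reactants, products, threshold=0.0):
    """Pair each product with its most similar reactant.

    Several products may map to the same reactant, which fits the case
    where one epoxide gives two alcohols. Products whose best score is
    below `threshold` are left unmatched.
    """
    if not reactants:
        return []
    reactant_fps = [(r, smiles_ngrams(r)) for r in reactants]
    rows = []
    for product in products:
        p_fp = smiles_ngrams(product)
        best, score = max(
            ((r, tanimoto(fp, p_fp)) for r, fp in reactant_fps),
            key=lambda pair: pair[1],
        )
        if score >= threshold:
            rows.append((best, product, score))
    return rows

# Toy input, only to show the shape of the result.
rows = match_products(["CCCCCC", "c1ccccc1"], ["CCCCCCO", "c1ccccc1O"])
```

Each entry in `rows` is a (reactant, product, score) tuple, i.e. one joined row per product. Matching from the product side is a design choice here: it avoids having to decide in advance how many products belong to each reactant.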
Thanks for your reply. In this case all the molecules are very similar to each other, so a similarity approach won't work, but I found a way to make it work with a bit of setup in the molecule generator.