Lhasa community contribution: metabolism

Lhasa Limited provides KNIME nodes for SMARTCyp and WhichCyp, two methods developed by Patrik Rydberg et al.

1) SMARTCyp: http://www.farma.ku.dk/smartcyp/ 

2) WhichCyp: http://www.farma.ku.dk/whichcyp/

Excerpt from SMARTCyp about:

"SMARTCyp is a method for prediction of which sites in a molecule that are most liable to metabolism by Cytochrome P450. It has been shown to be applicable to metabolism by the isoforms 1A2, 2A6, 2B6, 2C8, 2C19, 2E1, and 3A4 (CYP3A4), and specific models for the isoform 2C9 (CYP2C9) and isoform 2D6 (CYP2D6) are included from version 2.1. CYP3A4, CYP2D6, and CYP2C9 are the three of the most important enzymes in drug metabolism since they are involved in the metabolism of more than half of the drugs used today." ~ http://www.farma.ku.dk/smartcyp/about.php 

Excerpt from WhichCyp about:

"WhichCyp is a method for prediction of which Cytochrome P450 isoform(s) is(are) likely to bind a drug-like molecule." ~ http://www.farma.ku.dk/whichcyp/about.php


The SMARTCyp node takes an input table containing a molecule column of any type compatible with CDK; the input is converted automatically using the adapter cell functionality. SMARTCyp is then run with your configuration against the input table. Two output tables are provided:


The first output table is a summary: the input table with a highlighted structure with rankings (PNG) and a CDK cell with atom highlights appended.


The second output table details the values for each atom on each input molecule (multiple rows per input row).


These nodes have been written for KNIME version 3. Some Java 8 functionality has been used so the nodes are not compatible with KNIME 2.x. 


Support is not provided for the underlying libraries, although submitted patches may be investigated. Support for the KNIME nodes is provided in the same way as for our other public nodes, either via the forum or at knime@lhasalimited.org.

Changes made:

- CDK has not been upgraded to 1.5.x, as this has a significant impact on the results generated. The SASA (Solvent Accessible Surface Area) algorithm produces different output (it appears to rely on a side effect of the ordering of the atom types, which changes in CDK 1.5), and the SMARTS matching differs, resulting in a change in some energy assignments.

- Jar Jar Links was run on smartcyp.jar and whichcyp.jar to rename the packages from org.openscience.* to org.openscience.old.*, which avoids loading the wrong version of the CDK classes. If this is not done, CDK 1.5.x is loaded in some cases and CDK 1.4.x in others.

- The creation of the input stream for sybl-atom-types.owl was changed so that it works in an OSGi environment.
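The last change above concerns resource loading under OSGi, where each bundle has its own class loader and a lookup through the system class loader may not see files packaged inside a bundle. The source does not say how the nodes open the file, so the following is only a minimal sketch of one common approach: resolving the resource through the class loader of a class that lives in the same bundle. The class name `BundleResourceLoader` and the method `open` are hypothetical, and the example reads a JDK class file purely so that it runs anywhere.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of SMARTCyp, WhichCyp, CDK or KNIME.
final class BundleResourceLoader {

    /**
     * Opens a resource via the class loader of the given anchor class.
     * Under OSGi this is the class loader of the bundle that contains the
     * anchor class, so resources packaged in that bundle can be found.
     * A relative name is resolved against the anchor class's package.
     */
    static InputStream open(Class<?> anchor, String resourceName) throws IOException {
        InputStream in = anchor.getResourceAsStream(resourceName);
        if (in == null) {
            // getResourceAsStream signals "not found" with null; fail loudly instead.
            throw new IOException("Resource not found: " + resourceName);
        }
        return in;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in resource: the class file of java.lang.Object.
        try (InputStream in = open(Object.class, "Object.class")) {
            System.out.println("First byte read: " + in.read());
        }
    }
}
```

Failing with an exception rather than returning null makes a missing resource visible at the point of loading instead of as a later NullPointerException.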

Impact of changes on results

Efforts have been made to ensure that the results reproduce those generated by SMARTCyp 2.4.2 and WhichCyp 1.2. During this work a bug was found in SMARTCyp that causes a miscalculation of the SASA value in structures containing a symmetric bicyclic motif. The issue remains in the node, so a small number of structures generate a different SASA value (the rank remains the same). The same issue occurs when running SMARTCyp outside of KNIME: equivalent atoms should have the same SASA value, but in bicyclic systems it is sometimes calculated incorrectly, and the atom with the incorrect value may differ from run to run.

For comparison, 6850 structures from DrugBank were processed through SMARTCyp 2.4.2 and then through the node. The differing results change from run to run; around 0.36% of the structures have a different output from SMARTCyp. The same issue is found when running the SMARTCyp jar multiple times and comparing the output, so the SMARTCyp jar and the KNIME node are reproducible to the same extent.
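A comparison of this kind can be sketched as follows. This is not the procedure used by the authors; it is a minimal illustration that assumes each run has been reduced to a map from a molecule identifier to its per-atom SASA values in atom order. The class `RunComparison`, its methods, the tolerance and the values for the molecule labelled "plain" are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for comparing two runs; not part of SMARTCyp or the KNIME node.
final class RunComparison {

    /** True when both arrays have the same length and agree element-wise within the tolerance. */
    static boolean sameValues(double[] first, double[] second, double tolerance) {
        if (first.length != second.length) {
            return false;
        }
        for (int i = 0; i < first.length; i++) {
            if (Math.abs(first[i] - second[i]) > tolerance) {
                return false;
            }
        }
        return true;
    }

    /**
     * Fraction (0 to 1) of molecules in the first run whose per-atom values
     * differ from, or are missing in, the second run.
     */
    static double fractionDiffering(Map<String, double[]> firstRun,
                                    Map<String, double[]> secondRun,
                                    double tolerance) {
        if (firstRun.isEmpty()) {
            return 0.0;
        }
        int differing = 0;
        for (Map.Entry<String, double[]> entry : firstRun.entrySet()) {
            double[] other = secondRun.get(entry.getKey());
            if (other == null || !sameValues(entry.getValue(), other, tolerance)) {
                differing++;
            }
        }
        return (double) differing / firstRun.size();
    }

    public static void main(String[] args) {
        Map<String, double[]> runA = new HashMap<>();
        Map<String, double[]> runB = new HashMap<>();
        // Illustrative cage molecule: the atom carrying the odd value moves between runs.
        runA.put("cage", new double[] {9.122, 6.976, 6.976, 6.976});
        runB.put("cage", new double[] {6.976, 9.122, 6.976, 6.976});
        // Made-up molecule that is identical in both runs.
        runA.put("plain", new double[] {12.5, 3.0});
        runB.put("plain", new double[] {12.5, 3.0});
        System.out.println("Fraction differing: " + fractionDiffering(runA, runB, 1e-6));
    }
}
```

Comparing atom by atom is deliberate here: a comparison that ignored atom order would hide exactly the run-to-run movement of the deviating atom.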

Example issue:


Here, one of the four symmetry-related nitrogen atoms shows a SASA of 9.122 where the others show 6.976. The specific nitrogen atom with the 9.122 value may differ between runs of SMARTCyp. All of the structures that do not match contain a cage motif, and all have the same Energy but a different SASA (as the atom associated with the differing value changes). As the SASA value is used in the score calculation, the score also changes.
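One way to spot affected structures is to check that atoms known to be symmetry-equivalent report the same SASA. The sketch below is an assumption-laden illustration rather than anything SMARTCyp or the node does: it takes the SASA values of one set of equivalent atoms and reports the indices that deviate from the median by more than a tolerance. The class `SymmetricSasaCheck`, the method `deviatingAtoms`, the tolerance, and the position of the 9.122 value in the array are hypothetical; only the two SASA values come from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical consistency check; not part of SMARTCyp or the KNIME node.
final class SymmetricSasaCheck {

    /**
     * Returns the indices of values that differ from the median of the
     * group by more than the tolerance. An empty list means the
     * symmetry-equivalent atoms agree.
     */
    static List<Integer> deviatingAtoms(double[] sasa, double tolerance) {
        List<Integer> deviating = new ArrayList<>();
        if (sasa.length == 0) {
            return deviating;
        }
        // Sort a copy so the caller's atom order is preserved.
        double[] sorted = sasa.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        double median = (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
        for (int i = 0; i < sasa.length; i++) {
            if (Math.abs(sasa[i] - median) > tolerance) {
                deviating.add(i);
            }
        }
        return deviating;
    }

    public static void main(String[] args) {
        // Four symmetry-related nitrogen atoms; which one carries 9.122 is arbitrary here.
        double[] nitrogenSasa = {6.976, 6.976, 9.122, 6.976};
        System.out.println("Deviating atom indices: " + deviatingAtoms(nitrogenSasa, 1e-6));
    }
}
```

The median is used as the reference because a single miscalculated atom does not shift it, whereas it would shift the mean.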

When running the KNIME node, the results have been deemed reproducible to the same level as those of SMARTCyp itself. A subset of the SMARTCyp test compounds has been used (structures with fewer than 20 bonds); all 3847 structures produce the same results in the node as in the application.


SMARTCyp and WhichCyp are provided under the GNU LESSER GENERAL PUBLIC LICENSE.

The KNIME nodes are provided under the GNU GENERAL PUBLIC LICENSE.



  1. L. Olsen, P. Rydberg, T. H. Rod, U. Ryde. Prediction of Activation Energies for Hydrogen Abstraction by Cytochrome P450. J. Med. Chem., 2006, 49, 6489-6499
  2. P. Rydberg, U. Ryde, L. Olsen. Prediction of Activation Energies for Aromatic Oxidation by Cytochrome P450. J. Phys. Chem. A, 2008, 112, 13058-13065
  3. P. Rydberg, P. Vasanthanathan, C. Oostenbrink, L. Olsen. Fast Prediction of Cytochrome P450 Mediated Drug Metabolism. ChemMedChem, 2009, 4, 2070-2079
  4. P. Rydberg, D. E. Gloriam, J. Zaretzki, C. Breneman, L. Olsen. SMARTCyp: A 2D Method for Prediction of Cytochrome P450-Mediated Drug Metabolism. ACS Med. Chem. Lett., 2010, 1, 96-100
  5. P. Rydberg, U. Ryde, L. Olsen. Sulfoxide, Sulfur, and Nitrogen Oxidation and Dealkylation by Cytochrome P450. J. Chem. Theory Comput., 2008, 4, 1369-1377
  6. P. Rydberg, D. E. Gloriam, L. Olsen. The SMARTCyp cytochrome P450 metabolism prediction server. Bioinformatics, 2010, 26, 2988-2989
  7. P. Rydberg, L. Olsen. Ligand-based Site of Metabolism Prediction for Cytochrome P450 2D6. ACS Med. Chem. Lett., 2012, 3, 69-73
  8. P. Rydberg, L. Olsen. Predicting Drug Metabolism by Cytochrome P450 2C9 - Comparison to the 2D6 and 3A4 Isoforms. ChemMedChem, 2012, 7, 1202-1209
  9. P. Rydberg et al. Nitrogen Inversion Barriers Affect the N-Oxidation of Tertiary Alkylamines by Cytochromes P450. Angew. Chem. Int. Ed., 2013, 52, 993-997
  10. P. Rydberg et al. The Contribution of Atom Accessibility to Site of Metabolism Models for Cytochromes P450. Mol. Pharmaceutics, 2013, 10, 1216-1223


  1. M. Rostkowski, O. Spjuth, P. Rydberg. WhichCyp: Prediction of Cytochromes P450 Inhibition. Bioinformatics, 2013, 29, 2051-2052