The 2nd KNIME Cheminformatics Workshop was hosted by Vernalis in Cambridge on September 29th 2014. Thanks to Steve R and James L for taking notes.
Items from last workshop & What's cooking in KNIME
- See Thorsten's presentation for an overview
Dave Morley raised the need for a "real" versioning mechanism (GitHub, Mercurial, SVN etc.), and most other users agreed.
- Various users reported having investigated using SVN or Mercurial in KNIME
- Thorsten explained that the KNIME explorer view does not show the Eclipse Project explorer "add-ons" – e.g. commit status
- Need an Import from … repository option under import workflows
Internal Webservice access
- Lilly and Vernalis have independently implemented generic frameworks to access their internal webservices
- Different companies have different architectures, so probably these will remain bespoke
- James L / Steve R may be able to liaise around shared issues
- Identified as important in April meeting
- Both Vernalis and Lilly are working on new MMP nodes, with a view to release. The functionality of the two sets of nodes is complementary (Vernalis – multi-cut, no data columns, split fragmentation and MMP generation)
- Lilly nodes currently in legal approval
- Vernalis nodes are in advanced development and nearly release-ready, but again require approval
- Steve R / James L should liaise to ensure overlap is avoided
- See Steve's presentation
- See Matched Pairs
- Timed loops (run for / to)
- Pause nodes (wait for / to)
- Interest in Vernalis Timed loop / delay nodes – we will seek permission to release ASAP
- Ertl Scaffold keys (in RDKit)
- Co-ordinate manipulations in RDKit (Rotate about axes, align to principal axes etc)
- Read/Write Variables nodes
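The timed-loop and pause-node ideas listed above (run for / to, wait for / to) can be illustrated outside KNIME with a minimal Python sketch. This is not the Vernalis implementation: `run_for`, `wait_to` and the injectable `clock`, `now` and `sleep` parameters are hypothetical names, chosen so that the timing logic can be exercised without real delays.

```python
import time
from datetime import datetime


def run_for(duration_s, body, clock=time.monotonic):
    """Call body(i) repeatedly until duration_s seconds have elapsed.

    Returns the number of completed iterations.
    """
    deadline = clock() + duration_s
    iterations = 0
    while clock() < deadline:
        body(iterations)
        iterations += 1
    return iterations


def wait_to(target, now=datetime.now, sleep=time.sleep):
    """Pause until the wall-clock time `target`; return the seconds waited."""
    remaining = (target - now()).total_seconds()
    if remaining <= 0:
        return 0.0  # target has already passed, so do not pause
    sleep(remaining)
    return remaining
```

In this sketch the deadline is only checked between iterations, so an iteration that is already running is never interrupted. A "run to" loop or a "wait for" pause would follow the same pattern with the deadline given as a time of day or a duration respectively.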
Sam W described some of the internal Lhasa Limited nodes, and interest was expressed in a number of them.
- Multi-column row filter node
- BitSet manipulation (see below)
- Many calculators
- and others...
- Lot of interest in this, either for small or macro-molecules
- Richard S has been using GLMol
- Richard S / James L / Dave M (& others?) to share experiences with various viewers
- Maybe easier with the new JS-based views and quickforms
- James L showed the multi-molecule sketcher from Lilly. Everyone else wants it too!
BitVector OR / AND / XOR / NOT
- Nodes needed – SDR to investigate which nodes were in development at Vernalis, as we already have 1 fingerprint node released – these are relatively easy pickings to add to our contribution!
- Lhasa have these nodes for BitSets, but BitVector column implementations would be useful
- Are there BitVector aggregators in the GroupBy node? SDR will look into adding aggregators for them too.
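For reference, the four operations discussed above, plus an OR-style aggregation of the kind a GroupBy aggregator might perform, can be sketched in plain Python by treating a fingerprint as an integer of fixed length. This is an illustration only and does not use the KNIME BitVector API; `from_indices`, `set_bits` and `bit_not` are hypothetical helper names.

```python
from functools import reduce
from operator import or_


def from_indices(indices, length):
    """Build a fingerprint of `length` bits with the given bits set."""
    value = 0
    for i in indices:
        if not 0 <= i < length:
            raise ValueError(f"bit {i} is outside 0..{length - 1}")
        value |= 1 << i
    return value


def set_bits(value):
    """Return the sorted indices of the set bits."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


def bit_not(value, length):
    """Complement within a fixed length (a bare ~ would give a negative int)."""
    return ~value & ((1 << length) - 1)


a = from_indices([0, 2, 5], length=8)
b = from_indices([2, 3], length=8)

print(set_bits(a & b))           # AND -> [2]
print(set_bits(a | b))           # OR  -> [0, 2, 3, 5]
print(set_bits(a ^ b))           # XOR -> [0, 3, 5]
print(set_bits(bit_not(a, 8)))   # NOT -> [1, 3, 4, 6, 7]

# Aggregating a group of fingerprints with OR, as a GroupBy-style reducer
group = [a, b, from_indices([7], length=8)]
print(set_bits(reduce(or_, group)))  # -> [0, 2, 3, 5, 7]
```

NOT is the only operation that needs the fingerprint length, which is one reason a column-level implementation has to decide how to handle inputs of differing lengths.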
HELM & Biosequences
- Vernalis looking at DNA / RNA / Protein Sequences based on corresponding BioJava sequence objects – still a work-in-progress
- James L suggested possible Lilly contacts
- Ultimately, depends on use-case
- Some discussion around a HELM or xHELM datatype
- Would need careful definition of what is required for the types
- Possible topic for KNIME partner meeting?
Node usage statistics
- Lilly have implemented extensive logging of node configure / execution events – might be able to share stats or implementation details?
- Easier node creation for developers
- A graphical interface for drag-and-drop dialog creation, which also handles settings models in the NodeDialog and NodeModel classes
- Alternatively, an Add SettingsModel... menu option, which deals with the creation / load / save / validate methods automatically to reduce typing?
- Wizard for column rearranger or full-blown execute methods
- Enhancement to current wizard, which asks for number of ports, number of views, port types, and provides appropriately ‘adjusted’ code in the node model / node factory classes
- Workflow preferences
- E.g. default settings for automatic chemistry type conversions
- Full-text search in node repository
- Integration of GLMol in WebPortal
- More than one report per workflow
- Reports distributable in meta/subnodes
- Easier molecule input quickform (currently needs at least two nodes)
- "Linked" workflows in local workspace that can be synchronized with server
- Select all nodes between two selected nodes in workflow editor
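The last request, selecting all nodes between two selected nodes, can be read as a graph problem: if the workflow is treated as a directed graph, the nodes "between" A and B are those that are both downstream of A and upstream of B. The Python sketch below works under that assumption; it is not based on the KNIME workflow editor code, and `nodes_between`, `reachable` and the node names in the usage example are hypothetical.

```python
from collections import defaultdict


def reachable(start, adjacency):
    """Return every node that can be reached from `start` via `adjacency`."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def nodes_between(edges, source, target):
    """Nodes lying on any path from `source` to `target`, endpoints included.

    `edges` is an iterable of (upstream, downstream) connections.
    Returns an empty set if `target` is not downstream of `source`.
    """
    forward = defaultdict(set)
    backward = defaultdict(set)
    for up, down in edges:
        forward[up].add(down)
        backward[down].add(up)
    downstream = reachable(source, forward) | {source}
    upstream = reachable(target, backward) | {target}
    return downstream & upstream


edges = [
    ("reader", "filter"),
    ("filter", "calc"),
    ("reader", "calc"),
    ("calc", "writer"),
    ("filter", "viewer"),
]
selection = nodes_between(edges, "reader", "calc")
```

Here `selection` contains `reader`, `filter` and `calc`, while side branches such as `viewer` are left out. A real editor feature would also need to decide what to do when the two nodes are selected in the "wrong" order, for example by trying both directions.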