KNIME Meets KNIME – Will They Blend?

Mon, 10/08/2018 - 10:00 admin

In this blog series we’ll be experimenting with the most interesting blends of data and tools. Whether it’s mixing traditional sources with modern data lakes, open-source devops on the cloud with protected internal legacy tools, SQL with noSQL, web-wisdom-of-the-crowd with in-house handwritten notes, or IoT sensor data with idle chatting, we’re curious to find out: will they blend? Want to find out what happens when IBM Watson meets Google News, Hadoop Hive meets Excel, R meets Python, or MS Word meets MongoDB?

Follow us here and send us your ideas for the next data blending challenge you’d like to see at willtheyblend@knime.com.

Today: KNIME Meets KNIME - Will They Blend?

Author: Phil Winters

The Challenge

Imagine you have been happily getting and using a new version of KNIME Analytics Platform with all of its additional features and functionality twice a year for many, many years.

But one day, you are required by your organization to pull out something from your distant KNIME past – something that at the time was very, very important and that needs to work EXACTLY the same way today.

But you’ve heard all the horror stories about other data science tools and platforms that change so fundamentally between versions (sometimes yearly) that a time-consuming migration, or even a rewrite, is required to get old code working. And of course, these vendors offer no guarantee of backward compatibility, or that the results will be the same even once you do get it working again. But what about KNIME? Will the old blend easily with the new? That is our challenge today!

Topic. Backward compatibility of KNIME Workflows

Challenge. Reuse the oldest available KNIME workflow in the current version of KNIME Analytics Platform

The Experiment

So how do you test backward compatibility? The KNIME commitment to backward compatibility is very clear: workflows from old versions, all the way back to 2006, are fully integrated into the KNIME integration testing and QA process. But how do we TEST that ourselves? By finding an extremely old KNIME workflow that WE can look at!

After we talked with the KNIME founders, the oldest exported workflow was surfaced by Bernd Wiswedel, CTO at KNIME and arguably the person who wrote the first line of KNIME code (and who definitely still understands it).

The workflow comes from version 1 of KNIME (1.2.0 from Feb 2007 to be exact). It performs classic data access, transformation, and machine learning tasks. A screenshot of the workflow, successfully executed in that early version, looks like this:

Figure 1: Workflow from Version 1.2.0 of KNIME, in February 2007

To run our test, the current KNIME Analytics Platform (3.6.0) was started and the 1.2.0 workflow was imported:

Figure 2: Importing Version 1.2.0 Workflow.

The Results

Below you can see the old workflow executed in the “new” KNIME Analytics Platform.

Figure 3: Running a 1.2.0 Workflow in KNIME 3.6.0.

The workflow completed successfully with no errors. No changes were required.

More importantly, however, note that many of the nodes are marked as “deprecated”. At some point between 1.2.0 and 3.6.0, these nodes were enhanced in ways that make the current version differ from the original. Instead of forcing you to change the workflow, KNIME automatically uses the old version of the node (yes, even back to 1.2.0) and simply indicates that it is no longer the current version. In this way, the workflow not only continues to work but also produces exactly the same results as before.
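To make the idea concrete, here is a toy model in Python of why keeping old node implementations around preserves old results. It is only an illustration of the principle, not KNIME's actual internals: the registry, the identifiers `normalizer.v1` and `normalizer.v2`, and both normalizer functions are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class NodeImpl:
    node_id: str  # hypothetical identifier a saved workflow would refer to
    run: Callable[[List[float]], List[float]]
    deprecated: bool = False


# Hypothetical registry: old implementations are never removed, only flagged.
REGISTRY: Dict[str, NodeImpl] = {}


def register(impl: NodeImpl) -> None:
    REGISTRY[impl.node_id] = impl


# Original behavior (invented for this sketch): divide by the maximum.
register(NodeImpl("normalizer.v1",
                  lambda xs: [x / max(xs) for x in xs],
                  deprecated=True))

# Enhanced behavior (also invented): min-max scaling under a NEW identifier.
register(NodeImpl("normalizer.v2",
                  lambda xs: [(x - min(xs)) / (max(xs) - min(xs)) for x in xs]))


def load_workflow(node_ids: List[str]) -> List[NodeImpl]:
    """Resolve each saved identifier to the implementation it was built with."""
    return [REGISTRY[node_id] for node_id in node_ids]


# An old workflow still references the old identifier, so it gets the old code.
old_workflow = load_workflow(["normalizer.v1"])
for node in old_workflow:
    label = " (deprecated)" if node.deprecated else ""
    print(node.node_id + label, node.run([1.0, 2.0, 4.0]))
```

Because the enhanced node lives under a new identifier, the old workflow never silently picks up the new behavior. It simply shows up flagged as deprecated.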

This means you have the choice of either reusing the old workflow (with its deprecated nodes) or enhancing it to take advantage of newer nodes and functionality. The choice is yours. No migration, no upgrading, and the exact same results.
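If you want to check the “exact same results” claim for one of your own workflows, one simple approach is to export the output table from both runs and compare them row by row. The Python sketch below assumes both outputs have been written to CSV; the helper names `read_rows` and `first_difference` are our own, and the demo uses small in-memory tables rather than real workflow output.

```python
import csv
from pathlib import Path


def read_rows(path):
    """Read a CSV export into a list of rows (each row is a list of strings)."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


def first_difference(reference, candidate):
    """Return (row_index, reference_row, candidate_row) for the first mismatch,
    or None if both tables are identical. A missing row is reported as None."""
    for i in range(max(len(reference), len(candidate))):
        ref_row = reference[i] if i < len(reference) else None
        cand_row = candidate[i] if i < len(candidate) else None
        if ref_row != cand_row:
            return (i, ref_row, cand_row)
    return None


# In-memory stand-ins for two exports; with real files you would call
# read_rows() on each export path instead.
reference = [["id", "score"], ["1", "0.25"], ["2", "0.50"]]
candidate = [["id", "score"], ["1", "0.25"], ["2", "0.50"]]
print(first_difference(reference, candidate))
```

Note that this is a deliberately strict, string-level comparison: even a change in number formatting between the two exports would be reported as a difference.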

One thing we could also have done is save the workflow in an executed state, including all data (and intermediate data). For absolutely guaranteeing data and model lineage as well as traceability and auditing, there is no better way.

KNIME from the past has been successfully blended with KNIME from the present…and of course for the future, because backward compatibility of all KNIME maintained nodes is guaranteed. That is not only a great feeling to have personally but also important for an organization: it eliminates the costs and risks involved in a long-term commitment to KNIME Analytics Platform as it continues to evolve.

Coming Next …

If you enjoyed this, please share this generously and let us know your ideas for future blends.