Data Privacy and KNIME

Mon, 10/08/2018 - 10:00 Peter

Author: Peter Ohl

We are often asked two things about data privacy in KNIME:

  • How are data handled in the open source KNIME Analytics Platform?
  • Who has access to the data that are processed?

Before diving into the details, let’s first put this into context:

KNIME Analytics Platform is open source and has a huge community contributing to its functionality by developing many Community Extensions.

At KNIME we check that all Community Extensions function properly, and we put the Trusted Community Extensions through even more stringent checks. However, we do not inspect how Community Extensions handle data. The same is true for KNIME Integrations, which are extensions that integrate other community code into KNIME Analytics Platform. Here too, we carefully test the functionality of these extensions, but we do not check how these packages handle data.

Please note that the following statements relate only to KNIME Analytics Platform and KNIME Extensions, i.e. code that has been written by us and carefully reviewed by a KNIME Team member.

Question: Does KNIME, the company, have access to the data I am processing with KNIME Analytics Platform? / Is there a chance that data could leak from KNIME Software to anybody outside my company / intranet?

The data that you process locally on your machine using KNIME Analytics Platform or KNIME Extensions, together with all intermediate results, are stored locally on that same machine (as are the nodes and/or parts of the workflow in the event that a remote executor is involved). Data are only stored in a different location if you personally decide to do so, for instance by using a File or Database Writer node to write them elsewhere.

KNIME Software does not send any of those data over the internet to anybody unless you explicitly configure it to do so. KNIME (the company and its employees) does not have any access to the data you process and cannot track the data.

Follow-up Question: KNIME Analytics Platform is an open source project - how can you be sure?

In the past, many open source projects have been associated with anarchic contributions from anyone and software quality that could not be controlled. For KNIME this is not the case. All lines of code that are checked into the source base of KNIME Analytics Platform and KNIME Extensions are carefully reviewed by a member of the KNIME Team. If you would like to check this, you can inspect all of the code in the public repository including the history of changes. So even if we did attempt to sneak in “evil code”, someone in the community could and would notice.

Question: Is the data I process kept secret, is it protected, is access restricted?

The answers above should help with these questions, too. Remember as well that the data you personally process with KNIME Software are under your control, and it is your sole responsibility to keep them confidential and to protect and restrict access to them. You can create KNIME workflows that send data elsewhere, e.g. to remote files, to external databases, or even as an email attachment sent from a workflow. However, you have to actively insert such a node into your workflow and configure it accordingly for that to happen. The data are entirely under the control of the workflow author.

Question: If I run KNIME in the cloud, will you have access to my data?

We offer KNIME Software through the marketplace of most cloud providers. We have no access whatsoever to your cloud instance or the data in it. If you use one of our pre-configured images, you receive a copy of it from the cloud provider. It is the responsibility of the cloud provider (and your own - see above) to restrict access to your cloud instance and to the data stored there.

And since this comes up often as well:

Related Question: Can KNIME claim any ownership to the workflow and / or the results / reports I create with KNIME Analytics Platform?

In short: No! KNIME, the company, does not own anything you create through the use of KNIME Software. The workflows, the predictive models, the data, and any discoveries that result from the use of the software are your intellectual property.