What's New in KNIME Analytics Platform 4.2 and KNIME Server 4.11.0
For the KNIME User
- Amazon DynamoDB Extension (Labs)
- SAP Reader (Theobald)
- New File Handling Framework (Labs)
For Enterprise Data Science Challenges
- Executor Reservation
- Auto Scaling on AWS (BYOL & PAYG) with KNIME Server Large
- Hybrid Deployments
- KNIME Server Large BYOL on AWS
- Improved Monitoring
KNIME Hub: Spaces
On KNIME Hub you can now create multiple Spaces, name them as you like, and define whether they should be public or private.
Here are tschmidt’s Spaces. He's created public Spaces for different projects that he’s working on with other people: BIRT reporting, Clustering Analysis, and a Churn Prediction Analysis project. He's also created a private Space, My Private Analysis, for an individual project, which is indicated by the lock icon.
Note: you need to sign in to see your Spaces.
You can now also show your appreciation for content provided by other people. In addition to adding comments, you can now like workflows, nodes, components, extensions, and Spaces. This can also all be done from within KNIME Analytics Platform.
KNIME Hub: User Profile Pages & Community
Introducing the new user profile page: A central spot to find your own Spaces and extensions as well as those of others. Your user profile page has three sections:
- Overview: your own statistics
- Extensions: if you’ve developed extensions, these will appear here
- Spaces: the various Spaces you've created
Here is tschmidt’s profile page. He's published 21 workflows, 6 components, and has received 5 likes!
The Extensions section shows any extensions that you have uploaded. Here, for example, is Redfield’s page. Redfield is a KNIME Partner, who has developed different KNIME Extensions.
New Connector Nodes and File Handling Capabilities
KNIME Labs now provides a set of nodes to connect to Salesforce - the Salesforce Integration - to read data from one of the most popular CRM systems. Authentication can be done via OAuth2 interactive web authentication or by providing username and password. Using the Salesforce Object Query Language (SOQL), the user can construct simple but powerful query strings to define the data to read.
Install Integration from KNIME Hub
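To give a feel for what such query strings look like, here is a minimal sketch that assembles a SOQL SELECT statement in Python. The object and field names (Account, Name, AnnualRevenue) are purely illustrative, and the helper function is hypothetical, not part of the Salesforce Integration itself.

```python
# Hypothetical helper that assembles a simple SOQL query string.
# Object and field names below are illustrative examples only.

def build_soql(obj, fields, where=None, limit=None):
    """Assemble a simple SOQL SELECT statement."""
    query = f"SELECT {', '.join(fields)} FROM {obj}"
    if where:
        query += f" WHERE {where}"
    if limit:
        query += f" LIMIT {limit}"
    return query

soql = build_soql("Account", ["Name", "AnnualRevenue"],
                  where="AnnualRevenue > 1000000", limit=10)
print(soql)
# SELECT Name, AnnualRevenue FROM Account WHERE AnnualRevenue > 1000000 LIMIT 10
```

A string like this is what you would paste into the query field of the Salesforce nodes to define the data to read.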
Amazon DynamoDB Extension (Labs)
SAP Reader (Theobald)
We’ve partnered with Theobald Software, one of the world’s leading experts in SAP integration, to develop a KNIME node that allows you to extract data from various SAP systems. The new SAP Reader (Theobald) node connects to the commercial Theobald Xtract Universal Server to extract data from SAP S/4HANA, SAP BW, SAP ERP/ECC, SAP R/3, mySAP, etc. into KNIME. Note that the SAP Reader can only be used with a Theobald Xtract Universal Server license that includes the generic web service destinations.
Example workflow on KNIME Hub
New File Handling Framework (Labs)
Here is the first set of nodes in the new File Handling Framework. This will make it much easier to work with files within KNIME, whether local, on KNIME Server, on an on-premises shared drive, or in cloud storage.
We’ve also revised many of the existing reader and writer nodes to be compatible with the different file systems. We’ve improved their usability and added new features such as a file preview or reading multiple files in directories.
When sharing workflows you no longer need to write KNIME URLs when referencing files and folders in a KNIME Server repository, workflow repository, or the workflow itself. Instead, select the starting point and browse for the files you want. KNIME workflows now provide a dedicated data area, which will make it easier to share data with your workflow.
The new nodes are in the KNIME Labs -> File Handling (Labs) category. More nodes and file systems will come soon. So give it a try and provide us with lots of feedback!
TensorFlow 2 Integration
The new TensorFlow 2 integration enables the use of the TensorFlow 2 Python package with a few dedicated nodes and the DL Python nodes.
The TensorFlow 2 Network Reader node can read networks from the SavedModel format and the H5 format from an arbitrary source. TensorFlow 2 networks can be easily assembled from TensorFlow Keras layers or carefully designed to perform complex functions with custom layer implementations and powerful TensorFlow-Hub layers (e.g. BERT). After fitting the network to the training data with the DL Python Network Learner, the TensorFlow 2 Network Executor can be used to run the prediction for unseen data. The trained network can be saved with the TensorFlow 2 Network Writer node to be used in other workflows or applications.
Find out more and download the nodes from the KNIME Hub. And view these example workflows to see how they are used.
Performance improvements mean that executing Python nodes is now much snappier. This is particularly relevant when multiple Python nodes follow each other, run inside a loop, or when individual node executions don’t involve lots of data. Startup time has been noticeably reduced when executing any Python scripting nodes, the deep learning Python scripting nodes, Keras nodes, and the MDF Reader node.
This has been realized by two key changes: (1) KNIME now initializes and maintains a pool of Python processes in the background for use by individual Python nodes; and (2) for the many users employing the popular Anaconda or Miniconda Python distributions, KNIME now invokes Python directly without the costly delay of first running a configuration script.
Simple File Reader
The Simple File Reader is a viable alternative to the File Reader. It has a slightly smaller set of options but performs better: the node is on average three times faster and, for certain inputs, can achieve tenfold speedups over the more generic File Reader node.
Example workflow on KNIME Hub
A new Joiner node has been released in Labs. Benchmarks have shown speedups for in-memory joins of around 2x-6x compared to the previous implementation. It also provides new functionality:
- Split output: The joiner can be configured to output matched and unmatched rows (left and right) in separate output ports, which makes checking both data quality and joiner configuration easier.
- Merge join columns: When using outer joins, the joiner can be configured to merge join column pairs into single columns, resulting in more concise output. This eliminates the need for downstream column merge nodes and also avoids a common pitfall with the default configuration of the previous Joiner node when doing outer joins.
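The two options above can be illustrated with a tiny pure-Python full outer join. The data and column names are made up for the example; the point is how matched and unmatched rows separate cleanly, and how the left and right join columns collapse into a single column.

```python
# Pure-Python sketch of the new Joiner options: split output (matched vs.
# unmatched rows in separate outputs) and merged join columns in an outer
# join. Example data is hypothetical.

left = {1: "Alice", 2: "Bob"}        # customer_id -> name
right = {2: "Berlin", 3: "Munich"}   # customer_id -> city

matched = [{"id": k, "name": left[k], "city": right[k]}
           for k in sorted(left.keys() & right.keys())]
left_unmatched = [{"id": k, "name": left[k]}
                  for k in sorted(left.keys() - right.keys())]
right_unmatched = [{"id": k, "city": right[k]}
                   for k in sorted(right.keys() - left.keys())]

# "Merge join columns": the full outer join has one "id" column instead of
# separate "id" and "id (right)" columns that would need merging downstream.
outer = matched + left_unmatched + right_unmatched
print(sorted(row["id"] for row in outer))  # [1, 2, 3]
```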
In this release, we’ve added new nodes and worked on improving the user experience with a number of existing nodes, for example:
Table Difference Finder
This new node allows you to compare the structure and data of two KNIME data tables and to output their differences.
Example workflow on KNIME Hub
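The kind of comparison involved can be sketched in a few lines of plain Python. The helper below is hypothetical and much simpler than the node's actual output; it only shows the two kinds of differences being reported, structural and value-level.

```python
# Minimal sketch of table comparison: report structural differences
# (columns present in only one table) and value differences. Illustrative
# only - the Table Difference Finder node reports far more detail.

def diff_tables(a, b):
    """Compare two tables given as {column: [values]} dicts."""
    diffs = []
    cols_a, cols_b = set(a), set(b)
    for col in sorted(cols_a - cols_b):
        diffs.append(f"column '{col}' only in first table")
    for col in sorted(cols_b - cols_a):
        diffs.append(f"column '{col}' only in second table")
    for col in sorted(cols_a & cols_b):
        if a[col] != b[col]:
            diffs.append(f"values differ in column '{col}'")
    return diffs

t1 = {"id": [1, 2], "name": ["a", "b"]}
t2 = {"id": [1, 2], "name": ["a", "x"], "extra": [0, 0]}
print(diff_tables(t1, t2))
```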
String Manipulation (Multiple Column)
Similar to the Math Formula (Multi Column) node, the new String Manipulation (Multiple Column) node allows you to perform a string manipulation on multiple columns at once.
Example workflow on KNIME Hub
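Conceptually, the node applies one expression across a set of selected columns, which previously required a separate node (or a loop) per column. A rough pure-Python sketch of that idea, with made-up data:

```python
# Sketch of applying one string manipulation to several columns at once.
# Table layout and data are hypothetical.

def manipulate_columns(table, columns, fn):
    """Apply fn to every value of the selected columns of a {col: [values]} table."""
    return {col: [fn(v) for v in values] if col in columns else values
            for col, values in table.items()}

table = {"first": ["ada ", " alan"], "last": ["lovelace", "turing "], "id": [1, 2]}
cleaned = manipulate_columns(table, {"first", "last"},
                             lambda s: s.strip().title())
print(cleaned["first"])  # ['Ada', 'Alan']
```

Non-selected columns (here, `id`) pass through untouched, just as in the node.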
This Tableau integration builds on the new Tableau Hyper API and ensures better compatibility and effortless installation.
Install integration on KNIME Hub
Building on the integration released in December 2019, the dialog of the Power BI Extension now shows only compliant datasets, i.e., push datasets, and displays a notice if the workspace contains non-compliant datasets.
Install extension on KNIME Hub
All R nodes now allow you to set an R environment different to the default environment via the “Advanced” tab.
We’ve implemented various user-requested improvements to our Database nodes. These include bulk copy support for Microsoft SQL Server when using the DB Loader node, support for boolean columns in the DB Row Filter node, and the ability to open the dialog of most database nodes without a valid database connection or when working with the Remote Workflow Editor.
The new Integrated Deployment Extension allows you to capture the parts of your workflow needed for running in a production environment. These captured subsets hold all the relevant settings and transformations. They can be combined with one another, executed, saved as local workflows, and deployed to a KNIME Server. Integrated Deployment enables you to:
- Automate your workflow design and deployment
- Reduce risks of error associated with moving from model creation to deploying production processes
- Update your production workflows automatically
Download this example workflow, which features the new nodes for Integrated Deployment from the KNIME Hub.
Find out more on the Integrated Deployment web page, or read the blog articles:
- How to move data science into production.
- Integrated Deployment Blog Series: Episode 1: An Introduction to Integrated Deployment.
- Integrated Deployment Blog Series: Episode 2: Continuous Deployment
Elastic and Hybrid Execution
KNIME Executors are the components that actually run workflows, for example when a user starts a workflow on KNIME Server via the REST API, a schedule, or the WebPortal. With the release of KNIME Server 4.11, the capabilities of KNIME Executors have been extended. The new features include Executor Groups and Executor Reservation, Auto Scaling on AWS, and hybrid installations. This allows us to meet the requirements of corporations looking to serve large numbers of diverse users.
Executor Groups are a new feature that enhances KNIME Executors, enabling users to create dedicated groups of executors for different departments or business units. Workflows get automatically routed to the correct group, which has the appropriate level of resources for that specific business unit. Each group can contain one or more executors.
Executor Groups simplify the scaling process. It's now possible to have your entire organization on a single KNIME Server installation, with a distinct Executor Group for each organizational entity. See the graphic below showing Executor Groups for the Marketing, Finance, and Engineering divisions, for example.
With Executor Reservation, you can ensure that jobs with special requirements are sent to a suitable executor. This goes beyond the already existing workflow pinning, since executors can now refuse to accept jobs unless certain requirements are met. This way, you can ensure that special resources are not blocked by workflows that don’t actually need them. For example there is no need to block a powerful GPU executor with basic ETL jobs.
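The reservation logic can be pictured as a simple matching rule: a reserved executor only accepts a job whose requirements include every resource the executor is reserved for. The sketch below is purely conceptual; the names and the matching rule are illustrative, not KNIME Server's actual routing implementation.

```python
# Conceptual sketch of reservation-style routing: a reserved executor
# refuses jobs that don't actually need its special resources.
# Not KNIME Server's real matching logic - illustration only.

def accepts(executor_reserved, job_requirements):
    """Accept a job only if it demands every resource this executor reserves."""
    return executor_reserved <= set(job_requirements)

gpu_executor = {"gpu"}
print(accepts(gpu_executor, {"gpu", "deep-learning"}))  # True: GPU job accepted
print(accepts(gpu_executor, set()))  # False: basic ETL job refused, GPU stays free
```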
Auto Scaling on AWS (BYOL & PAYG) with KNIME Server Large
KNIME Executors have now been added to the AWS Marketplace as bring-your-own-license (BYOL) and pay-as-you-go (PAYG) allowing you to automatically and dynamically start up new Executors. Using AWS Auto Scaling, you can detect if new KNIME Executors need to be started, e.g. in the case of heavy system load (scale out). Conversely, Executors are shut down again once load decreases (scale in). In the case of PAYG, dynamic scaling makes it possible to save money on both instance and license costs.
The new marketplace offerings also introduce the possibility of running a hybrid on-prem/cloud setup. You can install KNIME Server on premise and start up Executors in the cloud (on-demand PAYG) as needed. This allows you to start special-purpose Executors (e.g. GPU) without having to maintain specialized hardware year round.
KNIME Server Large BYOL on AWS
KNIME Server Large has been added to the AWS Marketplace as a BYOL service. This can be used as a base image to which on-demand pay-as-you-go executors can be added for elastic scaling.
We have improved logging to better monitor levels of activity among KNIME Executors and Executor Groups.
To support governance, compliance, and reporting requirements, the Workflow Summary feature extracts and exports all the details about your workflows from KNIME - from execution information, workflow metadata, and node settings through models, all the way to details about the data sources and fields used and created. A workflow summary can be extracted via the pulldown menu or, using KNIME Server, extraction can be automated to generate regular reports. By integrating the KNIME WebPortal, report creation can even be interactive.
A workflow summary can be exported as a single JSON or XML document. Highlights of the information extracted include:
- Runtime environment (installed plugins, system properties)
- Workflow metadata
- Workflow annotations
- List of all contained nodes, with properties such as:
  - Name, type, state, …
  - List of node outputs and their spec
  - List of node successors (by node id)
  - A summary of the output data (if desired and available) - but not the data itself

The summary can be generated either:
- via the File menu entry, or
- via a KNIME Server REST endpoint that generates the summary for jobs
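Because the export is plain JSON (or XML), downstream tooling can process it with standard libraries. Below is a hedged sketch: the JSON structure shown is a simplified, hypothetical subset of a real workflow summary, used only to illustrate programmatic access.

```python
# Sketch of reading a workflow summary export with the standard library.
# The JSON below is a simplified, hypothetical subset of a real summary,
# not the exact schema produced by KNIME.
import json

summary_json = """
{
  "workflow": {
    "name": "Churn Prediction",
    "nodes": [
      {"name": "CSV Reader", "state": "EXECUTED"},
      {"name": "Joiner", "state": "EXECUTED"}
    ]
  }
}
"""

summary = json.loads(summary_json)
names = [n["name"] for n in summary["workflow"]["nodes"]]
print(names)  # ['CSV Reader', 'Joiner']
```

A compliance report, for instance, could iterate over the node list to verify which data sources a production workflow touches.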
Examples on the KNIME Hub include a REST API automation example as well as a reporting and an interactive example, which use a component to extract information from the summary. These can be modified and extended to suit the needs of your organization. Download example workflows from the KNIME Hub.
The KNIME WebPortal is KNIME’s main end user application for Guided Analytics and has now been reimplemented and redesigned with a focus on user interface and user experience. The large majority of Widget nodes have been rewritten to give them a fresh look and feel. Colors, fonts, and logos can easily be adjusted with the new theming capabilities, which allows for consistent branding and styling of the new WebPortal and widgets. Note that existing workflows are backwards compatible with the new WebPortal and widgets. Also: the previous WebPortal (v4.10) is still available, and you can start using the new WebPortal alongside it!
Learn more
In December 2019, we first introduced the ability to use Single Sign-On (SSO) by authenticating users with an Identity Provider with OAuth/OpenID Connect capability. KNIME Server 4.11 now adds a great number of features to the available capabilities. For example:
- Automatic OIDC endpoint discovery
- Claim-to-group mapping (a claim from the userinfo endpoint can now be configured to be used as group information)
- Landing page for OIDC login
In previous releases, the application server component of KNIME Server was based on Apache TomEE - the Java Enterprise Edition of Apache Tomcat. With this release, TomEE is replaced by the standard Apache Tomcat. In line with the switch to the standard Apache Tomcat, the use of the old EJB mountpoints to connect to KNIME Server is now discouraged in favor of the newer REST implementation. We've made migrating to a REST mountpoint very simple. Just log in with an existing EJB mountpoint and there'll be a prompt from which you can switch to REST with a single click. REST provides many benefits in comparison to EJB - particularly advantages in terms of performance and stability.
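One benefit of the REST interface is that it can be used from any HTTP client, not just KNIME Analytics Platform. The sketch below builds an authenticated request with only the Python standard library; the server URL and credentials are placeholders, and the repository listing path is an assumption based on KNIME Server's v4 REST API.

```python
# Sketch of addressing a REST mountpoint with the standard library only.
# Server URL and credentials are placeholders; the repository endpoint
# path is assumed from KNIME Server's v4 REST API.
import base64
import urllib.request

server = "https://example.com/knime"  # placeholder server URL
token = base64.b64encode(b"user:password").decode("ascii")  # placeholder creds

req = urllib.request.Request(
    server + "/rest/v4/repository/",  # assumed repository listing endpoint
    headers={"Authorization": "Basic " + token},
)
print(req.full_url)
# Actually sending the request would return the repository tree as JSON:
# urllib.request.urlopen(req).read()
```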
Switch to Qpid
Previous releases of KNIME Server used RMI to establish a connection between an application server and executor. RMI has now been replaced by an embedded message queue based on Apache Qpid. Rather than having direct communication between application server and executor, events such as requests for job execution are passed on via the message queue. Qpid technology comes bundled with the KNIME Server installer, so no additional setup is necessary.
Community and Partner Extensions
Here are the new Community and Partner Extensions available in KNIME Analytics Platform 4.2:
Conformal Prediction Extension by Redfield
Cumulocity Connector extension by tarent
The Cumulocity Connector extension developed by our partner tarent allows you to connect to the Cumulocity IoT platform by Software AG from KNIME Analytics Platform. To learn more about this extension take a look at the blog article: Cumulocity, IoT, and KNIME Analytics Platform.
Install extension from KNIME Hub
General Release Notes
Changes to Java Runtime and KNIME Analytics Platform Upgrade Procedure
KNIME Analytics Platform 4.2 is distributed with an updated version of Java (18.104.22.168-b09) and Eclipse (2020-03 / 4.15). Due to these changes, updates from previous versions of KNIME Analytics Platform via the KNIME update manager are not supported. A full new installation is required.
Changed or Removed Functionality
- Windows 32bit support discontinued due to the Eclipse update (see above).
- Chromium binaries are no longer pre-installed on macOS due to Apple’s notarization process. They can be installed separately using “File” -> “Install KNIME Extensions”.
- The Spotfire (Labs) extension was removed. The functionality is now available through the partner extension provided by Tibco, which can be found under “KNIME Partner Extensions”.
KNIME Analytics Platform 4.2 no longer uses “Buddy Classloading” to find nodes and types in third-party extensions. Nodes and types need to be properly registered through the respective extension point; for an exemplary fix of this issue, see this code change. If they are not registered, (old) workflows containing those nodes will likely load them as a “missing node placeholder”.
The node extension API itself has remained stable. However, changes due to the Eclipse update may trickle down into third-party extensions (e.g. dependencies on Apache Batik or similar). In most cases, a simple fix of the dependency declaration solves the problem. If the problems are more involved, please post them in the developer forum.