What's New in KNIME Analytics Platform 3.5, KNIME Server and KNIME Big Data Extensions

This year's winter release, on December 6, 2017, is a major KNIME® Software update. Here we highlight some of the major changes, new features, and usability improvements in both the open source KNIME Analytics Platform and our commercial products.

You can upgrade an existing KNIME Analytics Platform 3.x installation by choosing the Update option in the File menu, or you can install a fresh copy from the download page.

KNIME Big Data Extensions

KNIME Analytics Platform

KNIME Server

Additional notes about the 3.5 release

See the full list of changes in the changelog.

KNIME Big Data Extensions

The biggest news here is that the KNIME® Big Data Extensions are now part of the free and open source KNIME Analytics Platform. This collection of nodes provides access to the power of Apache Hadoop and Apache Spark from within our familiar and easy-to-use analytics environment.

The Create Spark Context Node

The Create Spark Context node now allows users to enable automatic Apache Spark context deletion when the workflow is closed.

The Spark Missing Value Node

The new Spark Missing Value node provides several data-type specific strategies to eliminate missing values. The missing value strategies are exposed as a PMML model that can be applied to other Spark DataFrames or standard KNIME data tables.
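
To illustrate the idea of data-type specific strategies that are learned once and then reused on other tables, here is a minimal Python sketch. It is not the node's implementation and does not use Spark or PMML: the function names, the choice of "mean for numbers, fixed value for strings", and the sample data are all assumptions made for the example.

```python
from statistics import mean


def fit_strategies(rows, numeric_cols, string_cols, string_default="unknown"):
    """Learn one replacement value per column (hypothetical helper).

    Numeric columns use the mean of the non-missing values;
    string columns use a fixed default. Both choices are assumptions.
    """
    replacements = {}
    for col in numeric_cols:
        present = [r[col] for r in rows if r[col] is not None]
        replacements[col] = mean(present) if present else 0.0
    for col in string_cols:
        replacements[col] = string_default
    return replacements


def apply_strategies(rows, replacements):
    """Return new rows with missing values (None) replaced."""
    return [
        {
            col: (replacements[col] if val is None and col in replacements else val)
            for col, val in row.items()
        }
        for row in rows
    ]


train = [
    {"age": 30, "city": "Berlin"},
    {"age": None, "city": None},
    {"age": 50, "city": "Zurich"},
]
model = fit_strategies(train, numeric_cols=["age"], string_cols=["city"])
filled = apply_strategies(train, model)
# The learned replacements can be reused on a different table.
other = apply_strategies([{"age": None, "city": "Austin"}], model)
```

The `model` dictionary plays the role that the exported PMML model plays in the node: it captures the replacement values so they can be applied to data that was not used to compute them.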

The Spark GroupBy Node

The new Spark GroupBy node uses the same user interface as the standard GroupBy and Database GroupBy nodes. In addition to the standard methods, the node provides access to some Apache Spark-specific aggregation methods.

KNIME Analytics Platform

JavaScript Views

We have added three new JavaScript views to KNIME:

  • Tag Cloud: provides an interactive tag cloud. Because the view uses SVG, the text is even selectable!
  • Data Explorer: displays summary statistics about the columns in a KNIME table and allows you to filter out uninteresting columns.
  • Table Editor: this version of the Table Viewer allows you to edit the values in the table. These changes are then available to other nodes in your workflow.

New Integrations

KNIME Deep Learning - Keras Integration

With the KNIME Deep Learning - Keras integration, we have added a first version of our new KNIME Deep Learning framework to KNIME Labs. KNIME Deep Learning is a general, backend-independent, deep learning plugin for KNIME Analytics Platform which allows users to read, create, edit, train, and execute deep neural networks within KNIME Analytics Platform. Through Keras, users have access to a variety of different state-of-the-art deep learning frameworks, such as TensorFlow, CNTK, and others.

The new nodes are:

  • The DL Network Executor node for executing deep neural networks.
  • The DL Keras Network Reader node to read in pre-defined, potentially trained, Keras networks.
  • The DL Keras Network Learner node for training or fine-tuning deep neural networks within KNIME via Keras.
  • A set of nodes for flexibly creating, editing, executing, and training deep neural networks with user-supplied Python scripts.

Google Sheets Nodes

With our new Google Sheets nodes you can easily connect to a Google account and read data from a Google Sheet, write out information to new Sheets, or modify existing Sheets. The nodes provide numerous options for reading or adding headers, substituting missing values, automatically opening your Google Sheet, and more. You can log in interactively from the node configuration (this redirects you to Google for confirmation) or, if you prefer, you can provide credential files.

Run R Model in Microsoft SQL Server

With the new Run R Model in Microsoft SQL Server node, you can use the power of SQL Server Machine Learning Services directly from KNIME Analytics Platform. This node allows you to execute R scripts right where your data is stored without needing to move data back and forth from server to client.

Improved H2O Integration

H2O is an open source machine learning and predictive analytics library with a strong focus on scalability and performance. A first collection of H2O nodes was included in last summer’s release of KNIME v3.4. This release contains a generally improved integration as well as a number of new nodes. Machine learning with H2O can now be done using k-Means, PCA, and Generalized Low Rank Models. A column filter for H2O frames is now also available.

Another new feature is the integration of H2O MOJO (Model Object, Optimized), which allows you to import an already trained H2O model into KNIME and use it for predictions. Several MOJO types are supported: Classification, Regression, Clustering, Dimension Reduction, Auto-encoding and Word Embedding.

Extensions that have “graduated” from KNIME Labs

KNIME Labs is a set of plugins that are experimental, still in active development, or that we think just need to mature a bit more before being fully production ready. We periodically review the plugins in KNIME Labs in order to identify those that we think are ready to “graduate” and become a standard part of KNIME Analytics Platform. Here’s the graduating class for this release:

  • Text Processing
  • REST Client Nodes
  • JavaScript View Nodes (note that there is still a set of new JavaScript views in KNIME Labs)
  • Python Integration, supporting Python 3

New Utility Nodes

The Math Formula (Multi Column) node

The new Math Formula (Multi Column) node allows you to apply a formula to multiple columns. Previously, when you wanted to use the same formula (e.g., a custom normalization) for several columns, you had to use several Math Formula nodes. Now you can do this with a single node.
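
As a rough analogy in Python, the sketch below applies one formula to several columns of a small table in a single pass. It does not reflect the node's expression syntax: the helper names, the min-max normalization formula, and the sample table are assumptions chosen for illustration.

```python
def apply_formula(rows, columns, formula):
    """Apply formula(value, column_values) to every selected column.

    Returns new rows; columns that are not selected are left untouched.
    """
    result = [dict(row) for row in rows]
    for col in columns:
        values = [row[col] for row in rows]
        for out_row, value in zip(result, values):
            out_row[col] = formula(value, values)
    return result


def min_max(value, column_values):
    """Custom normalization used as the example formula."""
    lo, hi = min(column_values), max(column_values)
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


table = [
    {"id": "a", "x": 2.0, "y": 10.0},
    {"id": "b", "x": 4.0, "y": 30.0},
    {"id": "c", "x": 6.0, "y": 20.0},
]
normalized = apply_formula(table, ["x", "y"], min_max)
# x becomes 0.0, 0.5, 1.0 and y becomes 0.0, 1.0, 0.5; "id" is unchanged.
```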

The OPTICS nodes

OPTICS is a clustering algorithm based on DBSCAN. Its implementation in KNIME Analytics Platform consists of two nodes that work in tandem: the OPTICS Cluster Compute node and the OPTICS Cluster Assigner node. The OPTICS Cluster Compute node creates an ordering of the data points based on two user-provided parameters and the OPTICS Cluster Assigner node uses this ordering to assign points to clusters and provide an interactive view of the clusters. The interactive view also allows you to adjust the threshold used to determine cluster membership.
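
To show how a threshold turns an ordering into clusters, here is a minimal Python sketch of the DBSCAN-style extraction step described in the original OPTICS paper. Whether the OPTICS Cluster Assigner node works exactly this way is an assumption; the function name, the input lists (reachability and core distances, already in cluster order), and the sample values are made up for the example.

```python
import math

NOISE = -1


def assign_clusters(reachability, core_distance, threshold):
    """Label points, given in OPTICS order, using a distance threshold.

    A point whose reachability exceeds the threshold either starts a new
    cluster (if it is a core point at that threshold) or is noise.
    Any other point joins the current cluster.
    """
    labels = []
    current = NOISE
    next_id = 0
    for reach, core in zip(reachability, core_distance):
        if reach > threshold:
            if core <= threshold:
                current = next_id
                next_id += 1
                labels.append(current)
            else:
                labels.append(NOISE)
        else:
            labels.append(current)
    return labels


# Hypothetical ordering: the first point has undefined (infinite) reachability.
reachability = [math.inf, 0.5, 0.6, 3.0, 0.4, 0.5, 5.0]
core_distance = [0.5, 0.5, 0.6, 0.4, 0.4, 0.5, 4.0]

tight = assign_clusters(reachability, core_distance, threshold=1.0)
loose = assign_clusters(reachability, core_distance, threshold=3.5)
# tight -> [0, 0, 0, 1, 1, 1, -1]; loose -> [0, 0, 0, 0, 0, 0, -1]
```

Raising the threshold merges the two groups into one, which is the kind of effect you can explore when adjusting the threshold in the interactive view.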

The Window Loop Start node

The Window Loop Start node introduces a new way to iterate over a table using specified window and step sizes. This gives full control over the chunks of rows that are returned in each iteration. The user can define window and step sizes either in terms of a number of rows or in terms of a date/time interval. Date/time intervals can be entered using predefined time units or according to the ISO 8601 standard. Advanced options allow further adjustments of the window.
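
The row-based case can be sketched in a few lines of Python. This is only an illustration of window and step sizes, not the node's implementation: the function name is hypothetical, and returning only complete windows is an assumption (the node's advanced options may handle incomplete windows differently).

```python
def sliding_windows(rows, window_size, step_size):
    """Yield consecutive chunks of window_size rows, advancing by step_size.

    Only complete windows are produced in this sketch.
    """
    if window_size < 1 or step_size < 1:
        raise ValueError("window_size and step_size must be positive")
    for start in range(0, len(rows) - window_size + 1, step_size):
        yield rows[start:start + window_size]


rows = list(range(7))  # stand-in for seven table rows
windows = list(sliding_windows(rows, window_size=3, step_size=2))
# windows -> [[0, 1, 2], [2, 3, 4], [4, 5, 6]]
```

With a step size smaller than the window size, as here, consecutive windows overlap. A step size equal to the window size gives non-overlapping chunks.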

KNIME Server

OpenAPI definitions of individual workflows

A key functionality of KNIME Server is to allow individual workflows to be exposed as REST endpoints. In this release we have added OpenAPI (formerly Swagger) definitions of the exposed workflows and provided a SwaggerUI interface to document and test your web services.

As an ease-of-use feature we also added a “Show API definition” context menu item to the KNIME explorer. Selecting this option opens the SwaggerUI page for that service in your web browser.

More functionality exposed as REST resources

As part of our goal to expose all KNIME Server functionality via the REST API, we have added some new endpoints in this release:

  • Execute workflows with a GET request (as well as POST, which was previously possible).
  • Job Scheduling, including updating existing schedules, via REST API. There’s an example workflow in the default installation showing how to list all scheduled jobs.
  • Change workflow owner.
  • Upload Server license files via REST API.

Additional notes about the 3.5 release

  • The R binaries for Windows have been updated from 3.0.3 to 3.4.2. Results may differ from those of earlier versions because of this update.
  • Because we’ve updated the version of Java we’re using (from 1.8.0_60 to 1.8.0_152), updating from a previous KNIME installation requires a “cold restart” of KNIME. Instead of using the automatic restart after an update, you should quit KNIME and start it again manually.
  • We no longer provide 32 bit binaries for Linux.
  • We have stopped providing the “Full Build” version of KNIME Analytics Platform. The full builds have gotten so large (>2GB!) that it was no longer possible to create installers. All KNIME extensions are still accessible via the File->Install KNIME Extensions menu item or on the KNIME Update Site.
  • Python: The nodes previously available through KNIME Labs are now KNIME’s default Python integration. If you have defined custom types in KNIME, they now need to be registered through the extension point defined in org.knime.python.typeextensions.
  • Server: See the KNIME Server Preview Guide for info on the three preview features (SWR, Workflow Job View, Distributed Executors).