What’s New in KNIME Analytics Platform 3.6 and KNIME Server 4.7

This year's summer release, on July 11, 2018, is a major KNIME® Software update. Here we highlight some of the major changes, new features, and usability improvements in both the open source KNIME Analytics Platform and our commercial products.

You can upgrade an existing KNIME Analytics Platform 3.x installation either by opening KNIME and choosing the Update option in the File menu, or by downloading a fresh copy from the download page.

KNIME Workflow Hub

KNIME Analytics Platform

KNIME Server

General release notes

See the full list of changes in the changelog

KNIME Workflow Hub

KNIME Workflow Hub is the place for the KNIME community to share workflows. Here you can browse all of our example workflows. This will also be the place for community members to share their own workflows. Show your appreciation for community members by adding ratings, or comment on why the workflow is so great! We are opening KNIME Workflow Hub to community contributors in a staged manner. If you would like to be put on the list for early access, please get in touch.

KNIME Deep Learning

  • From simple to complex: Set up deep network architectures without writing a single line of code using KNIME nodes.
  • Use regular TensorFlow models within KNIME Analytics Platform and seamlessly convert from Keras to TensorFlow for efficient network execution.

Constant Value Column Filter

The new Constant Value Column Filter node removes input table columns where the values are all the same. Examples of uses include removing columns containing only zeros, identical strings, or missing cells.
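
For illustration, the same idea in plain Python (a sketch of the concept, not the node's implementation):

```python
# Drop columns whose values are all identical from a list-of-dicts "table".
def drop_constant_columns(rows):
    """Return rows without columns where every value is the same."""
    if not rows:
        return rows
    columns = rows[0].keys()
    # Keep a column only if it contains more than one distinct value
    keep = [c for c in columns if len({row[c] for row in rows}) > 1]
    return [{c: row[c] for c in keep} for row in rows]

table = [
    {"id": 1, "flag": 0, "name": "a"},
    {"id": 2, "flag": 0, "name": "b"},
]
print(drop_constant_columns(table))  # "flag" is constant and is removed
```

A column of nothing but missing cells is also constant under this rule, which matches the use cases listed above.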

Numeric Outliers

The Numeric Outlier node is a convenient way to remove suspicious or incorrect numerical data from your dataset. Various outlier treatment and detection options, e.g. detecting outliers only within their respective groups, give this node a high level of flexibility.
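
One widely used detection strategy is the interquartile range (IQR) rule, sketched here in plain Python purely as an illustration of the concept (not the node's implementation):

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (classic IQR rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

data = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(data))  # → [95]
```

Detecting outliers per group, as the node supports, would simply mean applying such a rule separately within each group.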

Column Expressions

The new Column Expressions node allows you to add new columns to a table or replace existing columns using expressions that are executed row-wise on the input table. These expressions can be as simple as a single function call, but they can also be as complex as you like. A library of predefined functions for things like string manipulation and mathematical formulas makes it easy to construct expressions.
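
Conceptually, the node evaluates an expression once per row, with the row's cells available by column name. The node uses its own expression language, but the row-wise idea can be sketched in Python like this:

```python
# Toy row-wise column computation, analogous in spirit to the node
rows = [{"price": 10.0, "qty": 3}, {"price": 2.5, "qty": 4}]

# The "expression" computes a new column from the current row's cells
for row in rows:
    row["total"] = row["price"] * row["qty"]

print(rows)
```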

Scorer (JavaScript)

The new Scorer (JavaScript) node is an enhanced version of KNIME’s Scorer node. In addition to providing access to more detailed statistics about prediction accuracy, the new node can interact with other JavaScript views in a wrapped metanode view or KNIME WebPortal page.

Git Nodes

The new Git nodes allow you to work with local Git repositories. These nodes can be used to

  • find commits using different criteria like author, branches, …
  • find branches containing specific commits
  • tag commits
  • retrieve details about commits like author, date, affected files, ...

Call Workflow (Table Based)

The new Call Workflow (Table Based) node makes it easier to call other workflows using an entire KNIME table. A caller workflow can send a table and flow variables to a callee workflow and receive a table from the callee via the new Container Input/Output nodes:

  • Container Input (Table) - Receives a table from a caller workflow
  • Container Input (Variable) - Receives flow variables from a caller workflow
  • Container Output (Table) - Sends a table to a caller workflow

KNIME Server Connection

The KNIME Server Connection node allows the user to connect to a KNIME Server. After a connection has been established, all of the remote file handling nodes can be used with the connected server. The server connection can also be used together with the Call Workflow (Table Based) node in order to run workflows that are shared via a KNIME Server.

Text Processing

Besides lots of smaller fixes and enhancements we added two new nodes to the Text Processing extension:

  • Dictionary Tagger (Multi Column): You can now use multiple columns from one data table as independent dictionaries to tag terms in documents.
  • Term Neighborhood Extractor: extracts the left and right term neighborhood of all terms in documents. You can specify how many term neighbors on the left and right will be extracted. This allows you to do things like find all neighboring adjectives of nouns or all neighboring verbs of tagged entities.
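
The neighborhood idea is easy to picture in plain Python (an illustration of the concept, not the node's implementation):

```python
def neighborhood(tokens, target, left=1, right=1):
    """Collect up to `left`/`right` neighbors around each occurrence of `target`."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            hits.append((tokens[max(0, i - left):i], tokens[i + 1:i + 1 + right]))
    return hits

tokens = "the quick brown fox jumps over the lazy dog".split()
print(neighborhood(tokens, "fox", left=2, right=1))
# → [(['quick', 'brown'], ['jumps'])]
```

Filtering the collected neighbors by part-of-speech tag would then yield, for example, the adjectives next to a noun.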

Usability Improvements

Connect/disconnect nodes using keyboard shortcuts

It is now possible to connect and disconnect nodes from the context menu or via keyboard shortcuts. Select the nodes you want to connect and choose Connect from the context menu, or press Ctrl+L. Disconnecting works the same way.

Zooming

Zooming is now also possible with the keyboard and mouse wheel. Hold Ctrl and scroll the mouse wheel to zoom in and out. For more fine-grained zooming, hold Ctrl+Alt while scrolling.

Replacing and connecting nodes with node drop

You can now replace nodes via drag-and-drop. Drag a non-connected node onto another node to replace it. You can also drag your node onto a connection to place it between two nodes.

If a search in the node repository returns no results, the node repository now displays a message telling you nothing could be found.

Usability improvements in the KNIME Explorer

It’s now possible to double-click items in the KNIME Explorer tree to expand or collapse them; double-clicking also connects you to the Examples server or logs you in to a KNIME Server. We’ve also improved the look of the permissions dialog and ironed out a couple of other UX issues with that component.

Copy from/Paste to JavaScript Table view/editor

You can now easily copy and paste data in the Table View and Table Editor JavaScript nodes. Select a range of cells in the Table View and use standard keyboard shortcuts to insert this data into the Table Editor. This feature works nicely together with Microsoft Excel or Google Sheets.

Miscellaneous

Performance: Column Store (Preview)

As a first step in a series of planned performance improvements for KNIME Analytics Platform, a feature preview is now available for the KNIME column storage based on Apache Parquet. This extension stores internal KNIME tables in a format that is faster to access and compress, resulting in faster run times of many KNIME nodes when processing large amounts of data.
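
To see why a columnar layout helps, here is a toy contrast in Python (illustrative only; the actual extension uses Apache Parquet's on-disk format):

```python
# Toy contrast between row-oriented and column-oriented table layouts.
rows = [{"id": i, "label": "x", "value": i * 0.5} for i in range(5)]

# Column store: one contiguous list per column
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Reading a single column touches only that list, not every row ...
print(columns["value"])       # → [0.0, 0.5, 1.0, 1.5, 2.0]
# ... and columns with few distinct values compress very well
print(set(columns["label"]))  # → {'x'}
```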

Making views beautiful: CSS changes

All JavaScript views now support custom styling using CSS rules. You can simply put these rules into a string and set it as a flow variable 'customCSS' in the node configuration dialog. The list of available CSS classes can be found on our documentation page.
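
As a purely illustrative sketch (the class names below are hypothetical placeholders; consult the documentation page for the actual list of supported classes), such a style string might look like:

```css
/* Hypothetical class names — see the KNIME documentation for the real ones */
.knime-title {
  font-family: sans-serif;
  font-size: 16px;
}
.knime-axis-label {
  fill: #333333;
}
```

The whole block is passed as a single string in the 'customCSS' flow variable.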

KNIME Big Data Extensions

Create Local Big Data Environment

The new Create Local Big Data Environment node creates a fully functional local big data environment including Apache Spark, Apache Hive and HDFS. It allows you to try out the nodes of the KNIME Big Data Extensions without a Hadoop cluster. This is illustrated in the section below on KNIME H2O Sparkling Water Integration.

KNIME H2O Sparkling Water Integration

  • Scale-up model training and prediction using KNIME H2O Sparkling Water integration.
  • Seamlessly adapt workflows that use the KNIME H2O Machine Learning integration to run on your Apache Spark™ cluster.
  • Efficient scoring using H2O MOJOs on your local machine or with Apache Spark.

Support for Apache Spark v2.3

With this release we add support for Apache Spark™ 2.3. The only thing you need to do in order to reuse your existing big data workflows on a Spark 2.3 cluster is change the Spark version in the Create Spark Context node to 2.3 and re-execute your workflow. Who knew that upgrading to a new Spark version could be that simple?

Big Data File Handling Nodes (Parquet/ORC)

You can now read and write Parquet and ORC files with KNIME. The new Parquet and ORC Reader and Writer nodes can use your local disk, HDFS, or S3.

Spark PCA

We have redesigned the dialog of the Spark PCA node to provide the same functionality and user experience as the standard KNIME PCA node.

Spark Pivot

If you like the existing Spark GroupBy node you will love the new Spark Pivot node! Not only does it support all the existing aggregation functions, but it also allows you to either let Apache Spark detect the pivot values automatically, specify them via an input column, or enter them manually.
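
Conceptually, pivoting turns the distinct values of a chosen column into output columns, aggregating the measure for each group. A toy Python sketch of that idea (illustration only, not Spark code):

```python
from collections import defaultdict

# Long-format input: (group, pivot value, measure)
data = [
    ("2017", "red", 10), ("2017", "blue", 4),
    ("2018", "red", 7),  ("2018", "blue", 12),
]

# "Automatically detect the pivot values" = collect the distinct ones
pivot_values = sorted({p for _, p, _ in data})

# Pivot with a sum aggregation: one output column per pivot value
table = defaultdict(lambda: {p: 0 for p in pivot_values})
for group, pivot, value in data:
    table[group][pivot] += value

print(pivot_values)  # → ['blue', 'red']
print(dict(table))
```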

Frequent Item Sets and Association Rules

The Frequent Item Sets and Association Rule nodes are the latest addition to the Apache Spark MLlib node collection for the KNIME Extension for Apache Spark.

Previews

Create Spark Context via Livy

The new Create Spark Context via Livy node allows you to run all Spark nodes using Apache Livy, which is a REST service to interact with Apache Spark™. It provides out-of-the-box compatibility with Hadoop distributions that include Livy, such as Hortonworks HDP® (v2.6.3 and higher), or Amazon EMR (v5.9.0 and higher), without the need to install any further software on the cluster. The node also features a dialog that makes it easy to control the cluster resources of your Spark context.

Database Integration

The new database framework is not yet feature complete, but it already contains many cool new features that we want to share with you and get your feedback on. The framework already contains all the nodes necessary to visually interact with your favorite database. It comes loaded with new features for:

  • Usability
    • Flexible type mapping framework
    • Improved database schema handling
    • Advanced SQL editor with syntax highlighting and code completion
  • Reliability
    • Dedicated transactions
    • Flexible driver management
    • Improved connection management
  • Performance
    • Connection pooling for parallel execution
    • Streaming support of all reader and writer nodes

This is a great opportunity for you to test drive the new nodes and to give us feedback.

Apache Kafka Integration

Apache Kafka® is an open source publish-subscribe messaging system focusing on high performance, horizontal scalability, and fault tolerance. This first version of KNIME’s Apache Kafka Integration ships with three new nodes:

  1. Kafka Connector - Connects to a Kafka cluster
  2. Kafka Consumer - Consumes messages from Kafka (topics)
  3. Kafka Producer - Sends messages to Kafka (topics)
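
To make the publish-subscribe pattern behind these nodes concrete, here is a toy in-memory topic in Python (not the Kafka client API, just the messaging model the three nodes map onto):

```python
from collections import defaultdict

class ToyBroker:
    """Minimal in-memory stand-in for a publish-subscribe broker."""
    def __init__(self):
        self.topics = defaultdict(list)   # topic name -> append-only message log
        self.offsets = defaultdict(int)   # (consumer, topic) -> read position

    def produce(self, topic, message):
        self.topics[topic].append(message)

    def consume(self, consumer, topic):
        """Return messages this consumer has not yet seen on this topic."""
        pos = self.offsets[(consumer, topic)]
        messages = self.topics[topic][pos:]
        self.offsets[(consumer, topic)] = len(self.topics[topic])
        return messages

broker = ToyBroker()
broker.produce("clicks", {"user": "a"})
broker.produce("clicks", {"user": "b"})
print(broker.consume("dashboard", "clicks"))  # both messages
print(broker.consume("dashboard", "clicks"))  # → [] (offset has advanced)
```

In the real integration, the Kafka Connector establishes the cluster connection, and the Consumer and Producer nodes play the two roles shown above.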

KNIME Server

Management (Client Preferences)

Management (Client Preferences) makes IT operations easier by centrally managing KNIME Analytics Platform preferences. This allows you to define preference profiles for different use cases. Preference profiles can consist of preference files, scripts, and database drivers, which can then be deployed to KNIME Analytics Platform on request or automatically. Your end users no longer need to change their preferences manually to work with company resources.

Job View (Preview)

The Job view allows users to look behind the scenes of a workflow running on KNIME Server, which we call a job. Although it is still a preview, it is already possible to do a lot of useful things, such as:

  • inspect the workflow state, including things like each node's message (warnings and errors), state and progress
  • reconfigure nodes and wrapped metanodes
  • inspect data and views
  • reset, execute and cancel nodes

The KNIME Server preview guide has full details of installation and functionality.

Distributed Executors (Preview)

One important aspect of our work to make KNIME Server more scalable is the KNIME Distributed Executors. With this release, the Distributed Executors are almost feature complete; the only missing functionality is file upload and download in WebPortal jobs (using the respective QuickForm nodes). It’s now possible to run any KNIME workflow that does not contain either of these QuickForms on KNIME Server using distributed executors, scaling the concurrent execution of workflows well beyond what was previously possible. This is still preview functionality, but we expect it to be feature complete and ready for production use in our second annual release in December.

General release notes

JSON Path library update

We have updated the JSON Path library json-path from version 2.0.0 to 2.4.0. This version fixes several issues but also slightly changes the results of certain JSON Path expressions. Most notably, the deep scan operator ".." now behaves the same as the array wildcard operator "*" when no matches are found inside a child object. Consider the following example:

[
  { "a": "b", "c": "d" },
  { "a": "e" }
]

Using the JSON Path $..c used to result in ["d"], whereas the equivalent query $[*].c resulted in ["d", null]. Now both paths result in ["d", null].
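
The wildcard semantics, which both queries now share on this flat example, can be reproduced in plain Python:

```python
doc = [
    {"a": "b", "c": "d"},
    {"a": "e"},
]

# $[*].c — one result slot per array element, null (None) where "c" is absent;
# as of json-path 2.4.0, $..c returns the same list for this document
wildcard = [obj.get("c") for obj in doc]
print(wildcard)  # → ['d', None]
```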

Java Snippet Bundle Imports

In addition to specifying external JAR files in the Java Snippet node, it’s now also possible to select one or more installed bundles (plug-ins). The classes from these bundles can then be used in your snippets.