What's New in KNIME 2.11

- Database and Big Data Extension
        - Database GroupBy node now with database-specific aggregation methods
        - Drop Table (New Node)
        - HP Vertica Connector (New Node)
        - Impala Connector / Loader (New Nodes)
        - HDFS Connector / File Permissions (New Nodes)

- Tool Integration
        - New Python Integration (New Nodes)
        - JSON Processing (New Category)

- Data Mining
        - PMML: New nodes to implement modular PMML
        - DBSCAN (New Node)
        - kNN now supports more distance functions (New Node)
        - Target Shuffling (New Node)

- IO
        - Writer Nodes Improvements

- Other
        - Quick Node Insertion with Ctrl-Space (New GUI feature)
        - Table Validator (New Node)
        - Column Auto-Type Cast (New Node)

- See the full list of changes in the changelog file


Full recording of webinar "What's new in KNIME 2.11" available at: http://youtu.be/9RkRHI32Dy8

Database and Big Data Extension

video http://youtu.be/CBoW2S91e58

Database GroupBy

The Database GroupBy node now offers database-specific and parameterized aggregation methods. It also allows dynamic selection of the aggregation columns, based on a column name pattern or a data type.
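The idea of dynamic aggregation column selection can be sketched outside KNIME as follows. This is only an illustration, not the node's implementation: the helper names `select_columns` and `build_group_by`, the schema dictionary, and the `orders` table are all hypothetical.

```python
import re

def select_columns(schema, name_pattern=None, data_type=None):
    """Return the column names that match a regex pattern and/or a data type."""
    selected = []
    for name, col_type in schema.items():
        if name_pattern is not None and not re.search(name_pattern, name):
            continue
        if data_type is not None and col_type != data_type:
            continue
        selected.append(name)
    return selected

def build_group_by(table, group_cols, agg_cols, method="SUM"):
    """Assemble a GROUP BY statement applying one aggregation method to each column."""
    aggs = [f"{method}({c}) AS {method.lower()}_{c}" for c in agg_cols]
    select_list = ", ".join(list(group_cols) + aggs)
    return f"SELECT {select_list} FROM {table} GROUP BY {', '.join(group_cols)}"

# Hypothetical table layout: column name -> SQL type.
schema = {"region": "TEXT", "sales_q1": "REAL", "sales_q2": "REAL", "units": "INTEGER"}

# Select the aggregation columns by name pattern, then build the query.
agg_cols = select_columns(schema, name_pattern=r"^sales_")
query = build_group_by("orders", ["region"], agg_cols, method="AVG")
print(query)
```

Selecting by type instead of by name works the same way, for example `select_columns(schema, data_type="INTEGER")`.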

Drop Table

This node drops objects and tables in a database. It also provides a handful of options to gracefully handle missing tables.
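What handling a missing table gracefully can mean is sketched below with Python's built-in sqlite3 module. The function `drop_table` and its `fail_if_missing` flag are hypothetical and do not mirror the node's actual options.

```python
import sqlite3

def drop_table(conn, table, fail_if_missing=False):
    """Drop `table`; return True if it existed, False if it was already missing."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type='table' AND name=?", (table,)
    ).fetchone() is not None
    if not exists:
        if fail_if_missing:
            raise LookupError(f"table {table!r} does not exist")
        return False  # missing table tolerated
    # Identifiers cannot be bound as parameters, so quote the name manually.
    quoted = '"' + table.replace('"', '""') + '"'
    conn.execute(f"DROP TABLE {quoted}")
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER)")
first = drop_table(conn, "staging")   # table exists and is dropped
second = drop_table(conn, "staging")  # table is gone, call still succeeds
print(first, second)
```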

HP Vertica Connector

This node connects to an HP Vertica database. The node outputs a connection to the selected HP Vertica database.

 

Impala Connector / Loader (Cloudera certified, commercial license required)

The Impala Connector node connects to an Impala database.

The Impala Loader node creates a new table in the Impala database.

 

HDFS Connector / File Permissions (commercial license required)

The HDFS Connector node connects to the Hadoop Distributed File System (HDFS).

The HDFS File Permission node sets permissions for further operations on the files in HDFS.

 

Tool Integration

New Python Integration

video http://youtu.be/3dHufC6iQgw

This new Python integration is based on CPython (and not on Jython, as in the old nodes).

It requires some modules in the Python installation:

  • pandas for data representation
  • Protobuf for the communication between CPython and KNIME
  • and optionally Jedi for the auto-completion feature in the Python editor in the node's configuration window
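A quick way to check whether a Python installation provides the modules listed above is sketched here. It assumes the usual import names (`pandas`, `google.protobuf`, `jedi`); the helper functions are hypothetical and not part of KNIME.

```python
import importlib.util

REQUIRED = ["pandas", "google.protobuf"]  # assumed import names
OPTIONAL = ["jedi"]

def is_available(module_name):
    """Return True if the module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "google") is itself missing.
        return False

def check_environment():
    """Return (missing required modules, missing optional modules)."""
    missing_required = [m for m in REQUIRED if not is_available(m)]
    missing_optional = [m for m in OPTIONAL if not is_available(m)]
    return missing_required, missing_optional

missing_required, missing_optional = check_environment()
print("missing required:", missing_required)
print("missing optional:", missing_optional)
```

Run it with the same interpreter that KNIME is configured to use, since another interpreter on the machine may have a different set of modules installed.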

 

 

 

JSON Processing

video http://youtu.be/XndgaTC3UWY

Many new nodes to process JSON structures:

  • read JSON structures
  • extract content from JSON structures
  • JSON schema validation
  • ascertain differences between two JSON objects
  • transform and convert to/from other objects, esp. XML
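Some of the operations listed above (reading, extracting content, finding differences) can be illustrated with Python's standard json module. This is a minimal sketch, not how the KNIME nodes work internally; `extract` and `diff` are hypothetical helpers and the sample documents are made up.

```python
import json

def extract(doc, path):
    """Follow a list of keys and indices into a parsed JSON document."""
    current = doc
    for key in path:
        current = current[key]
    return current

def diff(a, b, prefix=""):
    """Return the list of paths at which two parsed JSON values differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        changes = []
        for key in sorted(set(a) | set(b)):
            path = f"{prefix}/{key}"
            if key not in a or key not in b:
                changes.append(path)  # key present on one side only
            else:
                changes.extend(diff(a[key], b[key], path))
        return changes
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        changes = []
        for i, (x, y) in enumerate(zip(a, b)):
            changes.extend(diff(x, y, f"{prefix}/{i}"))
        return changes
    return [] if a == b else [prefix or "/"]

left = json.loads('{"name": "iris", "dims": [4, 150], "meta": {"source": "uci"}}')
right = json.loads('{"name": "iris", "dims": [4, 151], "meta": {"source": "uci", "year": 1936}}')

print(extract(left, ["dims", 1]))  # second entry of the "dims" array
print(diff(left, right))           # paths where the two objects differ
```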

Data Mining

video http://youtu.be/m8tvrHe6urc

PMML: New Nodes for Modular PMML

video http://youtu.be/yQP0NImUCes

Three new nodes to implement modular PMML, that is, to assemble transformations and models into a PMML structure piece by piece, avoiding repetition.

  • Empty PMML Creator node implements an empty PMML structure with just a header including copyright, description, and user defined annotations.
  • PMML Transformation Appender node appends a transformation into an existing PMML structure. The resulting PMML includes the previous PMML plus the injected transformation.
  • PMML Model Appender node appends a model into an existing PMML structure. The resulting PMML is the original PMML plus the appended model.

 

 

 

DBSCAN

New node to implement the DBSCAN (density-based clustering) algorithm.

The node has two input ports: one for the data and one for the distance formula. With the appropriate distance function, DBSCAN is able to cluster oddly shaped groups of data.
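How a clustering routine can take the distance function as a separate input is sketched below in plain Python. It is a textbook-style DBSCAN written for illustration only, not the node's code; the parameter names `eps` and `min_pts` and the sample points are assumptions.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def dbscan(points, eps, min_pts, distance):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(len(points)) if distance(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3, distance=euclidean)
print(labels)
```

Because `distance` is an ordinary argument, swapping in another function changes the notion of neighborhood without touching the clustering logic.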

 

kNN now supports more distance functions

k Nearest Neighbors (Distance Function) implements kNN and supports more distance functions, besides the Euclidean function. The node provides an additional input port for the distance formula.
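The effect of the distance function on kNN can be seen in a small sketch. The function `knn_predict` and the toy data are hypothetical; the node itself receives its distance from the additional input port rather than as a Python function.

```python
from collections import Counter

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

def knn_predict(train, query, k, distance):
    """train is a list of (features, label); return the majority label of the k nearest rows."""
    nearest = sorted(train, key=lambda row: distance(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# With a single neighbor, the winning label depends on the distance function:
# (3, 0) is nearer under Manhattan distance, (2, 2) under Chebyshev distance.
train = [((3, 0), "a"), ((2, 2), "b")]
by_manhattan = knn_predict(train, (0, 0), k=1, distance=manhattan)
by_chebyshev = knn_predict(train, (0, 0), k=1, distance=chebyshev)
print(by_manhattan, by_chebyshev)
```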

 

Target Shuffling

This node shuffles the values randomly inside a selected column to assess the statistical accuracy of data mining results.
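The principle can be sketched in a few lines: shuffle the target column many times and count how often the shuffled data scores as well as the original. Everything here is a toy assumption, including the fixed threshold "model" in `accuracy`; in a real workflow the model would typically be retrained on each shuffled table.

```python
import random

def shuffle_column(rows, column, rng):
    """Return a copy of rows with the values of one column randomly permuted."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]

def accuracy(rows):
    """Toy stand-in for a model: predict 'yes' whenever x > 5."""
    hits = sum((row["x"] > 5) == (row["target"] == "yes") for row in rows)
    return hits / len(rows)

rng = random.Random(0)  # fixed seed so the run is repeatable
rows = [{"x": x, "target": "yes" if x > 5 else "no"} for x in range(1, 11)]

observed = accuracy(rows)
shuffled_scores = [accuracy(shuffle_column(rows, "target", rng)) for _ in range(200)]

# Fraction of shuffles that score at least as well as the original data.
p_value = sum(score >= observed for score in shuffled_scores) / len(shuffled_scores)
print(observed, p_value)
```

A small fraction suggests that the original result is unlikely to be a product of chance alone.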

IO

Writer Nodes Improvements

video http://youtu.be/wsL1UTzEg-0

Supported Output Locations

  • Local System File Paths
  • Remote URLs
  • Workflow relative URLs

Configuration Dialogs

  • More consistent configuration windows across writer nodes
  • Better flow variable support
  • Info on destination location status

 

 

Other

Quick Node Insertion with Ctrl-Space

There is now help to quickly find and insert one or more nodes in the workflow.

  • Press Ctrl-Space or click the lens on the left of the search box above the Node Repository in the KNIME workbench.
  • Search for your node(s) by typing the node name or part of the node name in the search box.
  • Select your node(s) and press Enter or click "OK" to insert in your workflow.
  • Keep searching for and adding more nodes.

 

 

Table Validator

The Table Validator node checks the input table against a reference data table for missing values, out-of-domain values, and so on. It can be used to prepare data for reports and other workflows.
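The kind of check described here can be sketched in plain Python. The reference format (a dictionary with `allowed`, `range`, and `allow_missing` entries) and the function `validate_table` are hypothetical and serve only to illustrate the idea.

```python
def validate_table(rows, reference):
    """Return a list of (row index, column, problem) tuples."""
    problems = []
    for index, row in enumerate(rows):
        for column, spec in reference.items():
            value = row.get(column)
            if value is None:
                if not spec.get("allow_missing", False):
                    problems.append((index, column, "missing value"))
                continue
            allowed = spec.get("allowed")
            if allowed is not None and value not in allowed:
                problems.append((index, column, "out of domain"))
            bounds = spec.get("range")
            if bounds is not None and not (bounds[0] <= value <= bounds[1]):
                problems.append((index, column, "out of range"))
    return problems

# Hypothetical reference: nominal domain for one column, numeric range for another.
reference = {
    "species": {"allowed": {"setosa", "virginica"}},
    "petal_length": {"range": (0.0, 10.0), "allow_missing": True},
}
rows = [
    {"species": "setosa", "petal_length": 1.4},
    {"species": "tulip", "petal_length": 12.0},
    {"species": None, "petal_length": None},
]
problems = validate_table(rows, reference)
for problem in problems:
    print(problem)
```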

 

Column Auto-Type Cast

This new node tries to guess the most fitting type for a specified column. It is useful after a Transpose and before transforming a data cell into a flow variable.
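Guessing the most fitting type can be pictured as trying the narrowest type first and falling back to wider ones. The sketch below assumes a column of strings and only the types int, float, and str; `guess_type` and `cast_column` are hypothetical names and the node's actual rules may differ.

```python
def guess_type(values):
    """Return int, float, or str: the narrowest type that parses every non-empty value."""
    present = [v for v in values if v not in ("", None)]
    if not present:
        return str  # nothing to infer from
    for candidate in (int, float):
        try:
            for v in present:
                candidate(v)
        except ValueError:
            continue  # at least one value does not parse: try the next type
        return candidate
    return str

def cast_column(values):
    """Convert all values to the guessed type, keeping empty cells as None."""
    target = guess_type(values)
    return [None if v in ("", None) else target(v) for v in values]

print(guess_type(["1", "2", "3"]))
print(guess_type(["1", "2.5"]))
print(guess_type(["1", "x"]))
print(cast_column(["1", "", "3"]))
```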

Many more small improvements have been made under the hood - please refer to the changelog file.