KNIME Modern UI Preview (Labs)
KNIME Analytics Platform is getting a makeover. This release includes an extension that previews the new interface. Simply click the “Open KNIME Modern UI Preview” button in the top right corner to check it out.
The biggest changes you’ll find so far:
- The upgraded look and feel.
- The node repository now has filtering and advanced searching capabilities. Nodes are also displayed in the repository just like they are in the workflow, with the default ports.
- A new workflow breadcrumb that allows users to browse their workflow content.
This extension is a preview and is still under development. We are keen to provide you with a better user experience, so please share your feedback in the KNIME forum on what’s working and what needs more changes. For in-depth technical information, read our documentation.
New Visualization Nodes in KNIME (Labs)
Brand new visualization nodes for exploring data and building data apps are available as a preview in the KNIME Views (Labs) extension. These nodes replace four previous visualization nodes and offer a more consistent experience. We plan to replace more, based on feedback from the community.
Major features include:
- A live preview of the visualization next to the configuration dialog.
- Node descriptions are now displayed beside the settings dialog.
- Any settings controlled by flow variables are displayed and automatically indicated with an icon.
Bundled Python Environment
We’re continuing to improve the experience for Python scripters. The KNIME Python (Labs) Extension now contains its own Python Environment so that you can get started with Python scripting in KNIME right away—no additional software installation is needed.
For configuration, there is a new Python (Labs) preference page that lets you choose the new “Bundled” option, or the previously available “Conda” or “Manual” Python environment. For a full list of included Python packages, see the documentation.
Pure-Python KNIME Nodes (Labs)
This release marks the first time KNIME nodes can be written completely in Python and can be shared within teams, just like other KNIME extensions. This includes node configuration and execution as well as dialog definition and node views.
To help you with designing these nodes, we introduce a Pythonic API and debugging capabilities within KNIME. We also provide the means to deploy pure-Python KNIME nodes—including their Python environment needed for execution—using a locally built update site. See a simple example for defining a full-fledged KNIME node in this demo video.
Snowflake H2O Machine Learning Model Push-Down
Business users with little or no coding experience—SQL or otherwise—have been able to gain insights on Snowflake data using KNIME’s intuitive low-code/no-code interface. With this latest release, KNIME Analytics Platform now supports push-down of H2O models directly into Snowflake. This means that users can now build machine learning models using Snowflake data and then even execute these models in Snowflake. This allows you to get predictions on large amounts of data in seconds as the data stays in Snowflake.
DB Framework Enhancements
With the KNIME Database Framework, you can utilize the processing power of your database by pushing down the execution where the data resides—all by visually building SQL statements within KNIME.
First, we’ve improved connectivity in the KNIME Database Framework:
- Getting started with Oracle databases is now much easier because all required database drivers are integrated, and Kerberos-based authentication is now supported on KNIME Server.
- Built-in drivers are updated for improved security and additional functionality for Amazon Redshift, H2, Microsoft Access, MySQL, PostgreSQL, and SQLite.
Additionally, by popular request, we’ve extended the visual query and data manipulation capabilities with four new database nodes:
- The DB Concatenate node makes it easy to concatenate any number of database queries into a single query.
- The DB Looping node supports queries that match any value in a list of input values (e.g. IN queries).
- The DB Delete (Filter) node allows you to specify filter criteria to identify rows that should be deleted from a database table.
- Finally, the DB Data Spec Extractor node extracts the database table specification into a KNIME table allowing you to use this information in your analysis.
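As a rough illustration of the kind of IN query the DB Looping bullet above describes, the following sketch uses Python's built-in sqlite3 module rather than KNIME itself. The table, the column names, the chunk size, and the helper names `chunked` and `select_in` are all hypothetical; the sketch only shows how a list of input values can be split into chunks and matched with one parameterized IN query per chunk.

```python
import sqlite3


def chunked(values, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def select_in(conn, table, column, values, chunk_size=2):
    """Run one parameterized IN query per chunk and collect all matching rows."""
    rows = []
    for chunk in chunked(values, chunk_size):
        placeholders = ", ".join("?" for _ in chunk)
        # Table and column names are trusted here; only the values are bound.
        sql = (
            f"SELECT id, name FROM {table} "
            f"WHERE {column} IN ({placeholders}) ORDER BY id"
        )
        rows.extend(conn.execute(sql, chunk).fetchall())
    return rows


# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Bo"), (3, "Cy"), (4, "Di")],
)

matches = select_in(conn, "customers", "id", [1, 3, 4])
print(matches)
```

Chunking keeps each statement small when the value list is long; how the actual node batches its values is not covered here.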
With the DB Looping node, we have migrated the last node of the legacy database framework. With the next release, we will deprecate the legacy database framework—if you are still using legacy database nodes, now is the time to migrate to the current framework.
For more information about the framework or the migration, see the KNIME Database Extension Guide.
Microsoft Azure Services
KNIME Analytics Platform allows you to seamlessly reach out to various Microsoft Azure services. With this release, you can use the KNIME Database Framework to visually interact with serverless and dedicated SQL pools on Azure Synapse Analytics without writing any SQL statements.
- Use the extended Microsoft SQL Server Connector node to connect to these pools and perform high throughput data uploading via the DB Loader node.
- Use the KNIME file handling framework to manage your data files in Azure Synapse storage accounts.
This release also expands on the SharePoint Online integration by adding support for creating new or deleting existing lists. Finally, the Microsoft Authentication node has been extended to support custom application IDs and authorization endpoints to meet your security requirements.
Column Expression and Multi-Row Formulas
Column Expressions is an all-purpose tool to compute new columns based on simple expressions. While previously limited to deriving these new values from the current row, a new function column(name, offset) has been added to read values from preceding and subsequent rows, effectively allowing multi-row formulas (find an example on KNIME Hub).
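To make the multi-row idea concrete, here is a minimal sketch in plain Python (not the node's own expression syntax). The helper below only imitates the described behavior of column(name, offset); its extra `rows` and `index` arguments, the sample data, and the choice to return a missing value for out-of-range offsets are assumptions made for this illustration.

```python
def column(rows, index, name, offset=0, missing=None):
    """Return the value of `name` in the row `offset` positions from `index`.

    Hypothetical helper: out-of-range offsets yield `missing` (an assumption).
    """
    target = index + offset
    if 0 <= target < len(rows):
        return rows[target][name]
    return missing


# Hypothetical input table with a single numeric column.
rows = [{"sales": 10}, {"sales": 14}, {"sales": 9}]

# Multi-row formula: difference between the current and the preceding row.
deltas = []
for i in range(len(rows)):
    previous = column(rows, i, "sales", -1)
    current = column(rows, i, "sales")
    deltas.append(None if previous is None else current - previous)

print(deltas)  # [None, 4, -5]
```

A negative offset reads from preceding rows and a positive offset from subsequent rows, so the same pattern covers running differences, lags, and leads.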
Additionally, new functions make it easy to create and manipulate path cells, and now also variables, in KNIME Analytics Platform. New functions for creating standard file system variables and cells let you build new paths directly within the nodes.
The XGBoost integration finally moves out of labs with three main updates:
- Row weights give you control over the weight the learners assign to individual rows of your dataset. This can be very useful to remedy class imbalances or to prioritize certain subsets of your data.
- Bit & Byte Vector Support improves the applicability of the nodes in domains like text processing or life sciences, where it is common to have a vector representation of the data.
- Feature Importance Output of the XGBoost Tree Ensemble Learner nodes is an output table that provides you with various metrics that indicate how important every single feature is to the learned model.
To see example workflows demonstrating these features, visit the KNIME Hub.
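As an illustration of how row weights can counter class imbalance, the sketch below computes one weight per row using the common "balanced" heuristic (row count divided by the number of classes times the class frequency). This is plain Python and not part of the XGBoost nodes; the function name, the labels, and the choice of heuristic are assumptions. The resulting values could be stored in a numeric column and used as row weights.

```python
from collections import Counter


def balanced_row_weights(labels):
    """Weight each row by n / (k * class_count) so every class counts equally."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[label]) for label in labels]


# Hypothetical, heavily imbalanced target column.
labels = ["fraud", "ok", "ok", "ok"]
weights = balanced_row_weights(labels)
print(weights)
```

With this scheme the single rare row receives weight 2.0 and each frequent row receives 2/3, so both classes contribute the same total weight while the weights still sum to the number of rows.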
Extended Spark Support
We have added support for Spark 3.1 and 3.2, including support for H2O Sparkling Water. The Create Local Big Data Environment node, which creates a fully functional big data environment for local testing and prototyping, now uses Spark 3.2.
Reset Workflows on KNIME Server
It is now possible to reset workflows stored on KNIME Server directly from the KNIME Explorer without opening the workflow first. This new feature will save users a lot of time, especially when dealing with workflows that upload a large amount of data.
Release Technical Notes
- KNIME Analytics Platform 4.6 is based on Java 17.0.3. Prior versions (4.5 and 4.4) use Java 11. All extensions contributed by KNIME are updated to this new version, while some community and partner extensions are still being updated (and are not yet available). For custom extension developers: Updates are often straightforward but sometimes require additional statements to be added to the knime.ini in case warnings regarding “reflective access” are raised (see GitHub code example).
- Linux only: After an upgrade from KNIME Analytics Platform 4.5 to 4.6, a non-critical error occurs when the user is prompted to restart the application. KNIME Analytics Platform needs to be started manually (the upgrade is still completed successfully).
- Windows and macOS: While the restart after an update works without an error message on Windows and macOS, some nodes will run into errors caused by the Java 17 update. This can be fixed by completely closing KNIME Analytics Platform and starting it again.
- Integrated Deployment: Workflows captured via Integrated Deployment nodes (Capture Workflow Start and Capture Workflow End) may output flow variables in a different order than previously (4.5.x and before), especially in the case when these workflows have external inputs (PortObject Reference Reader). The ordering now matches the order of flow variables in the original workflow (AP-18630).
- The Generic Web Service Client was updated to use a recent version of the Apache CXF library (AP-5490). While tests have not revealed any compatibility issues, web services come in too many flavors to guarantee full test coverage.
- KNIME H2O Integration: Removed support for long-outdated versions of the H2O library (126.96.36.199, 188.8.131.52). Models trained with one of these versions either need to be re-trained with a newer H2O version or converted to MOJOs using an older version of KNIME Analytics Platform. MOJOs are not tied to a particular H2O version.
- Node Developers: Custom extensions of class org.knime.core.node.NodeDescription are no longer binary backward compatible with 4.5 and need to be recompiled. Also, additional API methods have been added.
- Node Developers: Categories defined by using the category in a NodeSetFactory can no longer be added to locked categories if the category was not defined by the same plugin. Additionally, a DEBUG message is logged if categories are used by a NodeSetFactory but not registered otherwise.