As an open platform, KNIME consistently works to give users easy access to the tools and libraries they already know and love. To that end, with the latest KNIME 4.7 release, the KNIME Python Integration has been enhanced to make it easier to access the vast range of Python-based visualization libraries, including Matplotlib, Seaborn, Plotly, Altair, and many more.
The new Python View node enables users to create visualizations using these popular libraries as well as any other library that generates visualizations in the form of PNG, SVG, JPEG, or standalone HTML documents.
Below, find a walkthrough for how to create Python-based visualizations in KNIME and deploy them as browser-based data apps.
Installing the Python Integration
To use a visualization library that is not available in the bundled environment, e.g. the Altair or GGplot library, use the Conda Environment Propagation node to select your custom environment. (Read more about managing Python environments with Conda and KNIME).
Once you’ve installed the integration, you’re ready to start creating Python-based visualizations in KNIME.
Create Static or Interactive Python-based Visualizations in KNIME
The Python View node enables you to create static and interactive visualizations.
Static Python-based Visualizations produce PNG, SVG, JPEG images or even HTML documents. These are typically for simple visualizations to highlight key insights, e.g. year-on-year trends or sales distributions. Users cannot interact with these static visualizations or change any of the chart attributes, such as axes or units.
Interactive Python-based Visualizations share the same “interactivity” properties as native KNIME View nodes and allow the user to interact with the analysis. For example, the user could select a specific region or time range in the first visualization, and this selection is applied in other sets of visualizations. This helps users dive deeper into the analysis to find out what interests them most. They are an easy way to explore and understand analyses that are based on rapidly changing data.
Let’s look at how to create static and interactive visualizations with the Python View node.
1. Create a Static Visualization with the Python View node
You can create static visualizations using popular libraries such as Matplotlib and Seaborn. These libraries return the visualization as a Python object.
In this section, we’ll show you a simple example of how to create a pair plot with the Seaborn library. We’ll visualize the pair plots for the Iris flower dataset – pair plots that can be used to study the Iris flower species distribution across various features.
Open the code editor of the Python View node and assign the Python object to the node’s output view using the command `knio.output_view = knio.view(obj)`, where `obj` stands for your visualization object. Note that Code Block 1 shows the entire Python code for creating static visualizations.
The Python View node also provides special view implementations. For example, if you are using the Seaborn library, assign the return value of `knio.view_seaborn()` to the node’s output view; with Matplotlib, assign the return value of `knio.view_matplotlib()`.
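The same mechanism extends to markup produced without a plotting library. As a minimal sketch, assuming the `knio.view_svg` helper of `knime.scripting.io` accepts an SVG string, the snippet below builds a small bar chart by hand and assigns it to the output view. The `bar_chart_svg` function and the sample counts are hypothetical and exist only for illustration; the `try`/`except` guard lets the snippet also run outside KNIME, where `knime.scripting.io` cannot be imported.

```python
from xml.sax.saxutils import escape

try:
    import knime.scripting.io as knio  # importable only inside a KNIME Python node
except ImportError:
    knio = None  # running outside KNIME: build the SVG but skip the view


def bar_chart_svg(values, bar_width=40, gap=10, height=120):
    """Render a non-empty dict of label -> non-negative number as a minimal SVG bar chart.

    Hypothetical helper for illustration; not part of any KNIME API.
    """
    top = max(values.values()) or 1  # avoid division by zero if all values are 0
    width = len(values) * (bar_width + gap) + gap
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height + 20}">'
    ]
    for i, (label, value) in enumerate(values.items()):
        bar_height = height * value / top
        x = gap + i * (bar_width + gap)
        y = height - bar_height
        parts.append(
            f'<rect x="{x}" y="{y:.1f}" width="{bar_width}" '
            f'height="{bar_height:.1f}" fill="steelblue"/>'
        )
        parts.append(
            f'<text x="{x + bar_width / 2}" y="{height + 15}" '
            f'font-size="10" text-anchor="middle">{escape(label)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up counts, used only to have something to draw.
svg = bar_chart_svg({"group A": 12, "group B": 30, "group C": 21})

if knio is not None:
    # Assumption: view_svg takes the SVG markup as a string.
    knio.output_view = knio.view_svg(svg)
```

Because the result is a plain image, a view built this way is static: it can be displayed in the node view but does not react to selections.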
Code block 1 shows a Python snippet inside the Python View node. First, we import the required libraries – in this case Seaborn – as well as knime.scripting.io, which functions as the main contact point between KNIME and Python. The input data is read as a pandas DataFrame.
Next, we create the pair plot with Seaborn as a Python object. This object is assigned to the node’s output view using the command `knio.output_view = knio.view_seaborn()`.
After the code block is inserted inside the Python View node, right-click the node and select the option “Execute and Open Views”. This will execute the node and launch the node view as shown in Figure 2.
2. Create Interactive Python-based Visualizations
The Plotly Python library is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, financial, geographic, scientific, and 3-dimensional use cases.
In this next example, we use this library to generate an interactive scatter plot of houses, based on a housing dataset. The plot shows the houses by average number of rooms and average number of bedrooms, and the user can select a range of house ages to display.
Code block 2, below, shows the Python snippet inside the Python View node’s code editor.
First, we import the essential libraries, in this case Plotly, and also knime.scripting.io which, as in the previous example, functions as the main contact point between KNIME and Python. The input data is read as a pandas DataFrame.
Next, we create the scatter plot visualization. We assign the Plotly visualization to the node’s output view using the command `knio.output_view = knio.view_plotly(fig)`.
You can also set the `custom_data` of the figure to the table’s RowID if you want your plot to synchronize with other plots (see the API Reference for details).
In Figure 3, below, you can see the interactive visualization that is created as the node view. The scatter plot shows only the data points that fall within the “house age” range the user selects in the legend panel, to the right of the plot.
Deploy Python-based Visualizations to KNIME Business Hub as Data Apps
You can share your visualizations with others by deploying them as browser-based data apps via KNIME Business Hub.
Build a data app for your visualization by encapsulating the workflow that includes your Python View node (or any other of KNIME’s visualization nodes) into a KNIME component. This produces a so-called “composite” view – a combination of different charts and graphs within a dashboard.
Explore the KNIME Data Apps Beginners Guide for information on how to build your first data app.
In our example, we created a “Predict House Prices” component using the Python View node. Figure 4 shows the workflow inside the Predict House Prices component.
Our Predict House Prices component trains a linear regression model on the house price dataset and then generates a composite view. The view shows three visualizations: a map, created by the Python View node, as well as a scatter plot and a table view, created using KNIME View nodes.
These visualizations allow you to study the model’s predictions on house prices interactively. For example, you can select a region in the map and this selection will be applied to all the charts and graphs in the composite view. This KNIME workflow “Visualization with Python View node using Plotly library” is publicly available on the KNIME Community Hub.
Users can select a region of interest on the map by dragging the mouse to draw a box. This selection is propagated to the other KNIME views in the data app, which then show the model predictions and records for the selected region.
Gain the Flexibility to Work with the Tools You Love
The ability to leverage your favorite Python-based visualizations through the KNIME-Python Integration reinforces our commitment to providing an open system.
With integrations for all the relevant tools and environments in the KNIME ecosystem, you have the flexibility to take the best of the different tools and libraries you love and bring them all together in KNIME.
Explore More Python Resources
- Python Script Space contains simple workflows to get you started with the KNIME Python Integration
- Get some examples for using the Python View node in this KNIME Community Hub space: Python view Examples
- Explore the workflow “Visualize with Python View using Matplotlib and Seaborn package” on generating static visualizations, as shown in Figure 2.
- Explore the workflow “Interactive Scatter Plot Visualization with Python View node using Plotly package” for using Plotly with KNIME as shown in Figure 3.
- Explore the “Sharing Component with Interactive Visualization using Python View node with Plotly package” to build a KNIME component with the Python View node, as shown in Figure 4.
- To learn more about running Python scripts from KNIME Analytics Platform in general, check out the KNIME Blog post “How to Set Up Your Python Extensions”.