What is KNIME Analytics Platform?
KNIME Analytics Platform is open source software with an intuitive, visual interface that lets you build analyses of any complexity level, from automating spreadsheets to ETL to machine learning.
A more detailed explanation of the different views in the KNIME Workbench is provided in the KNIME Analytics Platform User Guide.
Getting Set Up with KNIME Analytics Platform
If you haven’t downloaded KNIME Analytics Platform yet, you can do so here.
Once you have installed KNIME Analytics Platform, you can start analyzing your data right away. The entry page is the first thing you will see. Here you can access three example workflows to get started, or, if you’re following along with this guide, create your first workflow from scratch.
After you have created your first empty workflow, you can drag and drop your data file into the workflow editor to read it in, and then add nodes from the node repository to build your workflow. Each node performs a specific task, so you can move quickly into manipulating, cleaning, and visualizing your data.
Connect the nodes' ports to let the data flow from left to right through your workflow. Drag and drop a connection into an empty area of the workflow canvas to display the quick node adding panel. The panel suggests up to twelve nodes to help you build your workflow more easily and quickly. You can also search the panel for all compatible nodes. Click the desired node to add it.
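As a rough mental model only, a workflow can be pictured as a chain of small functions, each receiving a table at its input port and handing a table to the next node from its output port. The sketch below is plain Python, not KNIME code: `run_workflow`, `remove_empty_rows`, and `sort_by_item` are hypothetical names, the table is modeled as a list of dictionaries, and the sample rows are made up for illustration.

```python
from typing import Callable, Dict, List

# A table is modeled as a list of rows; each row maps column name -> value.
Table = List[Dict[str, object]]
# A "node" takes one input table and returns one output table.
Node = Callable[[Table], Table]


def run_workflow(source: Table, nodes: List[Node]) -> Table:
    """Feed the source table through the nodes in order, left to right."""
    table = source
    for node in nodes:
        table = node(table)
    return table


def remove_empty_rows(table: Table) -> Table:
    """Hypothetical cleaning node: drop rows in which every value is None."""
    return [row for row in table if any(v is not None for v in row.values())]


def sort_by_item(table: Table) -> Table:
    """Hypothetical node: sort rows by the made-up 'Item' column."""
    return sorted(table, key=lambda row: str(row["Item"]))


# Made-up sample data, not taken from the guide's Excel file.
rows = [
    {"Item": "Table", "Count": 1},
    {"Item": None, "Count": None},
    {"Item": "Chair", "Count": 4},
]
result = run_workflow(rows, [remove_empty_rows, sort_by_item])
```

Each function returns a new table rather than changing its input, which loosely mirrors how each node in a workflow exposes its own output table while leaving the upstream data available for inspection.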
Build Your First Workflow
You can follow along with this guide either by downloading the workflow and reading the guide to better understand it, or by attempting to build the workflow on your own. Either way, you can access the workflow at any time on KNIME Community Hub.
Now let's say that you have some data that you want to process, analyze, and visualize. With the following example workflow, you will read, combine, clean, and summarize data from multiple Excel sheets. Then you will calculate the total volume of a cargo of furniture that is being moved from one house to another.
To get started, first download the xlsx file that contains the data that you are going to use in the workflow. Open KNIME Analytics Platform and create a new, empty workflow by clicking the yellow "plus" button on the entry page.
From the download folder, drag and drop the xlsx file into the workflow editor. An Excel Reader node will appear on the canvas. The node is already configured with the default settings. To open the configuration dialog of the node and inspect the settings, click the configuration cog button in the node action bar.
Here you can see the path to the file you dropped into the workflow editor and a preview of the data table. You can also select the sheet that you want to read the data from. First, read the data in the Kitchen sheet. Click OK and execute the Excel Reader node by clicking the play button in the node action bar. Now the input data are available at the output port of the Excel Reader node. After selecting the node, you can view the output table in the node monitor at the bottom of the workbench.
Next, drag and drop a second Excel Reader node from the node repository and configure it so that it reads the same xlsx file but, this time, reads the data from the sheet called Living room.
The data in the second sheet need to be cleaned: they contain a column named Comment that we want to filter out.
To do so, click the output port of the second Excel Reader node and drag the arrow onto the blank workflow canvas. This action opens the quick node adding panel. Type “Column Filter” into the search bar and click the Column Filter node. The arrow connects the output port of the Excel Reader node with the input port of the Column Filter node. That means that the table read by the Excel Reader node will be passed on to the subsequent Column Filter node.
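Conceptually, the filtering step takes a table and returns the same rows without the excluded column. The following is a minimal plain-Python sketch of that idea, not the Column Filter node's actual implementation: `column_filter` is a hypothetical helper, the table is modeled as a list of dictionaries, and the sample rows are made up (only the column name Comment comes from this guide).

```python
from typing import Dict, List

# A table is modeled as a list of rows; each row maps column name -> value.
Table = List[Dict[str, object]]


def column_filter(table: Table, exclude: List[str]) -> Table:
    """Return a new table without the excluded columns; the input is untouched."""
    excluded = set(exclude)
    return [
        {name: value for name, value in row.items() if name not in excluded}
        for row in table
    ]


# Made-up rows standing in for the "Living room" sheet.
living_room = [
    {"Item": "Sofa", "Comment": "fragile legs"},
    {"Item": "Bookshelf", "Comment": None},
]
filtered = column_filter(living_room, exclude=["Comment"])
```

Here `filtered` holds the same two rows with only the Item column, while `living_room` still contains Comment, in the same way that the Excel Reader's output table stays unchanged after the Column Filter node runs.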