Join: inner join, right outer join, left outer join, full outer join

In this section we describe the basic concept of a join, with its variants (inner join, left outer join, right outer join, and full outer join), and how it is implemented in KNIME with the Joiner node.

The join mode is not the only parameter to set for a join. You also need to identify the key column(s), select the columns you want to keep in the output table, and choose a strategy for dealing with duplicate values.
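The Joiner node is configured in its dialog rather than in code, but the four join modes can be illustrated outside KNIME. The following is a minimal sketch in Python with pandas, not the Joiner node's implementation. The tables `left` and `right`, the key column `id`, and the helper `join` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Hypothetical toy tables; "id" plays the role of the key column.
left = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cid"]})
right = pd.DataFrame({"id": [2, 3, 4], "city": ["Rome", "Oslo", "Lima"]})


def join(how):
    """Join the two toy tables on "id" using the given join mode."""
    return left.merge(right, on="id", how=how)


inner = join("inner")        # only keys present in both tables
left_outer = join("left")    # all keys of the left table
right_outer = join("right")  # all keys of the right table
full_outer = join("outer")   # all keys of either table

# Choosing which columns to keep in the output is a separate decision
# from the join mode; here only "id" and "city" are retained.
kept = inner[["id", "city"]]

for label, table in [("inner", inner), ("left", left_outer),
                     ("right", right_outer), ("full", full_outer)]:
    print(label, sorted(table["id"]))
```

Rows without a match in the other table are kept only by the outer modes, and their missing cells are filled with missing values.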

 

 

 

The reference workflow is on the EXAMPLES server under:
02_ETL_Data_Manipulation/03_Joining_and_Concatenating/02_Joiner*

Exercise

  • Read the adult.csv data set. Then calculate the average age and the number of rows for each of the 4 groups defined by (sex, income), and join these two aggregated values to each row of the corresponding group.

 

Solution

 



A possible solution can be found in the workflow on the EXAMPLES server:
02_ETL_Data_Manipulation/03_Joining_and_Concatenating/03_JoinConcatenate_Examples*
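For readers who want to check the logic of the exercise outside KNIME, here is a minimal sketch in Python with pandas. It is not the workflow referenced above. The six rows are hypothetical stand-ins for adult.csv, and the column names `age`, `sex`, and `income` are assumed to match the file.

```python
import pandas as pd

# Hypothetical stand-in rows; the real adult.csv has many more rows and columns.
adult = pd.DataFrame({
    "age": [30, 40, 50, 20, 60, 35],
    "sex": ["Male", "Male", "Female", "Female", "Male", "Female"],
    "income": ["<=50K", "<=50K", ">50K", "<=50K", ">50K", ">50K"],
})

keys = ["sex", "income"]  # the two columns that define the groups

# Step 1: aggregate, producing one row per (sex, income) group.
stats = adult.groupby(keys, as_index=False).agg(
    mean_age=("age", "mean"),
    row_count=("age", "size"),
)

# Step 2: join the aggregated values back to every original row,
# using both group columns as the join key.
joined = adult.merge(stats, on=keys, how="inner")

print(joined)
```

Because every row belongs to exactly one group, the inner join keeps all original rows and only adds the two aggregated columns.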

 

 


* The link will open the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform must be installed with the Installer version 3.2.0 or higher)