Column Filter

Often, not all columns in a data set carry information or contain useful values. To get rid of unwanted columns, the ETL operation to use is column filtering. This short section explains how to remove unwanted columns, either manually or automatically based on prior knowledge.
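
The nodes themselves are configured through dialogs in KNIME Analytics Platform, so the following is not KNIME code. It is only a minimal plain-Python sketch of the two ideas, assuming a table stored as a dictionary that maps column names to lists of values. The sample rows and the helper names (exclude_columns, include_columns, keep_columns_of_type) are made up for illustration.

```python
# A tiny made-up table: column name -> list of cell values.
table = {
    "age": [39, 50, 38],
    "workclass": ["State-gov", "Self-emp-not-inc", "Private"],
    "marital-status": ["Never-married", "Married-civ-spouse", "Divorced"],
}


def exclude_columns(table, names):
    """Manual filtering: drop the named columns and keep all others."""
    return {col: values for col, values in table.items() if col not in names}


def include_columns(table, names):
    """Manual filtering: keep only the named columns."""
    return {col: values for col, values in table.items() if col in names}


def keep_columns_of_type(table, value_type):
    """Automatic filtering: keep columns whose values are all of value_type."""
    return {
        col: values
        for col, values in table.items()
        if all(isinstance(v, value_type) for v in values)
    }


without_marital = exclude_columns(table, {"marital-status"})
only_marital = include_columns(table, {"marital-status"})
strings_only = keep_columns_of_type(table, str)
```

In this sketch, manual filtering names the columns explicitly, while type-based filtering needs no column names and therefore keeps working if columns are added or renamed later.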

 

 

The reference workflow is on the EXAMPLES server under:
02_ETL_Data_Manipulation/01_Filtering/02_Column_Filter*

Exercise

Read the adult.csv data set. Then:

  • remove column "marital-status"
  • keep only column "marital-status"
  • keep only String columns using a Column Filter node and then only column "marital-status" using a Reference Column Filter node

 

Solution
  • Table without column “marital-status”
  • Table with only one column “marital-status”
  • Table with only String columns has 9 columns. Final table has only one column: “marital-status”.
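
The last step filters the columns of one table by the column names found in a second, reference table. A minimal plain-Python sketch of that idea follows. It is an illustration only, not the node's implementation: the function name reference_column_filter, the dictionary representation of a table, and the sample values are all assumptions.

```python
def reference_column_filter(table, reference):
    """Keep only the columns of `table` whose names also occur in `reference`.

    Both arguments map column names to lists of values. Only the column
    names of `reference` matter here; its values are ignored.
    """
    return {col: values for col, values in table.items() if col in reference}


# Made-up stand-in for the String-only table from the previous step.
strings_only = {
    "workclass": ["State-gov", "Private"],
    "marital-status": ["Never-married", "Divorced"],
    "sex": ["Male", "Female"],
}

# The reference table contributes nothing but its column names.
reference = {"marital-status": []}

final_table = reference_column_filter(strings_only, reference)
```

Here final_table contains the single column "marital-status", with the values taken from strings_only rather than from the reference table.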

A possible solution can be found in the workflow on the EXAMPLES server:
02_ETL_Data_Manipulation/01_Filtering/06_More_Column_Filter_Examples*

 

 


* The link will open the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform must be installed with the Installer version 3.2.0 or higher)