Advanced Row Filter

Sometimes a single rule is not enough for row filtering: we may need a combination of rules, or one more complex rule, to select the required rows. We may also need to operate on data types other than the usual Integer, Double, and String. In the following videos, we show how to implement a complex row filtering rule and how to filter on geographical coordinates or time windows.

The reference workflow is available on the EXAMPLES server under:
02_ETL_Data_Manipulation/01_Filtering/04_Advanced_Row_Filters*

Exercise

Read the adult.csv data set, then exclude the rows where marital-status is missing. On the remaining rows:

  • extract rows where marital-status = "Divorced" OR "Separated" using
    • a Nominal Value Row Filter node
    • a Reference Row Filter node
  • extract rows where marital-status = "Never-married" AND 20 <= age <= 40 using a Rule-based Row Filter node
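
The logic of the three filters in this exercise can be sketched outside KNIME as well. The Python snippet below is only an illustrative analogy, not how the KNIME nodes are implemented: the sample rows are hypothetical stand-ins for adult.csv, `None` is assumed to model a missing value, and all function names are invented for this sketch.

```python
# Hypothetical sample rows standing in for adult.csv (not the real data).
# None is assumed to represent a missing marital-status value.
rows = [
    {"age": 25, "marital-status": "Never-married"},
    {"age": 45, "marital-status": "Never-married"},
    {"age": 38, "marital-status": "Divorced"},
    {"age": 52, "marital-status": "Separated"},
    {"age": 30, "marital-status": None},
    {"age": 20, "marital-status": "Never-married"},
    {"age": 33, "marital-status": "Married-civ-spouse"},
]


def drop_missing(table, column):
    """Keep only rows where the given column has a value."""
    return [r for r in table if r[column] is not None]


def nominal_value_filter(table, column, allowed):
    """Keep rows whose column value is one of the selected nominal values."""
    return [r for r in table if r[column] in allowed]


def reference_filter(table, column, reference_table, reference_column):
    """Keep rows whose column value appears in a column of a second table."""
    reference_values = {ref[reference_column] for ref in reference_table}
    return [r for r in table if r[column] in reference_values]


def rule_filter(table):
    """Keep rows matching: marital-status = "Never-married" AND 20 <= age <= 40."""
    return [
        r for r in table
        if r["marital-status"] == "Never-married" and 20 <= r["age"] <= 40
    ]


present = drop_missing(rows, "marital-status")

# Same OR condition expressed in two ways, as in the exercise.
by_nominal = nominal_value_filter(
    present, "marital-status", {"Divorced", "Separated"}
)
reference = [{"marital-status": "Divorced"}, {"marital-status": "Separated"}]
by_reference = reference_filter(
    present, "marital-status", reference, "marital-status"
)

# Combined AND condition on a nominal column and a numeric range.
by_rule = rule_filter(present)
```

In this sketch the value-list filter and the reference-table filter return the same rows, which mirrors why both nodes are expected to produce the same row count in the exercise.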

 

Solution
  • The table with only the rows where marital-status = "Divorced" OR "Separated" contains 5468 rows, whether extracted with a Nominal Value Row Filter node or with a Reference Row Filter node.
  • There are 7080 rows where marital-status = "Never-married" AND 20 <= age <= 40.

A possible solution can be found in the following workflow on the EXAMPLES server:
02_ETL_Data_Manipulation/01_Filtering/05_More_Row_Filter_Examples*


* The link opens the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform version 3.2.0 or higher, installed with the installer).