Filtering

Column Filter

Three filtering modes: manually, by type, and by name.
- Manually: you choose which columns to keep and which to drop, using the Add and Remove buttons.
- By type: you keep columns based on their data type, for example all String or all Integer columns.
- By name: you keep columns whose names match a wildcard or RegEx pattern.
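Outside the node dialog, the three modes can be pictured with a small Python sketch. This is only an analogy and not KNIME's API: the `table` layout, the sample values, and the helper names (`filter_manually`, `filter_by_type`, `filter_by_wildcard`, `filter_by_regex`) are all assumptions made up for illustration.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical table: column name -> (type, values). Values are made up.
table = {
    "age": (int, [39, 50, 38]),
    "workclass": (str, ["State-gov", "Private", "Private"]),
    "marital-status": (str, ["Never-married", "Divorced", "Separated"]),
    "hours-per-week": (int, [40, 13, 40]),
}

def filter_manually(table, keep):
    """Manual mode: keep exactly the columns named in `keep`."""
    return {name: col for name, col in table.items() if name in keep}

def filter_by_type(table, col_type):
    """Type mode: keep every column whose declared type is `col_type`."""
    return {name: col for name, col in table.items() if col[0] is col_type}

def filter_by_wildcard(table, pattern):
    """Name mode with wildcards: `*` matches any run of characters, `?` one character."""
    return {name: col for name, col in table.items() if fnmatchcase(name, pattern)}

def filter_by_regex(table, pattern):
    """Name mode with a regular expression that must match the whole name."""
    rx = re.compile(pattern)
    return {name: col for name, col in table.items() if rx.fullmatch(name)}

print(list(filter_manually(table, {"age", "workclass"})))
print(list(filter_by_type(table, int)))
print(list(filter_by_wildcard(table, "*-*")))
print(list(filter_by_regex(table, r"[a-z]+")))
```

In this sketch the wildcard `*-*` keeps only the hyphenated column names, while the regular expression `[a-z]+` keeps only the names made entirely of lowercase letters.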

Row Filtering

Three matching criteria on data columns: on Strings by full or partial pattern matching, on numbers by range, and on missing values; all of them also work on collection columns. One matching criterion on row numbers: from row number to row number. One matching criterion on RowID: full or partial pattern matching. Partial pattern matching is done through wildcards and RegEx. All matching criteria can be used in Include or Exclude mode: Include keeps the matching rows, Exclude drops them.
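The criteria above can be sketched in plain Python as predicates passed to one filter function with an include/exclude switch. This is an illustrative analogy rather than the node's implementation: the sample rows, the 0-based row numbering, and every helper name (`row_filter`, `by_wildcard`, `by_range`, `by_missing`, `by_row_number`, `by_row_id`) are assumptions, and collection columns are left out for brevity.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical rows as (RowID, cells); None stands for a missing value.
rows = [
    ("Row0", {"name": "Alice", "age": 34}),
    ("Row1", {"name": "Bob", "age": None}),
    ("Row2", {"name": "Alina", "age": 52}),
    ("Row3", {"name": None, "age": 19}),
]

def row_filter(rows, matches, include=True):
    """Include mode keeps the matching rows; Exclude mode keeps the rest."""
    return [r for i, r in enumerate(rows) if matches(i, r[0], r[1]) == include]

def by_wildcard(column, pattern):
    # String criterion: wildcard pattern on a cell; missing cells never match.
    return lambda i, rid, cells: cells[column] is not None and fnmatchcase(cells[column], pattern)

def by_range(column, low, high):
    # Numeric criterion: inclusive range; missing cells never match.
    return lambda i, rid, cells: cells[column] is not None and low <= cells[column] <= high

def by_missing(column):
    # Missing-value criterion.
    return lambda i, rid, cells: cells[column] is None

def by_row_number(first, last):
    # Row-number criterion: 0-based positions, both ends inclusive (an assumption).
    return lambda i, rid, cells: first <= i <= last

def by_row_id(regex):
    # RowID criterion: the regular expression must match the whole RowID.
    rx = re.compile(regex)
    return lambda i, rid, cells: rx.fullmatch(rid) is not None

def ids(result):
    return [rid for rid, _ in result]

print(ids(row_filter(rows, by_wildcard("name", "Al*"))))
print(ids(row_filter(rows, by_missing("age"), include=False)))
```

Keeping the criterion separate from the include/exclude switch mirrors the description above: the same match can either select rows or remove them.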

Advanced Row Filters

Beyond simple filtering with the Row Filter node, there is also: filtering by more complex rules with the Nominal Value Row Filter, Rule-based Row Filter, Java Snippet Row Filter, and Reference Row Filter nodes; filtering on geographical coordinates with the Geo-coordinate Row Filter node; filtering on a time window with the Extract Time Window node; and in-database row filtering with the Database Row Filter node.

More Row Filter Examples

On the adult.csv data set: exclude rows where marital-status is missing. On the remaining rows:
a. extract rows where marital-status = "Divorced";
b. extract rows where marital-status = "Divorced" OR "Separated" using a Nominal Value Row Filter node;
c. extract rows where marital-status = "Divorced" OR "Separated" using a Reference Row Filter node;
d. extract rows where marital-status = "Never-married" AND 20
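As a rough illustration of what steps a to c compute, here is a minimal Python sketch on a handful of made-up rows. The rows are not taken from adult.csv, and the variable names are assumptions. The nodes themselves are configured in their dialogs, not in code. Case d is not covered, since its condition is incomplete here.

```python
# Made-up sample rows; the real adult.csv has more columns and many more rows.
rows = [
    {"age": 39, "marital-status": "Never-married"},
    {"age": 50, "marital-status": "Divorced"},
    {"age": 38, "marital-status": None},  # missing value
    {"age": 53, "marital-status": "Separated"},
    {"age": 28, "marital-status": "Married-civ-spouse"},
]

# First step: exclude rows where marital-status is missing.
present = [r for r in rows if r["marital-status"] is not None]

# a. Exact match on a single value.
divorced = [r for r in present if r["marital-status"] == "Divorced"]

# b. Match against a selected set of nominal values
#    (what a nominal-value filter does conceptually).
selected = {"Divorced", "Separated"}
nominal = [r for r in present if r["marital-status"] in selected]

# c. Match against values listed in a second, reference table
#    (what a reference-based filter does conceptually).
reference_table = [{"marital-status": "Divorced"}, {"marital-status": "Separated"}]
reference_values = {ref["marital-status"] for ref in reference_table}
by_reference = [r for r in present if r["marital-status"] in reference_values]

print(len(present), len(divorced), len(nominal), len(by_reference))
```

Steps b and c return the same rows; the difference is where the list of accepted values comes from: a selection made in the filter itself, or a separate table.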
