Techniques Outlier Detection

We use a sample of the airline data to detect outlier airports based on their average arrival delay. The techniques we apply are numeric outlier, z-score, DBSCAN, and isolation forest. The outliers detected by each technique are visualized on a map of the US using the KNIME OSM integration.
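As a rough illustration of the first two techniques, the sketch below flags outliers with an interquartile-range rule and with a z-score threshold. It is plain Python, not the KNIME workflow itself: the airport IDs and delay values are made up, and the multiplier `k=1.5` and the threshold of 3 are common defaults assumed here, not settings taken from the workflow.

```python
import statistics

# Made-up average arrival delays in minutes, keyed by placeholder airport IDs.
delays = {
    "AP01": 5, "AP02": 5, "AP03": 5, "AP04": 6, "AP05": 6, "AP06": 6,
    "AP07": 6, "AP08": 7, "AP09": 7, "AP10": 7, "AP11": 8, "AP12": 60,
}

def iqr_outliers(data, k=1.5):
    """Return keys whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(list(data.values()), n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [key for key, v in data.items() if v < low or v > high]

def zscore_outliers(data, threshold=3.0):
    """Return keys whose absolute z-score exceeds the threshold."""
    values = list(data.values())
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [key for key, v in data.items() if abs(v - mean) / std > threshold]

print(iqr_outliers(delays))
print(zscore_outliers(delays))
```

With a population standard deviation, the largest possible z-score in a sample of n values is sqrt(n - 1), so a threshold of 3 can only be exceeded when there are more than ten values.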

Column Filter

Three filtering modes: manually, by type, and by name.
- Manually: you decide which columns to keep and which to drop, using the Add and Remove buttons.
- By type: you decide which columns to keep based on their type, such as all Strings or all Integers.
- By name: you decide which columns to keep based on their name, using wildcards and RegEx.
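The three modes can be sketched in plain Python as follows. This is only an analogy for what the node does, not KNIME code: the schema dictionary, the type labels, and the function names are hypothetical.

```python
import fnmatch
import re

# Hypothetical schema: column name -> type label.
columns = {
    "age": "Integer",
    "workclass": "String",
    "education": "String",
    "hours-per-week": "Integer",
    "capital-gain": "Integer",
}

def filter_manual(columns, keep):
    """Manual mode: keep exactly the columns that were picked by hand."""
    return [name for name in columns if name in keep]

def filter_by_type(columns, types):
    """Type mode: keep every column whose type is in the given set."""
    return [name for name, kind in columns.items() if kind in types]

def filter_by_wildcard(columns, pattern):
    """Name mode with a wildcard pattern such as 'capital-*'."""
    return [name for name in columns if fnmatch.fnmatchcase(name, pattern)]

def filter_by_regex(columns, pattern):
    """Name mode with a regular expression that must match the whole name."""
    rx = re.compile(pattern)
    return [name for name in columns if rx.fullmatch(name)]

print(filter_manual(columns, {"age", "education"}))
print(filter_by_type(columns, {"String"}))
print(filter_by_wildcard(columns, "capital-*"))
print(filter_by_regex(columns, r".*-.*"))
```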

Advanced Row Filters

Not only simple filtering with the Row Filter node, but also: filtering according to more complex rules with the Nominal Value Row Filter, Rule-based Row Filter, Java Snippet Row Filter, and Reference Row Filter nodes; filtering on geographical coordinates with the Geo-coordinate Row Filter node; filtering on a time window with the Extract Time Window node; and in-database row filtering with the Database Row Filter node.

More Row Filter Examples

On the adult.csv data set: exclude rows where marital-status is missing. On the remaining rows: a. extract rows where marital-status = "Divorced"; b. extract rows where marital-status = "Divorced" OR "Separated" using a Nominal Value Row Filter node; c. extract rows where marital-status = "Divorced" OR "Separated" using a Reference Row Filter node; d. extract rows where marital-status = "Never-married" AND 20
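The missing-value exclusion and steps a to c might look like this in plain Python. It is a sketch of the filtering logic only, not of the KNIME nodes: the rows are made up, and the `id` field and the `reference` table are hypothetical stand-ins.

```python
# Made-up rows standing in for adult.csv; None marks a missing value.
rows = [
    {"id": 1, "marital-status": "Divorced"},
    {"id": 2, "marital-status": None},
    {"id": 3, "marital-status": "Separated"},
    {"id": 4, "marital-status": "Never-married"},
    {"id": 5, "marital-status": "Divorced"},
]

# Exclude rows where marital-status is missing.
present = [r for r in rows if r["marital-status"] is not None]

# a. Exact match on a single value.
divorced = [r for r in present if r["marital-status"] == "Divorced"]

# b. Membership in a fixed set of nominal values.
wanted = {"Divorced", "Separated"}
by_nominal = [r for r in present if r["marital-status"] in wanted]

# c. Same result, but the allowed values come from a second (reference) table.
reference = [{"marital-status": "Divorced"}, {"marital-status": "Separated"}]
ref_values = {ref["marital-status"] for ref in reference}
by_reference = [r for r in present if r["marital-status"] in ref_values]

print([r["id"] for r in divorced])
print([r["id"] for r in by_nominal])
print([r["id"] for r in by_reference])
```

Variants b and c return the same rows. The difference is where the list of allowed values lives: in the filter's configuration or in a separate table.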

Row Filtering

3 matching criteria on data columns: on Strings by full or partial pattern matching, on numbers by range, and on missing values; all of them also apply to collection columns. 1 matching criterion on row numbers: from row number to row number. 1 matching criterion on RowID: full and partial pattern matching. Partial pattern matching is obtained through wildcards and RegEx. All matching criteria can be used in Include or Exclude mode. Include keeps the matching rows. Exclude discards them.
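A minimal plain-Python sketch of these criteria and of the Include and Exclude modes is shown below. The helper names and the sample rows are hypothetical, and the row-number range is assumed to be 1-based and inclusive.

```python
import fnmatch

# Made-up rows; None marks a missing value.
rows = [
    {"name": "alpha", "age": 25},
    {"name": "beta", "age": None},
    {"name": "alphabet", "age": 41},
    {"name": "gamma", "age": 33},
]

def row_filter(rows, predicate, include=True):
    """Include keeps rows that match the predicate; Exclude keeps the rest."""
    return [r for r in rows if predicate(r) == include]

def matches_wildcard(column, pattern):
    """String criterion: full or partial match through a wildcard pattern."""
    return lambda r: r[column] is not None and fnmatch.fnmatchcase(r[column], pattern)

def in_range(column, low, high):
    """Numeric criterion: value within [low, high]; missing values never match."""
    return lambda r: r[column] is not None and low <= r[column] <= high

def is_missing(column):
    """Missing-value criterion."""
    return lambda r: r[column] is None

def by_row_number(rows, first, last, include=True):
    """Row-number criterion over a 1-based inclusive range (an assumption)."""
    return [r for i, r in enumerate(rows, start=1) if (first <= i <= last) == include]

print(row_filter(rows, matches_wildcard("name", "alpha*")))
print(row_filter(rows, in_range("age", 30, 45), include=False))
print(row_filter(rows, is_missing("age")))
print(by_row_number(rows, 2, 3))
```

In this sketch a row with a missing value fails the range test, so it survives in Exclude mode.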
