Interactive Outlier Detection

The goal of the workflow is to identify outliers in the medical claim data, such as claims with an unusually high cost for a certain disease. To find these outliers, the input data is grouped by the target variable (disease), and the interquartile range (IQR), i.e. the difference between the 3rd and 1st quartile, is computed per group for the numerical variable in question (cost of stay). Outliers are all records that do not lie inside the permitted interval [1st quartile - x * IQR, 3rd quartile + x * IQR], where the factor x (e.g. 1.5) is specified by the analyst.
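
The rule above can be sketched outside the workflow in a few lines of Python. This is only an illustration, not the workflow's own implementation: the function name find_outliers, the (disease, cost) record layout, and the sample claims are hypothetical. The quartile method ("inclusive" interpolation from the standard library) is also an assumption, and a different quartile definition can shift the bounds slightly.

```python
from collections import defaultdict
from statistics import quantiles


def find_outliers(records, x=1.5):
    """Return the (disease, cost) records outside [Q1 - x*IQR, Q3 + x*IQR] of their group.

    `records` is an iterable of (disease, cost) pairs (hypothetical layout).
    `x` is the analyst-chosen IQR multiplier.
    """
    records = list(records)

    # Group the costs by the target variable (disease).
    groups = defaultdict(list)
    for disease, cost in records:
        groups[disease].append(cost)

    # Compute the permitted interval per group.
    bounds = {}
    for disease, costs in groups.items():
        if len(costs) < 2:
            continue  # skip groups too small to compute quartiles (assumption)
        q1, _, q3 = quantiles(costs, n=4, method="inclusive")
        iqr = q3 - q1
        bounds[disease] = (q1 - x * iqr, q3 + x * iqr)

    # Keep only the records that fall outside their group's interval.
    return [
        (disease, cost)
        for disease, cost in records
        if disease in bounds
        and not (bounds[disease][0] <= cost <= bounds[disease][1])
    ]


# Made-up sample data for illustration only.
claims = [
    ("asthma", 1000), ("asthma", 1100), ("asthma", 1200),
    ("asthma", 1300), ("asthma", 9000),
    ("flu", 200), ("flu", 220), ("flu", 240), ("flu", 260),
]
outliers = find_outliers(claims, x=1.5)
```

Raising x widens the permitted interval and flags fewer records; lowering it flags more. Because the interval is computed per disease, a cost that is normal for one disease can still be an outlier for another.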

