The goal of the workflow is to identify outliers in the medical claim data, such as claims with an unusually high cost for a certain disease. To find these outliers, the input data is grouped by the target variable (disease), and the mean and standard deviation are computed for the numerical variable in question (cost of stay). Outliers are all records that deviate by more than x standard deviations from the mean of the group they belong to, where the factor x (e.g. 2) is specified by the analyst. The upper branch of the workflow performs this analysis and lets the user change the group and aggregation columns via the meta node context menu.
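The group-wise rule described above can be sketched outside of KNIME in a few lines of Python. This is only an illustrative sketch, not the workflow's implementation: the field names `disease` and `cost`, the function name `find_outliers`, and the sample values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which variant the workflow computes.

```python
from collections import defaultdict
from statistics import mean, stdev


def find_outliers(records, group_key, value_key, x=2.0):
    """Return records deviating more than x standard deviations from their group mean."""
    # Collect the numeric values of each group.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec[value_key])

    # Mean and sample standard deviation per group (assumption: sample, not population).
    stats = {}
    for group, values in groups.items():
        spread = stdev(values) if len(values) > 1 else 0.0
        stats[group] = (mean(values), spread)

    # Flag records outside mean +/- x * standard deviation of their own group.
    outliers = []
    for rec in records:
        group_mean, spread = stats[rec[group_key]]
        if spread > 0 and abs(rec[value_key] - group_mean) > x * spread:
            outliers.append(rec)
    return outliers


# Hypothetical sample data: two diseases, one claim with a very high cost.
costs_a = [10, 11, 9, 10, 12, 10, 11, 100]
costs_b = [50, 52, 49, 51, 50, 48, 51, 49]
claims = [{"disease": "A", "cost": c} for c in costs_a]
claims += [{"disease": "B", "cost": c} for c in costs_b]

outliers = find_outliers(claims, "disease", "cost", x=2.0)
print(outliers)
```

With this sample data and x = 2, only the claim with cost 100 for disease A is flagged. Note that in very small groups a single extreme value inflates the standard deviation so much that it may never exceed the threshold, so the factor x should be chosen with the group sizes in mind.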
The lower branch of the workflow refines this approach and makes it possible to identify outliers across several variables, e.g. claims with an unusually high cost for a certain disease and length of stay. To achieve this, the user has to select two group columns, such as disease and duration of stay.
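To illustrate why grouping by two columns can reveal additional outliers, the following Python sketch groups claims by a tuple of columns instead of a single one. It is a sketch under stated assumptions rather than the workflow's actual logic: the field names `disease`, `stay`, and `cost`, the function name `find_outliers_multi`, and the sample values are hypothetical, and the sample standard deviation is assumed.

```python
from collections import defaultdict
from statistics import mean, stdev


def find_outliers_multi(records, group_keys, value_key, x=2.0):
    """Flag records deviating more than x standard deviations within a multi-column group."""
    def key_of(rec):
        # A group is identified by the combination of all group column values.
        return tuple(rec[k] for k in group_keys)

    groups = defaultdict(list)
    for rec in records:
        groups[key_of(rec)].append(rec[value_key])

    # Mean and sample standard deviation per combined group (assumption: sample variant).
    stats = {
        key: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for key, values in groups.items()
    }

    outliers = []
    for rec in records:
        group_mean, spread = stats[key_of(rec)]
        if spread > 0 and abs(rec[value_key] - group_mean) > x * spread:
            outliers.append(rec)
    return outliers


# Hypothetical data: a cost of 30 is normal for long stays but unusual for short ones.
short_costs = [10, 11, 9, 10, 12, 10, 11, 30]
long_costs = [30, 31, 29, 30, 32, 30, 31, 29]
claims = [{"disease": "A", "stay": "short", "cost": c} for c in short_costs]
claims += [{"disease": "A", "stay": "long", "cost": c} for c in long_costs]

by_disease = find_outliers_multi(claims, ("disease",), "cost")
by_disease_and_stay = find_outliers_multi(claims, ("disease", "stay"), "cost")
print(by_disease)
print(by_disease_and_stay)
```

In this example, grouping by disease alone flags nothing, because a cost of 30 lies well within the overall spread. Grouping by disease and stay flags the short-stay claim with cost 30, since it is unusual among short stays.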
The workflow analyses the Basic Stand Alone (BSA) Inpatient Public Use Files (PUF) named “CMS 2008 BSA Inpatient Claims PUF” with information from 2008 Medicare inpatient claims. This is a claim-level file in which each record is an inpatient claim incurred by a 5% sample of Medicare beneficiaries. There are some demographic and claim-related variables provided in this PUF as detailed below. However, as beneficiary identities are not provided, it is not possible to link claims that belong to the same beneficiary in the CMS 2008 BSA Inpatient Claims PUF.
EXAMPLES Server: 50_Applications/14_Medical_Claims/01_Interactive_Outlier_Detection*
Download a zip-archive
* Find more about the Examples Server here.
The link will open the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform must be installed with the Installer version 3.2.0 or higher). In other cases, please use the link to the zip archive or open the provided path manually.