Predictive Maintenance Examples and The Challenge of Predicting the Unknown
An intrusion in network data, a sudden pathological status in medicine, a fraudulent payment in sales or credit card businesses, and, finally, the breakdown of a mechanical part in machinery are all examples of unknown and often undesirable events: deviations from the “normal” behavior.
Predicting the unknown in different kinds of IoT data is a well-established problem, and early discovery usually carries high value in terms of money, life expectancy, and/or time. Yet it comes with challenges! In most cases, the available data are unlabeled, so we don’t know whether past signals were anomalous or normal. Therefore, we can only apply unsupervised models that predict unknown disruptive events based on normal functioning alone.
In the field of mechanical maintenance this is called “anomaly detection”. There is a lot of data that lends itself to unsupervised anomaly detection use cases: turbines, rotors, chemical reactions, medical signals, spectroscopy, and so on. In our case here, we deal with rotor data.
The goal of this “Anomaly Detection in Predictive Maintenance” series is to be able to predict a breakdown episode without any previous examples.
Today we want to build a simple prediction model: a control chart. Our analysis builds on the first part of the series, where we standardized and time-aligned the FFT-preprocessed sensor data and explored its visual patterns. In the third part we’ll implement an auto-regressive model.
Today’s Approach: The Control Chart
A Control Chart defines the normal functioning of a process. It is a common statistical tool to determine if the variation in the process is a part of the process itself, or caused by some external factor. In our case the external factor could be a deteriorating rotor. In its simplest form, a control chart consists of a line plot of the process itself, the process average, and the upper and lower limits to the normal process behavior (Figure 1).
We define the control chart for each frequency band and sensor separately. Furthermore, we define normal functioning as the cumulative average of the signal +/- 2 times the cumulative standard deviation. If the signal wanders outside this normal band, an alarm is raised. Cumulative means that the measure is calculated on all values of the time series prior to the current value. So, a cumulative sum is the sum of past values up to the current value, a cumulative average is the average calculated on all past values up to the current value, and so on.
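To make the definition concrete, here is a minimal sketch in Python with pandas. It is not the KNIME workflow itself. The function name control_chart_alarms and the parameter k are hypothetical, and the sketch assumes that "prior to the current value" excludes the current value, which is why the cumulative statistics are shifted by one step.

```python
import pandas as pd


def control_chart_alarms(signal: pd.Series, k: float = 2.0) -> pd.DataFrame:
    """Flag values outside cumulative mean +/- k * cumulative std."""
    # expanding() computes statistics over all values up to and including
    # the current row; shift(1) restricts them to strictly earlier rows
    # (an assumption about how "prior to the current value" is meant).
    mean = signal.expanding().mean().shift(1)
    std = signal.expanding().std().shift(1)
    upper = mean + k * std
    lower = mean - k * std
    # Comparisons against NaN limits are False, so the first rows,
    # which have too little history, never raise an alarm.
    alarm = ((signal > upper) | (signal < lower)).astype(int)
    return pd.DataFrame(
        {"signal": signal, "mean": mean, "lower": lower, "upper": upper, "alarm": alarm}
    )


# Toy amplitude series (made-up values) with a jump at the end.
amplitudes = pd.Series([1.0, 1.2, 1.1, 1.0, 1.1, 5.0])
chart = control_chart_alarms(amplitudes)
print(chart)
```

In this toy series only the last value leaves the band built from the earlier values, so it is the only one flagged.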
The “Anomaly Detection. Control Chart” workflow shown in Figure 2 implements the procedure. This predictive maintenance example workflow is publicly available for download from the KNIME Hub.
We start by accessing the data, which were preprocessed in the first part of the series. The preprocessed data contain 313 spectral amplitude columns originating from 28 sensors that monitor 8 mechanical pieces of a rotor. Next, we replace the missing values in the data with the last available values.
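Replacing a missing value with the last available one is often called forward filling. Outside of KNIME, a rough pandas equivalent might look like the sketch below. The helper name fill_with_last_value and the column names are made up for illustration.

```python
import numpy as np
import pandas as pd


def fill_with_last_value(df: pd.DataFrame) -> pd.DataFrame:
    """Replace each missing value with the last available value in its column."""
    # A missing value at the very start of a column has no earlier value
    # to copy, so it stays missing.
    return df.ffill()


# Two hypothetical spectral amplitude columns with gaps.
raw = pd.DataFrame(
    {
        "band_a": [1.0, np.nan, 3.0],
        "band_b": [np.nan, 2.0, np.nan],
    }
)
filled = fill_with_last_value(raw)
print(filled)
```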
Inside the “1st level alarm” metanode, we loop over the 313 spectral amplitude columns, define the control chart for each column, flag the values outside the control limits as 1st level alarms, and finally collect the 1st level alarms column-wise. Inside the “2nd level alarm” metanode, we aggregate the 1st level alarms column-wise and flag the aggregated values as 2nd level alarms if they exceed a threshold (0.25). Finally, we visualize the 2nd level alarms over time and send an email to the person in charge of mechanical checkups if a 2nd level alarm is active.
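The two alarm levels can be sketched in Python with pandas as follows. This is an illustration under stated assumptions, not a transcription of the metanodes: the function names are hypothetical, the cumulative statistics exclude the current value, and the aggregation is assumed to be the fraction of columns in alarm at each time step, since the text does not name the aggregation function. The 0.25 threshold is the one given above.

```python
import pandas as pd


def first_level_alarms(df: pd.DataFrame, k: float = 2.0) -> pd.DataFrame:
    """Apply the control chart to every column; 1 marks a 1st level alarm."""
    # Cumulative statistics per column, based on strictly earlier rows.
    mean = df.expanding().mean().shift(1)
    std = df.expanding().std().shift(1)
    outside = (df > mean + k * std) | (df < mean - k * std)
    return outside.astype(int)


def second_level_alarms(alarms: pd.DataFrame, threshold: float = 0.25) -> pd.Series:
    """Flag time steps where the aggregated 1st level alarms exceed the threshold."""
    # Assumption: aggregate as the fraction of columns in alarm per row.
    fraction = alarms.mean(axis=1)
    return (fraction > threshold).astype(int)


# Four hypothetical amplitude columns; two of them jump at the last step.
base = [1.0, 1.2, 1.1, 1.0, 1.1]
data = pd.DataFrame(
    {
        "s1_band1": base + [5.0],
        "s1_band2": base + [5.0],
        "s2_band1": base + [1.1],
        "s2_band2": base + [1.1],
    }
)
level1 = first_level_alarms(data)
level2 = second_level_alarms(level1)
print(level2)
```

With two of the four columns in alarm at the last step, the aggregated value is 0.5, which exceeds 0.25 and raises a 2nd level alarm. A single column in alarm would give exactly 0.25 and, with the strict comparison used here, would not.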