Outlier Techniques for Credit Card Fraud Detection

May 20, 2021 — by Maarit Widmann &  Rosaria Silipo

Based on an August 2020 report by Interpol, more people have been spending time online since the start of the coronavirus pandemic, which has resulted in increased cybercrime. UK Finance also claimed remote banking fraud losses soared by 21% to reach 80 million pounds during the first half of 2020. Online fraud is on the rise.

ING reported the following in their 2019 Annual Report: “Technology can help us deal with these risks by improving customer due diligence processes and the prevention, detection, quality and speed of response to financial economic crime. For example, we developed a virtual alert handler that uses artificial intelligence to better detect suspicious transactions and customer behaviors, … An AI-based anomaly detection tool went live in September, which is used to uncover suspicious transactions in the clearing and settlement process between banks.”

Criminals are persistently finding new ways to circumvent measures put in place to prevent fraud, making their activities difficult to detect. Thus, it is difficult for advanced machine learning models to be trained from available data because fraudulent patterns are not only scarce but also change rapidly. Below, we explore how data science can be used to identify credit card fraud.

Our case study draws on the Credit Card Fraud Detection Kaggle dataset which contains more than 284,000 credit card transactions performed by EU cardholders in September 2013. Each transaction is described by 30 features: 28 principal components pulled from the original data, the transaction amount, and the date. The 28 principal components hide sensitive cardholder data. Each transaction is assigned a class label: legitimate (0) and fraudulent (1).

Most of the transactions in the dataset are legitimate; only a very small portion — 492, or 0.2% — are fraudulent. This disproportion does not allow us to train supervised machine learning models successfully, given the limited number of available examples for the fraud class. If we consider that most datasets with credit card transactions are unlabeled or that fraudulent transactions cannot be reliably manually identified, the chances of applying a supervised machine learning model successfully dwindle further. This is where outlier detection techniques can be useful. Here, we experiment with several different outlier detection techniques.

  • Quantile-based: Box plot
  • Distribution-based: Z-score
  • Cluster-based: DBSCAN
  • Neural autoencoder
  • Isolation forest

Of the 492 fraudulent transactions, we used 80 in a validation set to optimize the parameters involved in the techniques, such as thresholds, and 20 as part of a test set. A corresponding number of the more numerous legitimate transactions was added to both validation and test sets. Performances, in terms of Recall and Precision on the test set, as shown below in the graphic.

Outlier Techniques for Credit Card Fraud Detection
Fig. 1. Recall and Precision measured on the test set for the outlier detection techniques described above.

The number of false positives is incredibly high for the first two techniques, box plot and z-score, as seen from their Precision percentage. This means that to detect some 60% of fraudulent actions, most transactions are labeled as fraud alarms. Not very useful!

Download the workflow, Overview of Credit Card Fraud Detection Techniques, from the KNIME Hub.  It demonstrates different fraud detection techniques and evaluates the performance of each one on the same test set, reported in terms of Recall and Precision.

This workflow is stored in the Finance, Accounting, and Audit space on the KNIME Hub, which contains many more solution starters for you to get an idea about the capabilities KNIME provides in these fields.

See how ING is also using KNIME Software for Internal Audit 

Producing fewer false positives yet discovering at least half of the frauds, the DBSCAN, autoencoder, and isolation forest techniques are slightly more discriminant. The neural autoencoder performed quite well — after we realized that ReLU activation functions are not recommended for autoencoder hidden units, but rather sigmoid activation functions should be preferred. The DBSCAN algorithm is sensitive to its configuration but also performed well after parameter optimization.

Note: The problem of false positives will never disappear since an outlier detection technique is designed to detect what is unusual. They will frequently get mixed up with legitimate transactions unless you have a very separate pattern for fraudulent transactions — one that rarely changes. Also remember that fraudsters are doing their very best to make their fraudulent transactions look as much like legitimate transactions as possible. So, when applying these strategies, we must be prepared to face a fair number of false positives, i.e., fraud alarms that lead to nothing.

In conclusion, we could measure the performance of all implemented outlier detection techniques, in terms of recall and precision, using the few credit card transactions in the dataset labeled as fraudulent. The best compromise between number of frauds discovered (recall) and number of true alarms (precision) came from the neural autoencoder. Because of the very nature of an outlier event and, therefore, of an outlier detection technique, none of the strategies were immune from the false positives. At times, you have to work with the data that’s available: Lacking a labeled dataset, an outlier detection technique may be the best data science choice.

This is a revised version of the article, first published in InfoSecurity Magazine.

You Might Also Like

How to Detect Outliers | Top Techniques and Methods

Ever been skewed by the presence of outliers in your set of data? Anomalies, or outliers, can be a serious issue when training machine learning algorithms or...

October 1, 2018 – by Maarit Widmann &  Moritz Heine

What are you looking for?