Machine learning models can automate many kinds of processes: assessing customer creditworthiness, flagging emails as spam, detecting fraudulent transactions, forecasting weather, optimizing the electricity supply, and more! The overarching goal of all these applications is to produce accurate predictions. But how do we define “accurate”?
Alfredo Roccato and Maarit Widmann have co-authored a series of articles on quantifying accuracy with different model evaluation metrics, also called scoring metrics, during the model building process. Model evaluation metrics not only provide objective values for optimizing model parameters and comparing models, but also indicate the accuracy to expect from the model on production data.
Alfredo Roccato is an independent consultant and trainer for data science. He works with companies on Business Intelligence and Business Analytics projects. He also writes regularly for the KNIME Blog.
Maarit Widmann is a data scientist on the Evangelism team at KNIME. She is the author behind the KNIME self-paced courses and co-author of the time series analysis courses. She writes for the KNIME Blog, Dataversity, and, most recently, InfoSecurity.
The collection of articles about model evaluation has now been reviewed and published as an ebook. Download the book From Modeling to Model Evaluation free of charge from KNIME Press.
Overview of Contents: Confusion Matrix, Cohen’s Kappa Interpretation, Resampling, and more!
In this booklet, we provide an overview of scoring metrics for evaluating classification and regression models in eight articles. We begin with an introduction to the confusion matrix and class statistics (chpt. 1), numeric scoring metrics (chpt. 2), and visual scoring techniques (chpt. 3). We then take a look at a special situation that occurs frequently in practice, namely, imbalanced target classes. We apply resampling to imbalanced data and correct the prediction results for bias (chpt. 4), we explain why Cohen’s kappa works better than overall accuracy for imbalanced data (chpt. 5), and we investigate why resampling might fail for highly imbalanced data (chpt. 6). Finally, we take a look at a few examples of interpreting the model results. Here, we check how much profit a credit scoring model generates in monetary terms (chpt. 7) and how we can interpret the coefficients of a logistic regression model as percentage effects (chpt. 8).
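To give a flavor of the imbalanced-class discussion, here is a minimal Python sketch, not taken from the booklet, that compares overall accuracy with Cohen’s kappa on a binary confusion matrix. The function name `accuracy_and_kappa` and the counts are made up for illustration: 1,000 cases, of which only 50 are positive, and a model that always predicts the majority class.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Return (overall accuracy, Cohen's kappa) for a 2x2 confusion matrix."""
    n = tp + fn + fp + tn
    # Observed agreement: share of correctly classified cases.
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    if p_expected == 1:
        # Degenerate case: only one class occurs in both truth and prediction.
        return p_observed, 0.0
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Hypothetical data: 50 positives, 950 negatives,
# and a model that predicts "negative" for every case.
accuracy, kappa = accuracy_and_kappa(tp=0, fn=50, fp=0, tn=950)
print(f"accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

In this made-up case the overall accuracy is high even though the model never finds a single positive case, while Cohen’s kappa is zero because the model does no better than chance agreement.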
We wish you insightful reading!