Let’s start this post with a question. How many different algorithms do you know that can solve classification problems? There are lots! Decision Tree, Random Forest, Deep Learning, and Logistic Regression, just to name a few. How do you choose among them? It is hard to say in advance.
Traditionally, a handful of models is trained and the models are then compared to choose the one that performs best. More recently, the models in this handful are instead combined, so that they work together to give the best prediction.
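To make the two strategies concrete, here is a minimal sketch in plain Python. It assumes the models have already been trained and have each produced predictions for the same test rows. The model names, labels, and predictions are made up for illustration, and the helper functions `pick_best` and `majority_vote` are hypothetical names, not part of KNIME or of any library.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def pick_best(y_true, predictions):
    """Traditional strategy: keep the single model with the highest accuracy."""
    return max(predictions, key=lambda name: accuracy(y_true, predictions[name]))

def majority_vote(predictions):
    """Combined strategy: every model votes on each row, the most common label wins."""
    rows = zip(*predictions.values())
    return [Counter(row).most_common(1)[0][0] for row in rows]

# Made-up test labels and per-model predictions (illustration only).
y_true = [1, 0, 1, 1, 0, 1]
predictions = {
    "tree":     [1, 0, 0, 1, 0, 1],
    "forest":   [1, 0, 1, 1, 1, 0],
    "logistic": [0, 1, 1, 1, 0, 1],
}

best = pick_best(y_true, predictions)   # name of the most accurate single model
ensemble = majority_vote(predictions)   # one combined prediction per row

print("best single model:", best, accuracy(y_true, predictions[best]))
print("majority vote accuracy:", accuracy(y_true, ensemble))
```

In this toy data each model errs on different rows, so the vote corrects every individual mistake. With real models the errors often overlap, and combining them is not guaranteed to beat the best single model.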
A handful of models, as we have informally called it, is normally referred to as a bag of models. These models can be trained with KNIME native nodes, but also with R, Weka, Python, H2O, Java, etc.
As far as model selection goes, there are again different ways to compare model performance: accuracy, confusion matrix, Cohen’s Kappa, A/B test, and ROC curve, just to name a few.
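Three of these measures can be computed directly from the true and predicted labels. The sketch below does so in plain Python. It is an illustration under simple assumptions (hard class labels, made-up data), not the way KNIME computes them internally, and the function names are chosen here for readability.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance agreement.

    Undefined (division by zero) when chance agreement is exactly 1,
    e.g. when both lists contain a single identical class.
    """
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that truth and prediction pick the same
    # class if they were independent with these class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)

# Made-up labels and predictions (illustration only).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)

print("confusion counts:", dict(cm))
print("accuracy:", acc)
print("Cohen's kappa:", kappa)
```

Accuracy alone can look flattering on unbalanced classes. Cohen’s Kappa subtracts the agreement expected by chance, which is why the two numbers can differ noticeably for the same predictions.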
To provide some orientation in this variety, this video offers a review of the analytics techniques available in KNIME Analytics Platform.