Ensemble Learning

With the KNIME Ensemble Learning Plugin we provide basic methodologies to perform ensemble learning in KNIME. However, some of the nodes are quite useful in other scenarios as well.

Basic functionality: The main idea behind ensemble learning is to combine multiple models into one. Because a KNIME port carries exactly one model, it was previously not possible to combine an arbitrary number of models into one. The ensemble learning plugin makes it possible to store the full information about a model in a single data cell. Use the Model to Cell (resp. PMML to Cell) node to convert a model port containing a model into a cell containing the same model. For the reverse conversion, use the Cell to Model (resp. Cell To PMML) node. With this functionality, KNIME can now store lists of models.
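
As a rough analogy outside of KNIME, the idea of putting a whole model into a single table cell can be sketched in Python. This is not the plugin's API: the table layout, the column names (iteration, model_cell) and the dictionary "models" are hypothetical, and pickle merely stands in for whatever serialization the nodes use internally.

```python
import pickle

# Hypothetical stand-ins for trained models; any picklable object would do.
models = [{"kind": "stump", "threshold": t} for t in (0.5, 1.5, 2.5)]

# Analogy to "Model to Cell": each model becomes one opaque value in a row.
table = [
    {"iteration": i, "model_cell": pickle.dumps(model)}
    for i, model in enumerate(models)
]

# Analogy to "Cell to Model": turn each cell back into a usable model object.
restored = [pickle.loads(row["model_cell"]) for row in table]
```

Once models are ordinary cell values, a list of them is simply a table with one row per model, which is what makes an arbitrary number of models manageable.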

There are also three more general ensemble learning methods incorporated in this plugin.

The Bagging nodes first train a set of models, each on only a subset of the data. Afterwards, the test data is predicted with each of the models, and finally the voting node determines the majority class.

Bagging Example
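
The bagging procedure described above (train several models on subsets, predict with each, take the majority vote) can be sketched in plain Python. This is a minimal sketch under assumptions, not the implementation behind the KNIME nodes: the subsets are drawn as bootstrap samples (with replacement), and the nearest-centroid base learner as well as all function names are made up for illustration.

```python
import random
from collections import Counter

def train_centroid(rows):
    """Toy base learner (assumption): one mean feature vector per class."""
    sums, counts = {}, Counter()
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid(model, features):
    """Return the class whose mean vector is closest to the given features."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(model, key=lambda label: squared_distance(model[label]))

def bagging_fit(rows, n_models=11, seed=0):
    """Train n_models base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        models.append(train_centroid(sample))
    return models

def bagging_predict(models, features):
    """Let every model vote and return the majority class."""
    votes = Counter(predict_centroid(model, features) for model in models)
    return votes.most_common(1)[0][0]
```

Because the base learner is only touched through train_centroid and predict_centroid, swapping in a different model type means replacing just those two functions, which mirrors exchanging the learner and predictor nodes in the workflow.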

The Boosting nodes (Learner and Predictor) apply the AdaBoost.SAMME algorithm to the data set together with the chosen model. Please note that in both the bagging and the boosting nodes you can exchange the model type, that is, replace the learner and predictor nodes with those of any model you like.
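
To illustrate what AdaBoost.SAMME does, here is a minimal Python sketch. It follows the published multi-class update (model weight log((1 - err) / err) + log(K - 1) for K classes), but several details are assumptions rather than the behavior of the KNIME nodes: the weak learner is a hand-written weighted decision stump, row weights are passed to the learner directly instead of being used for resampling, and all function names are hypothetical.

```python
import math
from collections import defaultdict

def fit_stump(X, y, w):
    """Weighted decision stump (toy weak learner, an assumption)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left, right = defaultdict(float), defaultdict(float)
            for x, label, weight in zip(X, y, w):
                (left if x[j] <= t else right)[label] += weight
            l_lab = max(left, key=left.get)
            r_lab = max(right, key=right.get) if right else l_lab
            # Weighted error: total weight minus the weight predicted correctly.
            err = sum(w) - left[l_lab] - (right[r_lab] if right else 0.0)
            if best is None or err < best[0]:
                best = (err, j, t, l_lab, r_lab)
    return best[1:]

def predict_stump(stump, x):
    j, t, l_lab, r_lab = stump
    return l_lab if x[j] <= t else r_lab

def samme_fit(X, y, n_rounds=10):
    """Return a list of (alpha, stump) pairs; expects at least two classes."""
    k = len(set(y))
    if k < 2:
        raise ValueError("SAMME needs at least two classes")
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        miss = [predict_stump(stump, x) != label for x, label in zip(X, y)]
        err = sum(wi for wi, m in zip(w, miss) if m) / sum(w)
        if err >= 1.0 - 1.0 / k:
            break  # weak learner is no better than random guessing
        perfect = err == 0.0
        err = max(err, 1e-10)  # keep the logarithm finite
        alpha = math.log((1.0 - err) / err) + math.log(k - 1)
        ensemble.append((alpha, stump))
        if perfect:
            break  # nothing left to reweight
        # Increase the weight of misclassified rows, then renormalize.
        w = [wi * math.exp(alpha) if m else wi for wi, m in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def samme_predict(ensemble, x):
    """Each model votes for its predicted class with weight alpha."""
    scores = defaultdict(float)
    for alpha, stump in ensemble:
        scores[predict_stump(stump, x)] += alpha
    return max(scores, key=scores.get)
```

On a three-class problem a single stump can separate at most two classes, so the combination of several reweighted stumps is what allows all three classes to be predicted.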

The last type of ensemble learning is delegating. In the delegating algorithm, a new model is built for all data that the previous model classified incorrectly. The delegating loop node combination is also used quite extensively for various other use cases: basically, it can be used in all cases where the loop start node needs information from the loop end.

Delegation Example
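
The training side of delegating, as described above, can be sketched as a loop in which the misclassified rows of one iteration become the training data of the next. This is only an illustrative sketch: the stopping rule (max_models), the one-dimensional nearest-mean learner and all names are assumptions, and how the resulting chain of models is used for prediction is not covered here.

```python
from collections import defaultdict

def train_nearest_mean(rows):
    """Toy learner (assumption): predict the class with the closest mean."""
    groups = defaultdict(list)
    for x, label in rows:
        groups[label].append(x)
    means = {label: sum(xs) / len(xs) for label, xs in groups.items()}

    def predict(x):
        return min(means, key=lambda label: abs(x - means[label]))

    return predict

def delegating_fit(rows, train, max_models=5):
    """Train a chain of models, each on the rows the previous one got wrong."""
    models, remaining = [], list(rows)
    while remaining and len(models) < max_models:
        predict = train(remaining)
        models.append(predict)
        # The loop end feeds the misclassified rows back to the loop start.
        remaining = [(x, label) for x, label in remaining if predict(x) != label]
    return models, remaining
```

The feedback of data from the end of one iteration to the start of the next is the general pattern here, which is why the same loop structure is useful beyond ensemble learning.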