Spark Executor

Spark MLlib Decision Tree

This workflow demonstrates the use of the Spark MLlib Decision Tree Learner and the Spark Predictor. It also shows how to convert categorical columns into numerical columns, which is necessary because the MLlib algorithms support only numerical features and labels.
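
The categorical-to-numerical conversion can be illustrated with a minimal sketch in plain Python. It does not use Spark or the workflow's own nodes; the function names (build_index, encode_column) and the sample column are hypothetical and serve only to show the idea of replacing each distinct category with a numeric index.

```python
def build_index(values):
    """Map each distinct category to an integer, in order of first appearance."""
    index = {}
    for value in values:
        if value not in index:
            index[value] = len(index)
    return index


def encode_column(values, index):
    """Replace every category with its numeric index, as a float."""
    return [float(index[value]) for value in values]


# Hypothetical categorical column.
colors = ["red", "green", "red", "blue"]
color_index = build_index(colors)
encoded = encode_column(colors, color_index)
```

Keeping the mapping (color_index here) around matters in practice: the same mapping has to be applied to any data that is later passed to the predictor, otherwise the numeric codes would not refer to the same categories.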



Modularized Spark Scripting

This workflow demonstrates the use of the different Spark Java Snippet nodes to read a text file from HDFS, parse it, filter it, and write the result back to HDFS.
You may also want to look at the snippet templates that each of these nodes provides. To do so, open the configuration dialog of a Spark Java Snippet node and go to the Templates tab.
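
The modular structure of the workflow (one step each for reading, parsing, filtering and writing) can be sketched in plain Python. This is only an illustration under stated assumptions: it uses a local temporary directory instead of HDFS, it does not use Spark or the Spark Java Snippet nodes, and the file names, the comma-separated "name,age" layout and the age filter are all made up for the example.

```python
import os
import tempfile


def read_lines(path):
    """Read step: return the lines of a text file without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def parse(lines, sep=","):
    """Parse step: split each non-empty line into fields."""
    return [line.split(sep) for line in lines if line.strip()]


def keep_rows(rows, predicate):
    """Filter step: keep only the rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


def write_rows(rows, path, sep=","):
    """Write step: join the fields again and write one row per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(sep.join(row) + "\n")


def run_pipeline(in_path, out_path):
    """Chain the four steps; the age threshold is an assumed example filter."""
    rows = parse(read_lines(in_path))
    kept = keep_rows(rows, lambda row: int(row[1]) >= 18)
    write_rows(kept, out_path)
    return kept


# Hypothetical input, written to a local temporary directory (not HDFS).
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "input.txt")
out_path = os.path.join(workdir, "output.txt")
with open(in_path, "w", encoding="utf-8") as f:
    f.write("alice,30\nbob,12\ncarol,18\n")

result = run_pipeline(in_path, out_path)
```

Keeping each step in its own function mirrors the idea of using a separate snippet node per task: any single step can be replaced or tested without touching the others.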

