This workflow uses a portion of the Irish Energy Meter dataset and presents a simple analysis based on the whitepaper "Big Data, Smart Energy, and Predictive Analytics". It is intended to highlight the Big Data and Spark functionality in the KNIME 3.6 release. The workflow creates a Local Big Data Environment, loads the meter dataset into Hive, and then transfers it into Spark. A series of Spark SQL nodes creates datetime fields, and further Spark nodes aggregate energy usage over those fields. Inside the wrapped metanode, the workflow runs PCA and k-means using Spark nodes and produces some simple visualizations of the clustered data. Finally, it writes the clustered data out to both Hive and Parquet.
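The datetime and aggregation steps can be illustrated outside KNIME with a minimal sketch. This is a plain-Python stand-in, not the Spark SQL or Spark nodes the workflow uses. The sample readings, the timestamp format, and the names `add_datetime_fields` and `total_usage_by` are all assumptions made for illustration; the real dataset's layout may differ.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (meter_id, timestamp, kWh). Made up for illustration,
# not taken from the Irish Energy Meter dataset.
READINGS = [
    (1001, "2009-07-14 00:30:00", 0.25),
    (1001, "2009-07-14 18:00:00", 1.00),
    (1001, "2009-07-15 18:30:00", 0.50),
    (1002, "2009-07-14 18:00:00", 2.00),
    (1002, "2009-07-18 09:00:00", 0.75),
]


def add_datetime_fields(meter_id, timestamp, kwh):
    """Derive calendar fields from one reading (the 'create datetime fields' step)."""
    ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return {
        "meter_id": meter_id,
        "kwh": kwh,
        "year": ts.year,
        "month": ts.month,
        "hour": ts.hour,
        "is_weekend": ts.weekday() >= 5,  # Monday is 0, so 5 and 6 are Sat/Sun
    }


def total_usage_by(rows, *fields):
    """Sum kWh per meter over the given derived fields (a GROUP BY equivalent)."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["meter_id"],) + tuple(row[f] for f in fields)
        totals[key] += row["kwh"]
    return dict(totals)


rows = [add_datetime_fields(*r) for r in READINGS]
usage_by_hour = total_usage_by(rows, "hour")
usage_by_weekend = total_usage_by(rows, "is_weekend")
```

Each aggregate is keyed by meter ID plus the chosen datetime fields, which gives one usage profile per meter. Profiles of this shape are the kind of per-meter features that a later dimensionality-reduction and clustering step, such as the PCA and k-means in the wrapped metanode, could work on.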
EXAMPLES Server: 10_Big_Data/02_Spark_Executor/09_Big_Data_Irish_Meter_on_Spark_only*
Download a zip-archive
* Find more about the Examples Server here.
The link will open the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform must be installed with the Installer version 3.2.0 or higher). In other cases, please use the link to a zip-archive or open the provided path manually.