HDI Hive Spark

This workflow reads CENSUS data from a Hive database in HDInsight, moves the data to Spark for some ETL operations, and finally trains a Spark decision tree model to predict COW values from all other attributes. The data come from the new CENSUS dataset, which is publicly available for download (http://www.census.gov/programs-surveys/acs/data/pums.html). A full explanation of all attributes can be found at: http://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PU…
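
The workflow itself is built from KNIME nodes rather than code. For readers who think in Spark terms, the three stages described above (read from Hive, ETL in Spark, train a decision tree on COW) might look roughly like the following PySpark sketch. The table name `census.pums`, the choice to keep only numeric attributes, the 80/20 split, and both function names are assumptions made for illustration; none of them come from the workflow.

```python
# Illustrative sketch only; not the KNIME workflow's actual implementation.
# Assumed: a Hive table named "census.pums" and numeric-only features.

NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double"}


def numeric_feature_columns(dtypes, label):
    """Return the names of numeric columns, excluding the label column.

    `dtypes` is a list of (name, type) pairs as returned by DataFrame.dtypes.
    The label is matched case-insensitively because Hive lowercases names.
    """
    return [
        name
        for name, dtype in dtypes
        if name.lower() != label.lower()
        and (dtype in NUMERIC_TYPES or dtype.startswith("decimal"))
    ]


def train_cow_tree(table="census.pums", label="COW"):
    # Imported lazily so the helper above works without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import DecisionTreeClassifier
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator
    from pyspark.ml.feature import StringIndexer, VectorAssembler
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("hdi-hive-spark-sketch")
        .enableHiveSupport()
        .getOrCreate()
    )

    # 1. Read the census table from Hive.
    df = spark.table(table)

    # 2. ETL: keep the label plus numeric attributes, drop incomplete rows.
    label_col = next(n for n, _ in df.dtypes if n.lower() == label.lower())
    features = numeric_feature_columns(df.dtypes, label_col)
    df = df.select([label_col] + features).na.drop()

    # 3. Train a decision tree that predicts the label from the other columns.
    indexer = StringIndexer(
        inputCol=label_col, outputCol="label_index", handleInvalid="skip"
    )
    assembler = VectorAssembler(inputCols=features, outputCol="features")
    tree = DecisionTreeClassifier(labelCol="label_index", featuresCol="features")

    train, test = df.randomSplit([0.8, 0.2], seed=42)
    model = Pipeline(stages=[indexer, assembler, tree]).fit(train)

    evaluator = MulticlassClassificationEvaluator(
        labelCol="label_index", predictionCol="prediction", metricName="accuracy"
    )
    return model, evaluator.evaluate(model.transform(test))
```

On a cluster with Hive access, `train_cow_tree()` would return the fitted pipeline and its accuracy on the held-out rows. Treating every numeric attribute as a continuous feature keeps the sketch short; many PUMS attributes are coded categories, so a real pipeline would likely handle them differently.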

Resources

EXAMPLES Server: 11_Partners/01_Microsoft/04_HDI_Hive_Spark*
Download a zip-archive

* Find more about the Examples Server here.
The EXAMPLES Server link opens the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform version 3.2.0 or higher, installed with the installer). In all other cases, please use the zip-archive link or open the provided path manually.