This workflow trains several data analytics models on Hadoop and Spark and automatically selects the best one to predict departure delays from a selected airport. The data is the airline dataset, downloadable from http://stat-computing.org/dataexpo/2009/the-data.html. A flight counts as delayed when its departure delay exceeds 15 minutes. The default airport is ORD. The workflow covers data reading, data blending, ETL, guided analytics, dimensionality reduction, advanced data mining models, and model selection. To speed up computationally intensive operations it uses Hadoop, Spark, in-memory processing, parallelization, grid computing, multithreading, and/or in-database processing. The data is available in knime://knime.workflow/data/1_Input
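The delay definition above (departure delay greater than 15 minutes, default airport ORD) can be illustrated outside KNIME with a minimal Python sketch. This is not how the workflow itself is implemented. It assumes the CSV columns `Origin` and `DepDelay` and the `NA` marker for missing values; the function name `label_departure_delays` and the sample rows are hypothetical, made up for illustration.

```python
import csv
import io

DELAY_THRESHOLD_MIN = 15   # from the description: a delay is > 15 min
DEFAULT_AIRPORT = "ORD"    # from the description: default selected airport


def label_departure_delays(rows, airport=DEFAULT_AIRPORT,
                           threshold=DELAY_THRESHOLD_MIN):
    """Yield (row, is_delayed) for flights departing from `airport`.

    Assumes each row is a dict with 'Origin' and 'DepDelay' keys
    (assumed column names) and that missing delays are '' or 'NA'.
    """
    for row in rows:
        if row["Origin"] != airport:
            continue  # keep only the selected airport
        raw = row["DepDelay"]
        if raw in ("", "NA"):
            continue  # skip flights with no recorded departure delay
        # Strictly greater than the threshold, so exactly 15 min is on time.
        yield row, float(raw) > threshold


# Made-up sample rows, not taken from the real dataset.
SAMPLE = """Origin,DepDelay
ORD,20
ORD,15
ORD,NA
SFO,45
ORD,-3
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
labels = [delayed for _, delayed in label_departure_delays(reader)]
print(labels)
```

The resulting boolean label is the kind of binary target a classifier would be trained on. Passing a different `airport` value mirrors selecting another airport in the workflow.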
EXAMPLES Server: 50_Applications/28_Predicting_Departure_Delays/02_Scaling_Analytics_w_BigData*
Download a zip-archive
* Find out more about the Examples Server here.
The link will open the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform must be installed with the Installer, version 3.2.0 or higher). In other cases, please use the link to the zip archive or open the provided path manually.