Predict DepartureDelays with MicrosoftR

Here we train a model to predict flight delays using three different approaches:
- Upper branch: full Microsoft R (pure R script)
- Middle branch: hybrid KNIME and Microsoft R
- Lower branch: full KNIME (taking the data from Microsoft R)

This workflow and the integration of Microsoft R in KNIME Analytics Platform are described in this YouTube video: https://youtu.be/HtlpvydyZD0. To use Microsoft R in KNIME, change the path to R.exe under Preferences -> KNIME -> R so that it points to your Microsoft R installation.

HDI Hive KNIME

This workflow reads CENSUS data from a Hive database in HDInsight, performs some in-database processing on Hive, and finally trains a KNIME decision tree model to predict COW values based on all other attributes.

SQL Server InDB Processing (Azure)

This workflow shows how to perform in-database processing on Microsoft SQL Server. It starts with a dedicated SQL Server Connector node, then shows column filtering, row filtering, aggregation, joining, and sorting. Many more in-database operations are possible, either via nodes with a graphical UI such as the ones below or via the Database Query node, which accepts custom, hand-written SQL queries. The Database Connection Table Reader node at the end of most branches passes the resulting data into KNIME Analytics Platform.
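To make the in-database idea concrete, here is a minimal sketch of the same five operations (column filtering, row filtering, aggregation, joining, sorting) written as a single SQL query, the kind of statement one might type into a Database Query node. It is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the `customers` and `orders` tables, their columns, and the function name are hypothetical and do not come from the workflow.

```python
import sqlite3


def run_in_database_demo():
    """Run filter, join, aggregate and sort steps inside the database.

    sqlite3 is only a stand-in for SQL Server; tables and columns are
    hypothetical examples, not the workflow's actual data.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical sample tables.
    cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
    cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
    cur.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ann", "EU"), (2, "Bob", "US"), (3, "Cho", "EU")],
    )
    cur.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 1, 50.0), (2, 1, 150.0), (3, 2, 20.0), (4, 3, 300.0), (5, 3, 5.0)],
    )

    query = """
        SELECT c.name, SUM(o.amount) AS total          -- column filtering + aggregation
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id    -- joining
        WHERE o.amount >= 10                           -- row filtering
        GROUP BY c.name
        ORDER BY total DESC                            -- sorting
    """
    # Only the small, already aggregated result leaves the database.
    rows = cur.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, total in run_in_database_demo():
        print(name, total)
```

The point of the pattern is that the database does the heavy work and only the final rows are transferred to the client, which corresponds to the step where the reader node pulls the data into KNIME.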

HDI Hive Spark

This workflow reads CENSUS data from a Hive database in HDInsight; it then moves to Spark where it performs some ETL operations; and finally it trains a Spark decision tree model to predict COW values based on all other attributes.
