This course focuses on how to use KNIME Analytics Platform for in-database processing and writing/loading data into a database. Get an introduction to the Apache Hadoop ecosystem and learn how to write/load data into your big data cluster running on premises or in the cloud on Amazon EMR, Azure HDInsight, Databricks Runtime or Google Dataproc. Learn about the KNIME Spark Executor, preprocessing with Spark, machine learning with Spark, and how to export data back into KNIME/your big data cluster.
This course lets you put everything you’ve learnt into practice in a hands-on session based on the following use case: eliminating missing values by predicting them from other attributes.
This is an instructor-led course consisting of four 75-minute online sessions run by one of our KNIME data scientists. Each session has an exercise for you to complete at home, and we will go through the solution together at the start of the following session. The course concludes with a 15- to 30-minute wrap-up session.
- Session 1: Introduction to KNIME and the Database Extension
- Session 2: Data Processing in a Traditional Database
- Session 3: Working with Hadoop and Spark
- Session 4: Machine Learning & Summary
You should be an advanced KNIME user and ideally have already built some workflows. This course doesn’t provide an introduction to KNIME Analytics Platform - it focuses more on managing big data with KNIME Analytics Platform.
You’ll receive a Zoom link with your registration confirmation. Make sure you have a stable internet connection!
Sure! The sessions will be recorded and you’ll have access to each one for seven days from the time the session is over.
Absolutely - fire away!
Your own laptop, ideally pre-installed with the latest version of KNIME Analytics Platform, which you can download at knime.com/downloads.
Download the latest free, open-source version of KNIME here: knime.com/download