Big Data Extensions

KNIME Big Data Extensions are now open source and included in KNIME Analytics Platform.


KNIME Big Data Extensions integrate the power of Apache Hadoop and Apache Spark with KNIME Analytics Platform and KNIME Server. Our software takes the confusion out of big data by making it accessible within our familiar analytics environment. These extensions consist of two complementary node libraries:

  • KNIME Big Data Connectors enable you to import/export HDFS data and perform SQL analytics within Hive and Impala.
  • KNIME Extension for Apache Spark enables you to create and run Apache Spark applications from within KNIME Analytics Platform or KNIME Server, unleashing the power of scalable analytics. You can also read and write data in HDFS, Hive, and Impala from within Apache Spark.

The workflow shown in this video is located on the EXAMPLES server under: 50_Applications/28_Predicting_Departure_Delays/02_Scaling_Analytics_w_BigData*

Unleash the Power of Hadoop

Migrating your analytics to big data is a matter of swapping a few nodes in existing workflows. KNIME Big Data Extensions bring you into the Hadoop ecosystem with support for enterprise-grade, industry-leading Hadoop distributions. Query Hive data and apply advanced analytics in Apache Spark within a single, visual KNIME workflow and make Hadoop accessible without coding.

A Powerful Combination

KNIME Big Data Extensions bring a familiar, easy-to-use graphical approach to big data problems. These libraries blend the power of KNIME Analytics Platform with Hadoop to expand the advantages of both.

  • SQL-style big data querying
  • Sophisticated data mining
  • Advanced predictive analytics
  • In-memory processing
  • Extensive additional functionality

Advanced Features

  • Connect to popular Hadoop distributions
  • Integrate Apache Spark with more than 2000 native KNIME nodes using familiar KNIME workflows
  • Mix & match remote and distributed computing as needed
  • Import predictive models into Apache Spark with PMML models generated from KNIME workflows
  • Enable a popular suite of machine learning algorithms via MLlib integration
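
To make the PMML item above a little more concrete: PMML is an XML format for describing predictive models so they can move between tools. The sketch below is not KNIME or Spark code and does not show how either product handles PMML internally. It only parses a small hand-written PMML regression model with Python's standard library and scores one record. The field names (distance, dep_hour, delay), the coefficients, and the helper functions are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written PMML 4.2 document (hypothetical fields and coefficients).
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <DataDictionary numberOfFields="3">
    <DataField name="distance" optype="continuous" dataType="double"/>
    <DataField name="dep_hour" optype="continuous" dataType="double"/>
    <DataField name="delay" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="distance"/>
      <MiningField name="dep_hour"/>
      <MiningField name="delay" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="2.0">
      <NumericPredictor name="distance" coefficient="0.01"/>
      <NumericPredictor name="dep_hour" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# PMML elements live in a versioned XML namespace.
NS = {"p": "http://www.dmg.org/PMML-4_2"}


def load_linear_model(pmml_text):
    """Return (intercept, {field: coefficient}) from a PMML regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("p:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model: intercept + sum(coefficient * value)."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


if __name__ == "__main__":
    model = load_linear_model(PMML_DOC)
    print(score(model, {"distance": 500.0, "dep_hour": 10.0}))
```

Because the model is plain XML, any system that understands the format can read the same intercept and coefficients, which is what makes a model built in one environment usable in another.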

     


* The link opens the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform 3.2.0 or higher, installed with the installer).