White Papers

Useful white papers from KNIME.

Guided Analytics
Customer Segmentation comfortably from a Web Browser. Combining Data Science and Business Expertise (2016)

This whitepaper addresses the following two tasks:

  • Create the analytics core of a Customer Segmentation application
  • Create a Web User Interface to inject business experts' knowledge into the final results

Workflow is available on the EXAMPLES Server under 50_Applications/24_Customer_Segmentation_UseCase

Guided Data Cleaning through a Web Interface. Data Cleaning with Guided Analytics (2016)

This whitepaper applies some of the dimensionality reduction techniques described in another KNIME whitepaper, “Seven Techniques for Data Dimensionality Reduction”. Beyond applying them, it adds an appealing, intuitive, step-by-step guided web interface through the KNIME WebPortal. This makes the workflow a useful instrument in the hands not only of domain experts, but also of data analysis beginners and other users.

Workflow is available on the EXAMPLES Server under 50_Applications/25_DataCleaning_WebPortal

Emil, the Teacher Bot. WebPortal GUI, Text Processing, Machine Learning, and Active Learning (2018)

The list of KNIME use cases has a new addition: how to make a bot, more specifically, how to make a teacher bot.

This bot is charged with helping newbies find answers to their initial questions about KNIME from a large number of technical tutorials.

Building a bot involves a series of steps: first, a web-based user interface to ask the question; next, a set of text processing (NLP) functions to parse it; and finally, a machine learning model to associate the right group of tutorials with the question that was asked.

For this particular project, we also decided to adopt an active learning strategy to use an unlabeled dataset to train the model.

IoT

Anomaly Detection

Anomaly Detection I: Time Alignment and Visualization for Anomaly Detection (2015)

Here we show how we prepared and visualized FFT-transformed sensor data from a piece of rotor equipment: frequency binning, time alignment, and visualization.

  • Download pdf
  • Workflow "Time Alignment and Visualization" is available on the KNIME EXAMPLES server under 050_Applications/050017_AnomalyDetection/Pre-processing
  • 050_Applications/050017_AnomalyDetection/data also contains a reduced version of the original data set
  • Download full data set

Anomaly Detection II: Anomaly Detection in Predictive Maintenance with Time Series Analysis (2015)

This second whitepaper of the anomaly detection series approaches the prediction of the “unknown” and possibly catastrophic event from a time series perspective. Control charts and auto-regressive models are used to trigger alarms when the underlying system starts drifting away from its known working condition.

  • Download pdf
  • Workflow and data can be found on the KNIME EXAMPLES server under 050_Applications/050017_AnomalyDetection/Time Series Analysis
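
As a rough illustration of the control chart idea mentioned above, the following Python sketch raises an alarm whenever a new reading leaves a band around the mean of a reference window recorded under normal working conditions. It is not taken from the whitepaper or the KNIME workflow: the function name `control_chart_alarms`, the band width `k = 3`, and the sample readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def control_chart_alarms(reference, signal, k=3.0):
    """Return the indices in `signal` that fall outside mean +/- k * std of `reference`.

    `reference` is assumed to hold readings taken while the system worked normally.
    """
    center = mean(reference)
    spread = stdev(reference)  # sample standard deviation of the reference window
    lower, upper = center - k * spread, center + k * spread
    return [i for i, x in enumerate(signal) if x < lower or x > upper]


# Hypothetical readings: a stable reference window, then new data that drifts upward.
reference = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]
signal = [10.1, 9.9, 10.2, 12.5, 13.0]
alarms = control_chart_alarms(reference, signal)
print(alarms)
```

An auto-regressive variant would replace the fixed band with a band around the prediction error of a model fitted on the reference window.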

Energy

KNIME opens the Doors to Big Data. A practical Example of integrating any Big Data Platform into KNIME (2015)

In this whitepaper we show step by step how to integrate a big data platform into a KNIME workflow, using dedicated and/or generic connector nodes to connect to the platform, together with SQL helper nodes. The example workflow can be found on the EXAMPLES server under 004_Database/004005_Energy_Prepare_Data (Big Data).

Big data, Smart Energy, and Predictive Analytics (2013)

This whitepaper focuses on smart energy data from the Irish Smart Energy Trials. The first goal is to identify a few groups with similar electricity usage behavior, in order to create customized contract offers. The second goal is a reliable prediction of the overall energy consumption using time series prediction techniques.

Cities

Taming the Internet of Things with KNIME: Data Enrichment, Visualization, Time Series Analysis, and Optimization (2014)

This paper describes a number of techniques for data enrichment through responses from external RESTful services, for analytics and model optimization, and for visualization, from R graphic libraries to geo-localization with OpenStreetMap and network visualization.

Social Media
Analyzing the Web from Start to Finish: Knowledge Extraction from a Web Forum using KNIME (2013)

This whitepaper covers all steps to extract knowledge from a web forum: it crawls the forum and downloads the data, calculates some simple statistics, detects the discussed topics, and shows the experts for each topic.

Usable Customer Intelligence from Social Media Data: Network Analytics meets Text Mining (2012)

Text mining and network analytics are combined here to better position negative and positive users in context with their weight as influencers or followers inside the discussion forum.

Social Media, Recommendation Engines and Real-Time Model Execution with KNIME and ADAPA (2011)

This whitepaper shows an example of how advanced analytics combined with real-time execution can provide an end-to-end solution, from model development to operational deployment and real-time execution within any business process.

Web Analytics
Geolocalization of KNIME Downloads as a static Report and as a Movie (2014)

This whitepaper extracts IP addresses from a web log file and transforms them to points on a world map, producing a report with images and a movie of the daily IP addresses. Example workflows are available on the EXAMPLES Server under 008_WebAnalytics_and_OpenStreetMap.
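
The first step described above, pulling IP addresses out of a web log and grouping them by day, can be sketched in Python as follows. This is only an illustration, not the KNIME workflow: it assumes the log uses the Common Log Format, the function name `ips_per_day` is hypothetical, and the sample lines use reserved documentation IP addresses. Mapping each address to a point on the map would additionally need a geolocation database or service, which is left out here.

```python
import re
from collections import defaultdict

# Assumed layout (Common Log Format): IP address first, timestamp in square brackets.
LINE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}) \S+ \S+ \[(\d{2}/\w{3}/\d{4})")


def ips_per_day(lines):
    """Group the distinct IP addresses found in `lines` by calendar day."""
    daily = defaultdict(set)
    for line in lines:
        match = LINE.match(line)
        if match:  # lines that do not fit the assumed layout are skipped
            ip, day = match.groups()
            daily[day].add(ip)
    return dict(daily)


# Hypothetical log lines for illustration only.
sample = [
    '192.0.2.10 - - [10/Oct/2014:13:55:36 +0000] "GET /downloads HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2014:14:02:11 +0000] "GET /downloads HTTP/1.1" 200 2326',
    '192.0.2.10 - - [11/Oct/2014:09:12:01 +0000] "GET /downloads HTTP/1.1" 200 2326',
    "malformed line",
]
daily = ips_per_day(sample)
print(daily)
```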

ETL - Pre-Processing
Taking a Proactive Approach to GDPR with KNIME (2018)

This whitepaper explains how data scientists can take a proactive approach to GDPR with KNIME Analytics Platform. It highlights some of the best practices to implement to ensure customer data can be used in a way that is not only compliant, but also useful for business.

Seven Techniques for Data Dimensionality Reduction (2015)

This whitepaper explores and compares seven different dimensionality reduction techniques: Missing Values, Low Variance Filter, High Correlation Filter, PCA, Random Forests, Backward Feature Elimination, and Forward Feature Construction.
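
To give a feel for two of the simpler techniques in this list, here is a minimal Python sketch of a low variance filter followed by a high correlation filter. It is not the whitepaper's implementation: the function names, the thresholds (0.01 and 0.9), and the toy columns are assumptions made for illustration, and the correlation measure used is plain Pearson correlation.

```python
from statistics import mean, pvariance


def low_variance_filter(columns, threshold):
    """Keep only the columns whose population variance exceeds `threshold`.

    Variance depends on scale, so columns are assumed to be on comparable ranges.
    """
    return {name: vals for name, vals in columns.items() if pvariance(vals) > threshold}


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def high_correlation_filter(columns, threshold):
    """Drop a column if it correlates too strongly with a column already kept."""
    kept = {}
    for name, vals in columns.items():
        if all(abs(pearson(vals, other)) <= threshold for other in kept.values()):
            kept[name] = vals
    return kept


# Hypothetical toy data set.
data = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],   # nearly a multiple of "a"
    "c": [5.0, 5.0, 5.0, 5.01],  # almost constant
    "d": [4.0, 1.0, 3.0, 2.0],
}
# Filtering low-variance columns first also avoids dividing by zero in pearson().
reduced = high_correlation_filter(low_variance_filter(data, 0.01), 0.9)
print(list(reduced))
```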

IT
Data and Machine Architecture for the Data Science Lab (2015)

Here we provide a few experience-based guidelines about the DWH infrastructure needed for a data science lab: segregation of production, development, testing, and fall-back environments, customized data sets, role management, user permissions, resource sharing, user authentication, rollover, versioning, dashboards, and many more features needed to build the infrastructure of a modern data science lab.

Model Process Factory
Model Process Factory (2017)

In this white paper, we present and fully document the KNIME Model Factory, designed to provide you with a flexible, extensible, and scalable application for running very large numbers of model processes in an efficient way. The KNIME Model Factory is composed of an overall workflow, tables that manage all activities, and a series of workflows and data for learning, all available via the public KNIME EXAMPLES server.