Useful white papers from KNIME.
This whitepaper addresses two problems:
- Create the analytics core of a customer segmentation application
- Create a web user interface to inject business experts' knowledge into the final results
The workflow is available on the EXAMPLES Server under 50_Applications/24_Customer_Segmentation_UseCase
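The heavy lifting in the whitepaper is done by KNIME nodes; purely as an illustration of what such a segmentation core computes, here is a minimal Python sketch that clusters customers with k-means on hypothetical RFM-style features (the file name and column names are assumptions, not the whitepaper's schema):

```python
# Minimal sketch of a customer segmentation core: k-means on
# hypothetical RFM-style features. File and column names are
# assumptions, not the whitepaper's actual schema.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

customers = pd.read_csv("customers.csv")          # assumed input file
features = customers[["recency", "frequency", "monetary"]]

# Standardize so no single feature dominates the distance metric.
X = StandardScaler().fit_transform(features)

# Assign each customer to one of k segments.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
customers["segment"] = kmeans.fit_predict(X)

# Per-segment averages: the raw material for business interpretation.
print(customers.groupby("segment")[["recency", "frequency", "monetary"]].mean())
```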
This whitepaper applies some of the dimensionality reduction techniques described in another KNIME whitepaper, “Seven Techniques for Data Dimensionality Reduction”. It does not only apply them: it also adds an appealing, intuitive, step-guided web interface through the KNIME WebPortal. This makes the workflow a useful instrument not only in the hands of domain experts, but also of data analysis beginners and business users.
The workflow is available on the EXAMPLES Server under 50_Applications/25_DataCleaning_WebPortal
The list of KNIME use cases has a new addition: how to build a bot, more specifically, a teacher bot.
This bot is charged with the task of helping newcomers find answers to their initial questions about KNIME from a large number of technical tutorials.
Building a bot involves a series of steps. First of all, a web-based user interface to ask the question; next, a set of NLP text processing functions to parse it; and finally, a machine learning model to associate the right group of tutorials with the question that was asked.
For this particular project, we also decided to adopt an active learning strategy, training the model on an initially unlabeled dataset.
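The workflow implements this in KNIME; as a rough illustration of the uncertainty-sampling idea behind active learning, here is a minimal Python sketch on a made-up corpus (the texts, labels, and placeholder labeling step are all assumptions):

```python
# Minimal sketch of active learning via uncertainty sampling: start
# from a small labeled seed, then repeatedly ask a human to label the
# question the model is least sure about. Corpus is hypothetical.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "how do I read an excel file",      # labeled: io
    "loop over all columns",            # labeled: flow
    "connect to a database",            # unlabeled
    "write results back to csv",        # unlabeled
    "repeat a branch for every group",  # unlabeled
]
labels = np.array(["io", "flow", None, None, None], dtype=object)

X = TfidfVectorizer().fit_transform(texts)
labeled = np.array([l is not None for l in labels])

for _ in range(2):  # two labeling rounds, for illustration
    clf = LogisticRegression().fit(X[labeled], labels[labeled].astype(str))
    probs = clf.predict_proba(X[~labeled])
    # Uncertainty sampling: pick the least confident unlabeled question.
    idx_unlabeled = np.where(~labeled)[0]
    most_uncertain = idx_unlabeled[np.argmin(probs.max(axis=1))]
    print("please label:", texts[most_uncertain])
    labels[most_uncertain] = "io"   # placeholder: a human supplies the group
    labeled[most_uncertain] = True
```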
Anomaly Detection
Here we show how we prepared and visualized FFT-transformed sensor data from rotor equipment: frequency binning, time alignment, and visualization (a minimal sketch of the binning step follows the links below).
- Download pdf
- Workflow "Time Alignment and Visualization" is available on the KNIME EXAMPLES server under 050_Applications/050017_AnomalyDetection/Pre-processing
- 050_Applications/050017_AnomalyDetection/data also contains a reduced version of the original data set
- Download full data set
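As a rough companion to the pre-processing described above, here is a minimal Python sketch of the FFT plus frequency binning step on a synthetic signal (the sampling rate, signal, and bin count are assumptions):

```python
# Minimal sketch of the pre-processing: FFT of a vibration signal,
# then frequency binning by averaging the spectrum over fixed-width
# bands. Signal and sampling rate are made up for illustration.
import numpy as np

fs = 1024                        # sampling rate in Hz (assumed)
t = np.arange(0, 1, 1 / fs)
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.random.randn(t.size)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)

# Frequency binning: collapse the spectrum into 32 equal-width bands,
# keeping the mean amplitude per band as one feature column.
n_bins = 32
bin_ids = np.minimum((freqs / freqs[-1] * n_bins).astype(int), n_bins - 1)
binned = np.array([spectrum[bin_ids == b].mean() for b in range(n_bins)])
print(binned.shape)  # (32,) -> one row of the time-aligned feature table
```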
This second whitepaper of the anomaly detection series approaches the prediction of “unknown”, and possibly catastrophic, events from a time series perspective. Control chart and auto-regressive models are used to trigger alarms when the underlying system starts drifting away from its known working conditions. A minimal sketch of the alarm logic follows the links below.
- Download pdf
- Workflow and data can be found on the KNIME EXAMPLES server under 50_Applications/17_AnomalyDetection/Time Series Analysis
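As a rough illustration of the alarm logic, not the whitepaper's actual implementation, here is a minimal Python sketch: an auto-regressive model fitted on a "normal" window, with a control-chart-style three-sigma band on its residuals (the series, AR order, and threshold are assumptions):

```python
# Minimal sketch: fit AR(p) by least squares on a normal training
# window, then raise an alarm when residuals leave the mean +/- 3
# sigma band. The series here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
series = rng.normal(0, 1, 500)
series[400:] += np.linspace(0, 4, 100)   # inject a slow drift

p = 5                                    # AR order (assumed)
train = series[:300]

# Design matrix of lagged values: row t holds the p values before t.
X = np.column_stack([train[i:len(train) - p + i] for i in range(p)])
y = train[p:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Control limits from training residuals.
resid = y - X @ coef
mu, sigma = resid.mean(), resid.std()

for t in range(300, len(series)):
    pred = series[t - p:t] @ coef
    if abs(series[t] - pred - mu) > 3 * sigma:
        print(f"alarm at t={t}")         # system leaving normal conditions
        break
```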
Energy
In this whitepaper we show step by step how to integrate a big data platform into a KNIME workflow, using dedicated and/or generic connector nodes together with SQL helper nodes. The example workflow can be found on the EXAMPLES server under 004_Database/004005_Energy_Prepare_Data (Big Data).
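Within KNIME, the connector node opens the connection and the SQL helper nodes compose the query; as a rough Python equivalent of what gets executed, here is a minimal PySpark sketch (table and column names are assumptions):

```python
# Minimal PySpark sketch of the same idea: push aggregation to the
# big data platform via SQL, pull only the small result back.
# Table and column names are assumptions for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("energy-prep").getOrCreate()

# In KNIME, a connector node establishes this session and the SQL
# helper nodes compose the query below node by node.
hourly = spark.sql("""
    SELECT meter_id,
           date_trunc('hour', reading_time) AS hour,
           SUM(kwh) AS kwh
    FROM smart_meter_readings
    GROUP BY meter_id, date_trunc('hour', reading_time)
""")

result = hourly.toPandas()   # only the aggregated result leaves the cluster
print(result.head())
```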
This whitepaper focuses on smart energy data from the Irish Smart Energy Trials. The first goal is to identify a few customer groups with common electricity usage behavior in order to create customized contract offers. The second goal is a reliable prediction of the overall energy consumption using time series prediction techniques.
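The whitepaper's models are built as KNIME workflows; as a toy-scale Python illustration of the two goals, here is a sketch that builds per-meter daily profiles as clustering input and a seasonal-naive forecast as a prediction baseline (file and column names are assumptions):

```python
# Minimal sketch of the two goals: (1) one average 24-hour usage
# profile per meter, ready for clustering; (2) a seasonal-naive
# "same hour last week" forecast as a baseline. Schema is assumed.
import numpy as np
import pandas as pd

df = pd.read_csv("smart_meter_hourly.csv", parse_dates=["hour"])  # assumed file

# Goal 1: average daily profile per meter (24 columns) for clustering.
df["hod"] = df["hour"].dt.hour
profiles = df.pivot_table(index="meter_id", columns="hod",
                          values="kwh", aggfunc="mean")
print(profiles.shape)   # one 24-hour profile per meter

# Goal 2: seasonal-naive forecast of total load, one week back.
total = df.groupby("hour")["kwh"].sum().sort_index()
forecast = total.shift(24 * 7)
mape = (np.abs(total - forecast) / total).dropna().mean()
print(f"seasonal-naive MAPE: {mape:.2%}")
```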
This whitepaper covers all the steps needed to extract knowledge from a web forum: crawling the forum and downloading the data, calculating some simple statistics, detecting the discussed topics, and identifying the experts for each topic.
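As a rough illustration of the topic and expert steps, here is a minimal Python sketch using LDA for topic detection and summed topic weights per author as a crude expert score (the posts and authors are placeholders):

```python
# Minimal sketch: LDA on forum posts, then rank authors per topic by
# their accumulated topic weight. Posts and authors are placeholders.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = pd.DataFrame({
    "author": ["ana", "ben", "ana", "cho"],
    "text": [
        "how to join two tables on a key column",
        "random forest parameters for churn prediction",
        "sql join is slow on large tables",
        "tuning tree depth in random forest",
    ],
})

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(posts["text"])

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)          # one topic distribution per post

# "Expert" per topic: the author with the highest summed topic weight.
weights = pd.DataFrame(doc_topics, index=posts["author"]).groupby(level=0).sum()
print(weights.idxmax())                    # top author for each topic
```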
This whitepaper shows an example of how advanced analytics combined with real-time execution can provide an end-to-end solution, from model development to operational deployment and real-time execution, within any business process.
This whitepaper extracts IP addresses from a web log file and transforms them into points on a world map, producing a report with images and a movie of the daily IP addresses. Example workflows are available on the EXAMPLES Server under 008_WebAnalytics_and_OpenStreetMap.
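As a rough Python illustration of the first two steps, extraction and geolocation, here is a minimal sketch; the GeoLite2 database file is an assumption, and the whitepaper itself does the mapping with KNIME and OpenStreetMap nodes:

```python
# Minimal sketch: pull IPv4 addresses out of a web log, then look up
# coordinates. Log file and GeoLite2 database path are assumptions.
import re
import geoip2.database   # pip install geoip2; DB file from MaxMind
import geoip2.errors

ip_pattern = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

with open("access.log") as f:             # assumed log file
    ips = {ip for line in f for ip in ip_pattern.findall(line)}

reader = geoip2.database.Reader("GeoLite2-City.mmdb")  # assumed DB file
points = []
for ip in ips:
    try:
        r = reader.city(ip)
        points.append((ip, r.location.latitude, r.location.longitude))
    except (geoip2.errors.AddressNotFoundError, ValueError):
        pass                              # private, unknown, or malformed

print(points[:5])   # (ip, lat, lon) triples, ready to plot on a map
```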
This whitepaper explains how data scientists can take a proactive approach to GDPR with KNIME Analytics Platform. It highlights best practices for ensuring that customer data is used in a way that is not only compliant, but also useful for the business.
Exploring and comparing seven different dimensionality reduction techniques: Missing Values Ratio, Low Variance Filter, High Correlation Filter, PCA, Random Forests, Backward Feature Elimination, and Forward Feature Construction.
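As a quick illustration, here is a minimal Python sketch of two of the seven techniques, the low variance filter and the high correlation filter, on a generic numeric table (the file name and thresholds are assumptions, not the whitepaper's tuned values):

```python
# Minimal sketch of two of the seven techniques on a numeric frame:
# low variance filter, then high correlation filter. Thresholds are
# illustrative, not the whitepaper's tuned values.
import pandas as pd

df = pd.read_csv("features.csv")          # assumed numeric feature table

# Low variance filter: drop columns whose normalized variance is tiny.
norm = (df - df.min()) / (df.max() - df.min())
df = df.loc[:, norm.var() > 0.01]

# High correlation filter: of each highly correlated pair, drop one.
corr = df.corr().abs()
cols = corr.columns
to_drop = set()
for i in range(len(cols)):
    for j in range(i + 1, len(cols)):
        if corr.iloc[i, j] > 0.9 and cols[j] not in to_drop:
            to_drop.add(cols[j])
print("dropping:", sorted(to_drop))
df = df.drop(columns=list(to_drop))
```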
Here we provide a few experience-based guidelines about the DWH infrastructure needed for a data science lab: production, development, testing, and fall-back environments; environment segregation; customized data sets; role management; user permissions; resource sharing; user authentication; rollover; versioning; dashboards; and many more features needed to build the infrastructure of a modern data science lab.
In this whitepaper, we present and fully document the KNIME Model Factory, designed to provide you with a flexible, extensible, and scalable application for running very large numbers of model processes in an efficient way. The KNIME Model Factory is composed of an overall workflow, tables that manage all activities, and a series of workflows and data for learning, all available via the KNIME public EXAMPLES server.
With KNIME Analytics Platform you are spoiled for choice in creating great data science. You can of course build everything using KNIME nodes, but other choices exist. Want a workflow that uses available in-DB capabilities, then moves to a production Apache Spark setup, while at the same time using a special Google service, before comparing a KNIME Random Forest to an H2O Random Forest, automatically choosing the better model, which then gets applied to your favorite CRM so that the new score is placed back into the CRM? No problem in KNIME. This article provides background on the available choices and tuning options, as well as an approach and sample workflows for determining the “right” combination for your specific requirements.
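As a rough illustration of the "compare two Random Forests and keep the better one" step, here is a minimal Python sketch using scikit-learn and H2O on a stand-in dataset; the article itself orchestrates this inside a KNIME workflow:

```python
# Minimal sketch: train a scikit-learn and an H2O Random Forest on
# the same split, compare test AUC, keep the better model. The
# dataset is a toy stand-in for the article's workflow.
import h2o
from h2o.estimators import H2ORandomForestEstimator
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

data = load_breast_cancer(as_frame=True)
train, test = train_test_split(data.frame, test_size=0.3, random_state=0)

# scikit-learn Random Forest.
skl = RandomForestClassifier(n_estimators=100, random_state=0)
skl.fit(train.drop(columns="target"), train["target"])
skl_auc = roc_auc_score(test["target"],
                        skl.predict_proba(test.drop(columns="target"))[:, 1])

# H2O Random Forest on the same split.
h2o.init()
htrain, htest = h2o.H2OFrame(train), h2o.H2OFrame(test)
htrain["target"] = htrain["target"].asfactor()
htest["target"] = htest["target"].asfactor()
h2o_rf = H2ORandomForestEstimator(ntrees=100, seed=0)
h2o_rf.train(y="target", training_frame=htrain)
h2o_auc = h2o_rf.model_performance(htest).auc()

winner = "sklearn" if skl_auc >= h2o_auc else "h2o"
print(f"sklearn AUC={skl_auc:.3f}  h2o AUC={h2o_auc:.3f}  -> keep {winner}")
```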
KNIME firmly believes in open source and the power of the community. Our philosophy is to maintain and develop an open source platform containing all the functionality that any individual might require, and to continue delivering extended functionality through our own work and that of the KNIME community. KNIME complements the open source KNIME Analytics Platform with licensed commercial software for increasing productivity and enabling collaboration. To support our development goals, KNIME has implemented rigorous software engineering standards; a key part of these standards focuses on ensuring software security in all aspects. Our security policy includes a structured approach to exchanging and communicating security-related topics with our extensive user community. The extensive capabilities and choices available with KNIME Software mean that each organization will want to implement KNIME Software capabilities based on its business as well as its security requirements. To assist, KNIME provides software capabilities and best practices from our customers so that an organization can implement and test its own security compliance regime for its usage of KNIME Software.