14 Aug 2017 | rs

Do you remember the Iron Chef battles?

It was a televised series of cook-offs in which famous chefs rolled up their sleeves to compete in making the perfect dish. Based on a set theme, this involved using all their experience, creativity, and imagination to transform sometimes questionable ingredients into the ultimate meal.

Hey, isn’t that just like data transformation? Or data blending, or data manipulation, or ETL, or whatever new name is trending now? In this new blog series requested by popular vote, we will ask two data chefs to use all their knowledge and creativity to compete in extracting a given data set's most useful “flavors” via reductions, aggregations, measures, KPIs, and coordinate transformations. Delicious!

Want to find out how to prepare the ingredients for a delicious data dish by aggregating financial transactions, filtering out uninformative features or extracting the essence of the customer journey? Follow us here and send us your own ideas for the “Data Chef Battles” at datachef@knime.com.

Ingredient Theme: Energy Consumption Time Series. Behavioral Measures over Time and Seasonality Index from Auto-Correlation.

Author: Rosaria Silipo
Data Chefs: Haruto and Momoka

Ingredient Theme: Energy Consumption Time Series

Let’s talk today about electricity and its consumption. One of the hardest problems in the energy industry is matching supply and demand. On the one hand, over-production of energy can be a waste of resources; on the other hand, under-production can leave people without the basic commodities of modern life. The prediction of the electrical energy demand at each point in time is therefore a very important chapter in data analytics.

For this reason, a couple of years ago energy companies started to monitor the electricity consumption of each household, store, or other entity by means of smart meters. A pilot project was launched in 2009 by the Irish Commission for Energy Regulation (CER).

The Smart Metering Electricity Customer Behaviour Trials (CBTs) took place during 2009 and 2010, with over 5,000 Irish homes and businesses participating. The purpose of the trials was to assess the impact of smart metering on consumers’ electricity consumption, in order to inform the cost-benefit analysis for a national rollout. Electric Ireland residential and business customers and Bord Gáis Energy business customers who participated in the trials had an electricity smart meter installed in their homes or on their premises and agreed to take part in research to help establish how smart metering can help shape energy usage behaviors across a variety of demographics, lifestyles, and home sizes. The trials produced positive results. The reports are available from CER, along with further information on the Smart Metering Project. To get a copy of the data set, fill out this request form and email it to ISSDA.

The data set is just a very long time series: one column contains the smart meter ID, one the time, and one the amount of electricity used in the previous 30 minutes. The time is expressed as the number of minutes since 01.01.2009 00:00 and has to be transformed back into one of the classic date/time formats, such as dd.MM.yyyy HH:mm. The original sampling rate, at which the consumed energy is measured, is one sample every 30 minutes.

The first data transformations, common to all data chefs, involve the date/time conversion and the extraction of year, month, day of month, day of week, hour, and minute from the raw date.
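
As a minimal sketch of that preprocessing step in Python (the original post uses KNIME nodes; the column names and the tiny sample data below are assumptions made for illustration):

```python
import pandas as pd

# Assumed column names: meter ID, minute offset from the reference date, and usage.
df = pd.DataFrame({
    "meter_id": [1000, 1000, 1000],
    "minutes_since_ref": [0, 30, 60],          # minutes since 01.01.2009 00:00
    "kwh_last_30min": [0.12, 0.09, 0.15],
})

# Convert the minute offset back to a classic date/time format.
ref = pd.Timestamp("2009-01-01 00:00")
df["timestamp"] = ref + pd.to_timedelta(df["minutes_since_ref"], unit="m")

# Extract the date/time fields used for the later aggregations.
df["year"] = df["timestamp"].dt.year
df["month"] = df["timestamp"].dt.month
df["day_of_month"] = df["timestamp"].dt.day
df["day_of_week"] = df["timestamp"].dt.dayofweek   # 0 = Monday
df["hour"] = df["timestamp"].dt.hour
df["minute"] = df["timestamp"].dt.minute

print(df[["timestamp", "year", "month", "day_of_week", "hour", "minute"]])
```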

Topic. Energy Consumption Time Series

Challenge. From time series to behavioral measures and seasonality

Methods. Aggregations at multiple levels, Correlation

Data Manipulation Nodes. GroupBy, Pivoting, Linear Correlation, Lag Column
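
As a rough illustration of the "Lag Column + Linear Correlation" idea behind the seasonality index, the sketch below estimates the dominant seasonality of a consumption series via autocorrelation in Python, using a synthetic hourly series; in the original workflow this is done with the KNIME nodes listed above.

```python
import numpy as np
import pandas as pd

# Synthetic hourly consumption with a daily (24-hour) pattern, for illustration only.
rng = np.random.default_rng(42)
hours = np.arange(24 * 60)
series = pd.Series(
    1.0 + 0.5 * np.sin(2 * np.pi * hours / 24) + 0.1 * rng.standard_normal(hours.size)
)

# "Lag Column + Linear Correlation": correlate the series with lagged copies of itself.
max_lag = 48
autocorr = {lag: series.autocorr(lag=lag) for lag in range(1, max_lag + 1)}

# The lag with the highest autocorrelation is taken as the seasonality index.
best_lag = max(autocorr, key=autocorr.get)
print(f"Estimated seasonality: {best_lag} samples (expected 24 for a daily cycle)")
```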

Read more


07 Aug 2017 | thor

If you are a KNIME Server customer, you probably noticed that the changelog for the KNIME Server 4.5 release was rather short compared to previous releases. That by no means implies that we were lazy! Besides introducing new features and improving existing ones, we also started working on the next generation of KNIME Server. You can see a preview of what is to come in the so-called distributed executors. In this article I will explain what a distributed executor is and how it can be useful to you.

Read more


31 Jul 2017 | greglandrum

2019-03-21: We added more comprehensive instructions that will be continually updated. Check out the new documentation here.

Read more


24 Jul 2017 | amartin

In this blog series we’ll be experimenting with the most interesting blends of data and tools. Whether it’s mixing traditional sources with modern data lakes, open-source devops on the cloud with protected internal legacy tools, SQL with noSQL, web-wisdom-of-the-crowd with in-house handwritten notes, or IoT sensor data with idle chatting, we’re curious to find out: will they blend? Want to find out what happens when IBM Watson meets Google News, Hadoop Hive meets Excel, R meets Python, or MS Word meets MongoDB?

Follow us here and send us your ideas for the next data blending challenge you’d like to see at willtheyblend@knime.com.

Today: A Recipe for Delicious Data: Mashing Google and Excel Sheets

A newer version of this blog post and workflow is available at https://www.knime.com/blog/GoogleSheet-meets-Excel-part2 using the new Google Sheets nodes available with KNIME Analytics Platform 3.5. These nodes make accessing the Google Sheets (private or public) a much easier task!

The Challenge

Don’t be confused! This is not one of the data chef battles, but a “Will they blend?” experiment - which, just by chance, happens to be on a restaurant theme again.

A local restaurant has been running its business relatively successfully for a few years now. It is a small business: an Excel sheet was enough for the full accounting in 2016. To simplify collaboration, the restaurant owner decided to start using Google Sheets at the beginning of 2017. Every month she now faces the same task: calculating the monthly and year-to-date (YTD) revenues for 2017 (in Google Sheets) and comparing them with the corresponding prior-year values for 2016 (in Microsoft Excel).

The technical challenge at the center of this experiment is definitely not a trivial matter: mashing the data from the Excel and Google spreadsheets into something delicious… and digestible. Will they blend?

Topic. Monthly and YTD revenue figures for a small local business.

Challenge. Blend together Microsoft Excel and Google Sheets.

Access Mode. Excel Reader and REST Google API for private and public documents.
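
The original experiment uses KNIME's Excel Reader and REST nodes; as a hedged illustration of the same blend, the Python sketch below reads a local Excel file and pulls a public Google Sheet through its CSV export endpoint. The file name, sheet ID, and column names are made up for the example.

```python
import pandas as pd

# 2016 figures from a local Excel workbook (hypothetical file and column names).
excel_2016 = pd.read_excel("restaurant_2016.xlsx", sheet_name="transactions")

# 2017 figures from a public Google Sheet, fetched via its CSV export endpoint.
SHEET_ID = "YOUR_SHEET_ID"  # placeholder for the real spreadsheet ID
url = f"https://docs.google.com/spreadsheets/d/{SHEET_ID}/export?format=csv"
gsheet_2017 = pd.read_csv(url)

# Blend both years and compute monthly and year-to-date (YTD) revenues.
both = pd.concat([excel_2016, gsheet_2017], ignore_index=True)
both["date"] = pd.to_datetime(both["date"])
monthly = both.groupby([both["date"].dt.year, both["date"].dt.month])["amount"].sum()
monthly.index.names = ["year", "month"]
ytd = monthly.groupby(level="year").cumsum()
print(monthly, ytd, sep="\n\n")
```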

Read more


17 Jul 2017 | rs

Continental, a leading automotive supplier, recently won the Digital Leader Award 2017 in the category “Empower People” for bringing big data and analytics closer to its employees with KNIME. Arne Beckhaus is the man behind this project. We are lucky enough today to welcome him for an interview in our KNIME blog.

Read more


03 Jul 2017 | Dario Cannone

In this blog series we’ll be experimenting with the most interesting blends of data and tools. Whether it’s mixing traditional sources with modern data lakes, open-source devops on the cloud with protected internal legacy tools, SQL with noSQL, web-wisdom-of-the-crowd with in-house handwritten notes, or IoT sensor data with idle chatting, we’re curious to find out: will they blend? Want to find out what happens when IBM Watson meets Google News, Hadoop Hive meets Excel, R meets Python, or MS Word meets MongoDB?

Read more


19 Jun 2017 | knime_admin

In a social networking era where a massive amount of unstructured data is generated every day, unsupervised topic modeling has become a very important task in the field of text mining. Topic modeling allows you to quickly summarize a set of documents to see which topics appear often; at that point, human input can be helpful to make sense of the topic content. As in any other unsupervised-learning approach, determining the optimal number of topics in a dataset is also a frequent problem in the topic modeling field.
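
As a small, hedged sketch of that model-selection step (not the workflow from the post, which uses KNIME's topic-modeling nodes), one common heuristic is to fit LDA for several topic counts and compare a quality score such as perplexity:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus; in practice this would be the document collection to summarize.
docs = [
    "energy consumption smart meter electricity demand",
    "electricity demand forecast smart grid energy",
    "restaurant revenue excel google sheets accounting",
    "monthly revenue accounting spreadsheet restaurant",
]

X = CountVectorizer().fit_transform(docs)

# Fit LDA for a range of topic counts and report each model's perplexity;
# lower perplexity (ideally on held-out data) suggests a better topic count.
for k in range(2, 5):
    lda = LatentDirichletAllocation(n_components=k, random_state=0).fit(X)
    print(k, round(lda.perplexity(X), 2))
```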

Read more


07 Jun 2017 | phil

Everyone who has heard of KNIME Analytics Platform knows that KNIME has nodes. Thousands of them! The resources under the Learning Hub, as well as the hundreds of public examples within KNIME Analytics Platform, are all designed to get you up to speed with KNIME and its nodes. But those who know best how to use KNIME nodes are KNIME users themselves. What if we could capture all their insight and experience in understanding which nodes to use, when, and in what order, and give you a recommendation? Well, that is exactly what the KNIME Workflow Coach does.
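
A toy sketch of the underlying idea (purely illustrative; the actual Workflow Coach relies on anonymized community usage statistics inside KNIME): count which nodes most often follow the node you just placed, and recommend the most frequent successors.

```python
from collections import Counter

# Hypothetical workflows, each a sequence of node names.
workflows = [
    ["CSV Reader", "Column Filter", "GroupBy", "Bar Chart"],
    ["CSV Reader", "Row Filter", "GroupBy", "Line Plot"],
    ["Excel Reader", "Column Filter", "GroupBy", "Pivoting"],
]

# Count successors for every node across all workflows.
successors: dict[str, Counter] = {}
for wf in workflows:
    for current, nxt in zip(wf, wf[1:]):
        successors.setdefault(current, Counter())[nxt] += 1

def recommend(node: str, top_n: int = 3) -> list[str]:
    """Recommend the nodes that most often followed `node`."""
    return [name for name, _ in successors.get(node, Counter()).most_common(top_n)]

print(recommend("Column Filter"))   # e.g. ['GroupBy']
```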

Read more


22 May 2017 | rs

Do you remember the Iron Chef battles?

It was a televised series of cook-offs in which famous chefs rolled up their sleeves to compete in making the perfect dish. Based on a set theme, this involved using all their experience, creativity and imagination to transform sometimes questionable ingredients into the ultimate meal.

Hey, isn’t that just like data transformation? Or data blending, or data manipulation, or ETL, or whatever new name is trending now? In this new blog series requested by popular vote, we will ask two data chefs to use all their knowledge and creativity to compete in extracting a given data set's most useful “flavors” via reductions, aggregations, measures, KPIs, and coordinate transformations. Delicious!

Want to find out how to prepare the ingredients for a delicious data dish by aggregating financial transactions, filtering out uninformative features or extracting the essence of the customer journey? Follow us here and send us your own ideas for the “Data Chef Battles” at datachef@knime.com.

Ingredient Theme: Customer Transactions. Money vs. Loyalty.

Read more


08 May 2017 | knime_admin

Authors: Iris Adä & Phil Winters

The benefits of using predictive analytics are now a given, and the data scientists who deliver it are highly regarded. Yet our daily work is full of contrasts. On the one hand, you can work with data, tools, and techniques to really dive in and understand the data and what it can do for you. On the other hand, there is usually quite a bit of administrative work around accessing data, massaging data, and then putting that new insight into production and keeping it there.

Read more

