Continental, a leading automotive supplier, recently won the Digital Leader Award 2017 in the category “Empower People” for bringing big data and analytics closer to its employees with KNIME. Arne Beckhaus is the man behind this project. Today we are delighted to welcome him for an interview on the KNIME blog.
Rosaria: Arne, congratulations on winning the Digital Leader Award. We are very pleased to hear that your project builds on KNIME. Can you tell us more about it?
Arne: Thanks for the invitation. The essence of our project is to bring data analytics skills to non-IT employees in our business units. So we are talking about colleagues from purchasing, logistics, and even HR who have neither a programming nor a data science background, but have interesting problems to solve. For them, we implemented an internal training program about data analytics and big data that is completely based on KNIME products. Participating users often bring their data problems to the training and, if the problem is too complex, we support them by drafting an initial workflow. In this way, our users have the chance to solve their analytics problem and be trained in KNIME at the same time, thus optimizing learning speed and skill development.
Rosaria: What range of problems are you tackling with this approach?
Arne: Typically, our business managers and employees talk about big data as soon as data volumes exceed the capabilities of spreadsheet logic. My experience is that only a fifth of the real-world problems actually require cluster computing in the league of Spark and Hadoop, which KNIME addresses via the KNIME Big Data Extension. For the remaining 80% of our real-world problems, we can make use of the great selection of standard KNIME nodes.
Rosaria: How do you make sure your users build valid models without a proper background in data science?
Arne: There is a large difference between building a real-time scoring predictor model on gazillions of data points and the everyday data processing challenges of our thousands of business users. Most of our time, including data scientists’ time, is spent on what the IT people call ETL operations: data cleaning, data blending, aggregations, data reorganization, and so on. An example: a colleague from logistics used to work with a manual Excel-based filtering and visualization process, which took 10 minutes per analyzed part number. This work could not be prepared proactively in advance for the thousands of part numbers in the systems; it could only be executed on demand. In parallel with this colleague’s training, we developed together a KNIME workflow that automatically prioritizes an entire business unit’s critical spots in the supply chain on a weekly basis. This expertise-based solution generates valuable insights without any model training. So even though only typical ETL nodes are used in KNIME, I prefer to call these domain-knowledge-based workflows ‘deterministic analytics’.
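To make the idea of ‘deterministic analytics’ concrete, here is a minimal sketch in plain Python of what such a prioritization step might look like: parts are ranked by a user-defined criticality KPI, with no model training involved. All field names, weights, and part numbers below are invented for illustration and are not taken from Continental’s actual workflow.

```python
# Hypothetical sketch of deterministic prioritization: rank part numbers
# by a user-defined criticality KPI. No statistics or model training,
# just domain logic. All names and weights here are made up.
from dataclasses import dataclass

@dataclass
class Part:
    part_number: str
    stock_days: float      # days of inventory coverage remaining
    open_orders: int       # unfulfilled customer orders
    lead_time_days: float  # supplier replenishment lead time

def criticality(p: Part) -> float:
    """Deterministic KPI: higher means more critical. Weights are arbitrary."""
    # A part is critical when replenishment takes longer than current coverage.
    coverage_gap = max(p.lead_time_days - p.stock_days, 0.0)
    return coverage_gap * 2.0 + p.open_orders * 0.5

def prioritize(parts: list[Part], top_n: int = 3) -> list[str]:
    """Return the top_n most critical part numbers, most critical first."""
    ranked = sorted(parts, key=criticality, reverse=True)
    return [p.part_number for p in ranked[:top_n]]

parts = [
    Part("A-100", stock_days=20, open_orders=2, lead_time_days=10),
    Part("B-200", stock_days=3, open_orders=15, lead_time_days=14),
    Part("C-300", stock_days=5, open_orders=4, lead_time_days=21),
]
print(prioritize(parts, top_n=2))  # → ['C-300', 'B-200']
```

In a KNIME workflow, the equivalent would be a chain of standard ETL nodes (e.g. a math/formula node to compute the KPI, followed by sorting and filtering), scheduled to run weekly, rather than hand-written code.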
Rosaria: By the way, with a Custom Node Repository on KNIME Server you could even restrict the analytics nodes available to your users.
Arne: Thanks for the tip. That looks like another great feature of KNIME’s customizability and open nature. This was actually one of the main reasons to select KNIME Analytics Platform for our project. The innovation speed, extensibility, openness, and smart complementary commercial offerings convinced us from day one.
Rosaria: Can you tell us more about your training approach? What is special about it?
Arne: First, we only use domain-related examples. So lots of automotive examples instead of B2C e-commerce, pharmaceutical, and so on. That makes it easier for our users to relate the material to their real-life challenges. Since we run this training internally with our own resources, we can also offer easy-to-consume (and easy-to-schedule) training cycles. A typical training wave consists of one 3-hour module per week over 5 consecutive weeks.
Rosaria: What are your key learnings from the project?
Arne: Reviewing our progress so far, I think there are three key success factors.
- Most important has been the definition of our target group: business users without programming skills. We enable them to utilize their domain knowledge for self-service analytics.
- Then, the combination of training and pilot projects has given us immediate results.
- Last but not least, we created awareness for the 80/20 rule in data analytics: Most of the effort lies in data integration and this can be done by the data owners themselves. I would even go beyond that and say there is a second 80/20 rule: from a business user perspective, 80% of our problems don’t require advanced analytics but can be solved by deterministic data workflows, e.g. by prioritizing thousands of cases according to a user-defined criticality KPI.
Rosaria: Thanks a lot, Arne, for these insights and for choosing KNIME as your analytics platform. Congratulations again on winning the Digital Leader Award! Any final words from your side?
Arne: Thanks for the opportunity to share our findings. Please keep up the great work at KNIME! As a final remark, I would like to inspire others to follow our approach: you will be rewarded with enthusiastic employees who unleash their creativity in working with data, while at the same time better business decisions are made at all levels of the organization!