The Challenge: Create Automatic, Personalized Email Campaigns
Recently, Würth launched an e-commerce business alongside its traditional brick-and-mortar sales channel. To increase customer engagement and the number of online purchases, Würth needed personalized offers to put in front of potential customers. The challenge was to automatically create personalized content based on users’ behaviors, preferences, and habits (particularly in the online shop), and to send it out in a regular email newsletter. The objective of these campaigns was to build a stronger relationship with customers by promoting relevant information and offers.
The Solution: A Big Data Recommendation Engine
Effective campaigns depend on how well data are collected and measured. The first step was to analyze the information from the online shop, as well as other relevant sources such as the data warehouse and CRM, to create relevant content. Then, together with KNIME Partner Miriade, a Big Data Recommendation Engine was developed, based on a collaborative filtering algorithm. Given a matrix in which the rows correspond to users and the columns to products, the algorithm fills each empty cell with a prediction of what that customer might like, based on the purchases of similar customers.
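The matrix-completion idea can be sketched with a minimal user-based collaborative filtering example. This is an illustration of the general technique, not Würth's actual model: the users, products, and scores below are invented, and a zero stands for an empty cell to be predicted from similar users.

```python
from math import sqrt

# Rows = users, columns = products; 0 means "no purchase / unknown".
ratings = {
    "alice": {"drill": 5, "screws": 3, "gloves": 0},
    "bob":   {"drill": 4, "screws": 3, "gloves": 4},
    "carol": {"drill": 1, "screws": 0, "gloves": 5},
}
products = ["drill", "screws", "gloves"]

def cosine(u, v):
    """Cosine similarity between two users' rating vectors."""
    dot = sum(u[p] * v[p] for p in products)
    norm_u = sqrt(sum(u[p] ** 2 for p in products))
    norm_v = sqrt(sum(v[p] ** 2 for p in products))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(user, product):
    """Fill an empty cell with a similarity-weighted average of the
    ratings other users gave this product."""
    num = den = 0.0
    for other, row in ratings.items():
        if other == user or row[product] == 0:
            continue
        sim = cosine(ratings[user], row)
        num += sim * row[product]
        den += sim
    return num / den if den else 0.0

# Predict how much "alice" might like "gloves" from bob's and carol's purchases.
score = predict("alice", "gloves")
```

Production recommenders typically use matrix factorization (e.g., ALS in Spark MLlib) rather than this pairwise approach, but the completed user-item matrix they produce serves the same purpose.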
The algorithm is only as good as the available data: the more accurate the data, the more precise the predictions. To achieve this, data that had previously been stored in different databases were collected in a single Apache Hadoop system, cleaned, and made ready for algorithmic processing. The cleaned data were then categorized according to predetermined rules, which assign specific values based on business needs. Once the data were prepared, the algorithm selected, and the rules established, the analysis could begin. The main steps included:
Install the cluster where the data is processed. This is usually also when KNIME Server is installed.
Secure the infrastructure.
Ingest data from selected sources. This is the most delicate phase, as data is moved from the data warehouse into HDFS, the Hadoop distributed file system, using Spark and KNIME nodes. In Hadoop, the data are stored separately from the table definition, which is kept as metadata. Here, a flat table is used, in which the relevant data are stored in Parquet format for fast processing.
Conduct predictive analytics. The starting point is model creation, designed as a KNIME workflow, to compute the complete rating for every user-item association. Once the technical infrastructure is in place and the data flow is stabilized, the analysis is performed with the selected algorithm and ratings.
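The preparation steps above — joining data from separate sources into one flat table and categorizing records by predetermined rules — can be sketched as follows. In the real pipeline this is done with Spark and KNIME nodes and the flat table is written out as Parquet on HDFS; here, all table names, fields, and rule thresholds are invented for illustration.

```python
# Illustrative warehouse tables (names and contents are assumptions).
users = {101: {"name": "Alice GmbH", "region": "Nord"}}
products = {7: {"sku": "WUE-0007", "category": "fasteners"}}
orders = [{"user_id": 101, "product_id": 7, "qty": 50}]

def segment(qty):
    """A predetermined business rule: assign a value based on order size."""
    return "bulk" if qty >= 25 else "retail"

# Denormalize into one flat row per order, ready for columnar storage.
flat_table = [
    {
        "user_id": o["user_id"],
        "region": users[o["user_id"]]["region"],
        "sku": products[o["product_id"]]["sku"],
        "product_category": products[o["product_id"]]["category"],
        "qty": o["qty"],
        "segment": segment(o["qty"]),
    }
    for o in orders
]
```

Keeping the table flat avoids joins at analysis time, which is what makes columnar formats like Parquet fast to scan.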
All workflows - for both data ingestion and model generation - are scheduled using the KNIME Server Scheduler. All workflows are parameterized with variables, which ensures that all processes are automated and self-governing. KNIME WebPortal is used to create visual dashboards - including Business Intelligence KPIs and plots.
The entire project took 4-5 months to complete - from the very first meeting with the customer to discuss the objectives through to the full implementation. The result is automatically generated, personalized marketing campaigns that promote Würth products. With KNIME, the process is automated and scheduled, ensuring an accurate and efficient campaign.
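Turning the completed rating matrix into per-customer newsletter content amounts to selecting each user's top-rated products that they have not yet purchased. A minimal sketch of that selection step, with placeholder scores standing in for the model's output:

```python
# Completed user-item ratings (placeholder values, not real model output).
completed = {
    "alice": {"drill": 5.0, "screws": 3.0, "gloves": 4.2, "tape": 2.1},
}
already_bought = {"alice": {"drill", "screws"}}

def top_n(user, n=2):
    """Highest-rated products the user has not purchased yet."""
    candidates = {
        product: rating
        for product, rating in completed[user].items()
        if product not in already_bought[user]
    }
    return sorted(candidates, key=candidates.get, reverse=True)[:n]

recs = top_n("alice")
```

Each scheduled run would regenerate this list per customer and feed it into the email template, which is what keeps the campaigns personalized without manual work.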
Why KNIME Software
KNIME makes it easy to share work and workflows between users and stakeholders, which made it possible for Miriade to create the solution and later share it with Würth. This was one of the main reasons KNIME was chosen. The workbench and visual programming make it very easy to get started with data science, because the data flow is shown in a clear and intuitive way - even for first-time users. KNIME offers machine learning nodes for all phases of the lifecycle: from data ingestion and manipulation, through to model training, visualization, and deployment. Working with KNIME and Hadoop was a seamless experience, and everything was controllable from within the KNIME workbench. The combination of KNIME and Hadoop makes it very easy to manage the current volume of data, as well as the amounts expected in the future.