From Descriptive to Predictive Analytics with KNIME
Seagate has recognized that digital transformation cannot be ignored. Every day, at Seagate’s many global sites, people interact with data or need results from it. Moving up the analytics maturity curve - from descriptive to predictive analytics - was an important strategic objective. To achieve this, Seagate needed a tool that could easily be rolled out across the entire organization, was simple to learn, and could slot in with existing tools and infrastructure such as Excel, JMP, Minitab, Tableau, Matlab, Python, and R.
In 2017, KNIME was selected as the tool of choice for Seagate’s data needs. The company implemented a tailored Citizen Data Scientist (CDS) training program to train employees in using not only KNIME Analytics Platform, but also other complementary data science tools. The program consisted of a mix of online and onsite training sessions as well as workshops to teach users and advocate for the CDS program. Some of those attendees are now trainers and advocates themselves.
To help excite and encourage Seagate employees, the main messages shared at these trainings and workshops were the importance of 1) moving with the digital transformation to stay relevant in the industry and 2) upskilling employees to become citizen data scientists and to enable them to get the most out of their data, independently.
How Seagate is Benefiting from Using KNIME
Creating Demos in Three Days
With new products or product ideas, there is always an issue with the data. Some ideas are researched for many years, with different reasons for delay. For a long time, the US R&D engineers faced long waits to get the data they needed. KNIME Analytics Platform was used to demo the idea of pulling the data from Asia with a multi-threading scheme. It took approximately two weeks to fine-tune everything, but this is what really started to highlight the benefit of the CDS program and, more importantly, the value of using KNIME Analytics Platform. A problem that people had been struggling with for almost a year was solved in a demo that took just a few days to build. The key is that KNIME allows multi-threaded parallel queries to be built by simply dragging and dropping nodes, something that would otherwise need sophisticated coding skills.
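To give a sense of what the hand-coded equivalent of such a multi-threaded query scheme might involve, here is a minimal Python sketch. It is not Seagate's workflow: the site names and the `fetch_site_data` function are hypothetical stand-ins, and the remote query is simulated with a short sleep.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Hypothetical site names; the story only says the data was pulled from Asia.
SITES = ["site_a", "site_b", "site_c", "site_d"]


def fetch_site_data(site):
    """Stand-in for one slow remote query (assumption: replace with a real database call)."""
    time.sleep(0.1)  # simulate network and query latency
    return [{"site": site, "row": i} for i in range(3)]


def fetch_all(sites, max_workers=4):
    """Run one query per site in parallel threads and concatenate the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input sites
        per_site = list(pool.map(fetch_site_data, sites))
    return [row for rows in per_site for row in rows]


if __name__ == "__main__":
    start = time.perf_counter()
    rows = fetch_all(SITES)
    elapsed = time.perf_counter() - start
    print(f"Fetched {len(rows)} rows in {elapsed:.2f}s")
```

Because the simulated queries spend their time waiting, running them in threads lets the waits overlap instead of adding up. In KNIME the same effect is achieved by arranging nodes visually rather than writing this kind of code.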
Enabling Teams to be More Flexible and Creative
In another case, KNIME is enabling the Research and Product Development team to be more creative and productive. The team’s goal is to provide creative product solutions. There’s no such thing as routine because there’s always the need to change and adapt to the way data is analyzed. Furthermore, there’s always a new way of doing things, new data to read, new structures, new formats, and more.
KNIME Analytics Platform is the perfect solution for prototyping new ideas and presenting these ideas and results quickly. “I would describe prototyping in KNIME as similar to building with Lego,” says Debin Wang, Staff Engineer. “There are many different building blocks to choose from.” KNIME, just like Lego, encourages individuals to be creative and imaginative when building something. And the best part: when R&D teams have a new requirement or change in the data structure, the KNIME workflow can quickly and easily adapt to it. In R&D, not everything works out – especially not the first time. KNIME helps to significantly shorten the exploratory cycle. What previously took an engineer weeks of line-by-line coding can now be done in a few days. This is because KNIME, compared to other data science tools, is more flexible and intuitive.
Enabling Savings of Over $1,000,000
KNIME is used by the Recording Head Engineering Group for the dynamic modeling of downstream metrics, from wafers to sliders. Recording head manufacturing is very complex. The sequential layering process includes more than 1,500 steps, which form patterns of electrical conductors and magnetic material on a ceramic disk (wafer). A single wafer takes more than four months to complete. Even then, it can still require subsequent processing downstream in the supply chain before testing takes place, which could highlight processing issues. One of these 1,500 steps previously had an average duration of six months between processing and testing. That meant six months of potentially at-risk material before a fault was detected. Using KNIME, an advanced modeling workflow was created to accurately predict expected results. The user-friendly nature of KNIME also enabled Seagate to integrate the model into the existing wafer fab control system. This reduced the feedback loop from six months to four weeks, and enabled savings of over $1,000,000 for this single business area.
In another example, a similar methodology was used. Here the team created a KNIME workflow that saved $300,000 worth of scrap materials. This workflow, applied in a different process area, was able to predict which materials could still be used in future processes; these materials would otherwise have been scrapped under the existing univariate statistical process control (SPC) system.
In a third example, the team reduced the time spent ensuring that the two wafer fabrication facilities (Minnesota and Northern Ireland) were completely in sync. Existing systems required a significant number of monitoring hours per week. A KNIME workflow was built to deal with the high number of false positives the team was facing, eliminating them from the review process and saving valuable time each week.
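For readers unfamiliar with the term, a univariate SPC check looks at one measured parameter at a time and flags anything outside fixed control limits. The Python sketch below illustrates the common textbook rule of mean plus or minus three standard deviations. It is an illustration only, not Seagate's control system, and the measurement values are made up.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, upper) limits as mean +/- k sample standard deviations."""
    m = mean(baseline)
    s = stdev(baseline)
    return m - k * s, m + k * s


def out_of_control(values, limits):
    """Return the indices of values that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]


if __name__ == "__main__":
    # Made-up baseline measurements of a single parameter (illustrative only).
    baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]
    limits = control_limits(baseline)
    new_measurements = [10.1, 10.9, 9.5]
    flagged = out_of_control(new_measurements, limits)
    print(f"Limits: {limits[0]:.2f} to {limits[1]:.2f}, flagged indices: {flagged}")
```

A rule like this judges each parameter in isolation, so it can reject material that a predictive model, drawing on more information, would expect to perform acceptably downstream.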
Over 150 Users Working Independently with Data
The Citizen Data Scientist program has been extremely effective. Since its launch in 2017, it has grown to over 100 general users (learners, practitioners, and analysts) and almost 50 power users (KNIME/Seagate evangelists). Employees are working more independently with their data and getting better insights, faster. They are also able to generate significant business savings – in terms of both time and money – by developing workflows and solutions to overcome business challenges or pain points. Moreover, the feeling of empowerment has been a significant motivator for Seagate employees. “KNIME has empowered people who previously may have not considered the discovery and application of machine learning techniques to dip their toes into the world of data science,” says Brendan Doherty, Staff Data Scientist, Seagate Technology.
In 2019, Seagate purchased a KNIME Server and it’s predicted that the number of both general and power users will continue to increase. Looking forward, the CDS program will likely involve AWS so that learners have access to KNIME Server applications on AWS in order to do sandbox or development work. Seagate is already getting into the next level of maturity with predictive analytics and is starting to see tremendous business impact.
With the objective of moving up the analytics maturity curve, it was important to find a platform that was strong in both Guided Analytics and machine learning. KNIME was chosen not only because it checked these boxes, but because of its openness and simplicity in getting started. Attending one of the KNIME summits in Austin, Texas, where many other practitioners shared their KNIME stories, also made it clear that there was a strong community and passionate team behind the software.
It was very easy to get people excited about KNIME because it was (and continues to be) easy for them to learn the tool and see significant results quickly. Employees who have completed the CDS program are now some of the strongest advocates and evangelists – spreading the word and getting others on board.
Other reasons why KNIME has become so popular at Seagate include the ability to seamlessly integrate with other tools and technologies such as Python, R, Excel, Tableau, and more. Non-coders are able to use it independently, but those who do want to code can still do so. And lastly, it’s more than just data analytics – it’s data engineering, ETL, and more.