Extend the Reach of Data Science. Equip your Team for Flexible Deployment

April 19, 2021 — by Jim Falgout

To enable your data science team for flexible deployment, you need a tool that can handle multiple technologies, specialty hardware, and fluctuating demands while maintaining continuous, high-quality delivery.

In this article, we address use cases that teams and organizations have asked us about.

Multiple data sources: run on premises or in the cloud

In the field of sales analytics, the challenge is to bring all the different data sources together and govern them reliably, with easy-to-use reporting across the organization. When a company is dealing with data as a service, security requirements mean that some data sources must be processed within the company's own domain.

Hybrid deployment with KNIME Server is a mix of enterprise data center and cloud deployments. Specific KNIME Executors can be run on premises, for example, to comply with the security regulations that apply to those data sources, while other data are handled by Executors in the cloud.

If required, you can assign multiple Executors to multiple purposes under a pay-as-you-go (PAYG) license, meaning no big upfront investment. If one business unit needs to integrate a new data source, for example to try out a new algorithm and perform some heavy data crunching, it can switch to a different Executor without affecting the existing environment. Scaling Executors to your compute needs decreases time to market.

Efficient distribution: expand or partition resources

Enterprises need to ensure efficient distribution of resources and provide different execution environments to different departments within a company. The agility to partition resources augments capacity. This can be handled flexibly with KNIME Executor Groups.

Your IT unit can create specific Executor Groups to serve specific users and groups, partitioning compute resources and segregating execution resources logically, e.g. by department.

Figure: Partitioning resources with Executor Groups to augment capacity

In the graphic, you can see how workflows are always routed to the correct Executor Group. This guarantees that execution resources are available and meets the company's requirement to serve large numbers of diverse users.
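
The routing idea can be sketched in a few lines of Python. This is only an illustration of the concept, not KNIME's API or configuration format: the group names, the `department` attribute, and the `route_workflow` function are all hypothetical, and in a real deployment the mapping would be configured by your IT unit rather than hard-coded.

```python
from dataclasses import dataclass

# Hypothetical mapping from department to Executor Group name.
# These names are made up for illustration; they are not KNIME settings.
EXECUTOR_GROUPS = {
    "finance": "finance-executors",
    "marketing": "marketing-executors",
}
DEFAULT_GROUP = "shared-executors"


@dataclass(frozen=True)
class Workflow:
    name: str
    department: str


def route_workflow(workflow: Workflow) -> str:
    """Return the name of the Executor Group that should run this workflow."""
    # Departments without a dedicated group fall back to the shared group.
    return EXECUTOR_GROUPS.get(workflow.department.lower(), DEFAULT_GROUP)


if __name__ == "__main__":
    for wf in (Workflow("quarterly-forecast", "Finance"),
               Workflow("ad-hoc-report", "Research")):
        print(f"{wf.name} -> {route_workflow(wf)}")
```

Keeping a shared fallback group in the sketch means that a workflow from a department without dedicated resources still has somewhere to run.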

Watch the recording of the talk "Deep Dive - Cloud and Hybrid Deployment" by Siemens Healthineers.

Computation requirements: with or without specialty hardware

If you have large numbers of models that need to be retrained every month on new data, for example, this requires many hours of processing on specialty hardware such as GPUs. State-of-the-art hardware is expensive and develops at a startling rate. KNIME Executors can be run in Amazon EC2 Auto Scaling groups on AWS, elastically scaling to support your computation needs.

So if you need access to a GPU to train those models faster, you can define your resources when you launch the Executors and choose which Executor type a specific workflow should be executed on.

Varying workloads: scale in or out elastically

Transactional fraud mitigation is an area where data science techniques are being used to produce good results. Successful solutions are capable of processing millions of digital interactions between financial institutions and their customers. And workloads can soar. During holiday seasons, for example, the pressure is on to process these huge amounts of data and make results available quickly.

When the business workload is also hard to predict reliably, elastic scaling adjusts resources dynamically to match demand. Cloud platforms, for example, provide auto scaling, and KNIME Executors support running in these environments. They are available on the AWS Marketplace and on the Microsoft Azure Marketplace as PAYG (pay-as-you-go) options. New Executors are started as required and shut down again when load decreases, saving money in both instance and license costs.
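
As a rough sketch of the kind of decision an auto scaling policy makes, the function below sizes an Executor pool from the number of queued jobs. It is a simplified illustration, not the logic used by AWS, Azure, or KNIME: the function name, the `jobs_per_executor` parameter, and the default limits are assumptions chosen for the example.

```python
import math


def desired_executors(queued_jobs: int, jobs_per_executor: int,
                      min_executors: int = 1, max_executors: int = 10) -> int:
    """Return how many Executors to run for the current queue length.

    All parameters are hypothetical tuning knobs for this sketch.
    """
    if jobs_per_executor <= 0:
        raise ValueError("jobs_per_executor must be positive")
    # Round up so that a partially filled Executor is still provisioned.
    needed = math.ceil(queued_jobs / jobs_per_executor)
    # Never drop below the baseline or exceed the configured ceiling.
    return max(min_executors, min(needed, max_executors))


if __name__ == "__main__":
    for queue in (0, 23, 500):
        print(queue, "queued jobs ->", desired_executors(queue, 5), "Executors")
```

The lower bound stands in for the Executors that run year round, and the upper bound caps spending during extreme spikes.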

The marketplace offerings also mean you can run hybrid on-prem/cloud setups, that is, a mix of enterprise data center and cloud deployments. Install KNIME Server on-prem and start up your KNIME Executors in the cloud (on demand, PAYG) to cover periods of high demand dynamically.

Flexible pricing: pay as much or as little as you use

A typical use of data science in the e-commerce business is to predict next best offers. Subsequent special discounts offered to customers can triple traffic to the website. You need a solution that can quickly scale to handle all the incoming requests. Oh, and flexible pricing would be good too.

This kind of situation is an ideal case for using PAYG KNIME Executors. A company purchases a baseline KNIME Server, which runs year-round. Whenever there are large spikes in execution demand, the company can automatically spin up new KNIME Executors to handle the increase. And as soon as things go back to normal, the additional instances are turned off again, meaning the company only pays for what it uses.
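
The arithmetic behind "pay for what you use" can be sketched as follows. The hourly rate, the number of extra Executors, and the spike duration are made-up placeholder values, not KNIME or cloud marketplace prices, and the comparison assumes the same hourly rate in both scenarios.

```python
def payg_cost(hourly_rate: float, extra_executors: int, spike_hours: float) -> float:
    """Cost of extra Executors that run only during the demand spike."""
    return hourly_rate * extra_executors * spike_hours


def always_on_cost(hourly_rate: float, extra_executors: int, period_hours: float) -> float:
    """Cost of keeping the same extra Executors running for the whole period."""
    return hourly_rate * extra_executors * period_hours


if __name__ == "__main__":
    rate = 2.0           # placeholder price per Executor hour
    extra = 4            # placeholder number of additional Executors
    spike_hours = 72     # a three-day promotion
    month_hours = 720    # a 30-day month

    print("PAYG:", payg_cost(rate, extra, spike_hours))
    print("Always on:", always_on_cost(rate, extra, month_hours))
```

With these placeholder values, the extra capacity costs a tenth as much under PAYG because it runs for a tenth of the month.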

Businesses enabled to leverage enterprise infrastructure choices

Flexible deployment options help enterprises handle security challenges that go hand in hand with multiple data sources, enable efficient responses to different compute requirements and varying workloads, and also keep costs predictable.

Hear more about how other enterprises are using KNIME software for flexible deployment and how they meet typical business challenges in presentations from KNIME Fall Summit 2020.
