A fast, highly automated, and accurate decision making process
Webbankir clients submit an online loan application and, if approved, receive their money directly on their card without visiting an office or banking institution. The decision making process must therefore be fast, highly automated, and accurate. It draws on data from external data providers as well as the client's own history.
To create an automated online loan application tool, Webbankir needed an application for building and validating machine learning models. The tool had to be easy to work with, use visual development methods (i.e. no coding), include a wide range of built-in models and algorithms, and connect easily to various data sources: database management systems such as Amazon Redshift and PostgreSQL, as well as text and Excel files. It also needed to integrate with Python and R.
One visual environment for developing and validating models
KNIME was chosen because it not only ticked all the required boxes, but was also simple to get up and running. Webbankir was able to develop and validate models successfully because KNIME had all the required model types in the core nodes or in the extensions. This helped solve one of the problems in the previous system, which was transferring ready-made models from development into production. In that system, the models were encoded in a language that didn’t support machine learning tools, so implementation took a long time and often contained errors that were only detectable after lengthy testing. As a result, implementing a model in the actual decision making process took between one and two months.
The goal of this project was two-fold. Firstly, build a complete solution management system that covers the entire path from model development to implementation and allows any model class to be used. Secondly, reduce the time needed for model implementation to a maximum of seven days, while guaranteeing an accurate transfer of the model from the development environment to the production environment and allowing for automated testing.
The project included integrating 1C (the existing decision making system) with KNIME Server and creating a new decision making service based on a combination of KNIME Analytics Platform and KNIME Server. This new service would replace the decision flow for clients completing their first loan application with Webbankir. It would also be part of the solution for repeat clients. Additionally, the project included building a test system for automatically testing the functionality of the service.
A service to process 200,000 requests per month in less than ten seconds per request
The decision making service needed to have an API running on a cloud platform with a processing capacity of at least 200,000 requests per month, a peak load of up to 5,000 requests per hour, and an average execution time of no more than 10 seconds. The project mattered because the business has to respond quickly and accurately to changes in its environment. For example, at one point a new scoring model was added to fine tune the decision making process and decrease the default rate. Previously, this would have taken up to two months. With the new process it took only seven days, including model development, transfer into production, and testing. In another example, due to Covid-19, the set of decision making rules was changed five times in one month, and each change took less than two hours to complete.
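To put these figures in perspective, the short calculation below estimates how many requests are in flight at once during a peak hour, using Little's law (concurrency = arrival rate × time in system). It is only a back-of-the-envelope sketch: the 30-day month and the assumption that peak requests arrive evenly across the hour are ours, not part of the project specification.

```python
import math

# Figures stated in the requirements above.
REQUESTS_PER_MONTH = 200_000
PEAK_REQUESTS_PER_HOUR = 5_000
AVG_EXECUTION_SECONDS = 10  # upper bound on the average execution time

# Assumption (not from the source): a 30-day month.
HOURS_PER_MONTH = 30 * 24

# Average hourly load if requests were spread evenly over the month.
avg_requests_per_hour = REQUESTS_PER_MONTH / HOURS_PER_MONTH

# How much heavier the peak hour is than the average hour.
peak_to_average_ratio = PEAK_REQUESTS_PER_HOUR / avg_requests_per_hour

# Little's law: requests in flight = arrival rate * time spent in the system.
peak_arrivals_per_second = PEAK_REQUESTS_PER_HOUR / 3600
concurrent_at_peak = peak_arrivals_per_second * AVG_EXECUTION_SECONDS

# Whole number of parallel execution slots needed to keep up at peak.
min_parallel_slots = math.ceil(concurrent_at_peak)

print(f"average load: {avg_requests_per_hour:.0f} requests/hour")
print(f"peak is {peak_to_average_ratio:.0f}x the average hour")
print(f"parallel slots needed at peak: {min_parallel_slots}")
```

Under these assumptions the peak hour is roughly 18 times the monthly average, which illustrates why scalable capacity matters more for this kind of service than the monthly total alone suggests.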
The working solution
The project started with defining the type of future service (API) and the sequence of how the functionality should be implemented. The integration points with the existing system were also defined, as well as the input and output vector datasets. In terms of construction, the following key steps were completed:
- Integrate with the existing/legacy decision making system
- Create an MVP – in this case an API service with default values of output vector fields
- Implement models and calculate indicators used in decision making, and run the calculation in test mode, in which the output vector was recorded but did not take part in the decision
- Implement the full decision logic in test mode
- Start using the service's responses in the actual decision making process
- Create a system for automated testing of the service
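The staged rollout in the steps above can be pictured as one service that moves through three modes. The sketch below is purely illustrative and is not Webbankir's implementation: the mode names, the `bureau_score` field, the threshold, and all function names are hypothetical stand-ins for the real models and rules.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    MVP = "mvp"    # service answers with default output values only
    TEST = "test"  # output is computed and recorded, but not used
    LIVE = "live"  # output drives the actual decision


# Hypothetical default output vector returned by the MVP.
DEFAULT_OUTPUT = {"score": None, "decision": None}
APPROVAL_THRESHOLD = 0.5  # hypothetical rule, for illustration only


def compute_output(application: dict) -> dict:
    """Stand-in for the real models and decision rules."""
    score = application["bureau_score"]
    decision = "approve" if score >= APPROVAL_THRESHOLD else "decline"
    return {"score": score, "decision": decision}


@dataclass
class DecisionService:
    mode: Mode
    recorded_outputs: list = field(default_factory=list)

    def respond(self, application: dict) -> dict:
        if self.mode is Mode.MVP:
            return dict(DEFAULT_OUTPUT)
        output = compute_output(application)
        if self.mode is Mode.TEST:
            # Keep the output for later comparison with the legacy system.
            self.recorded_outputs.append({"id": application["id"], **output})
        return output


def final_decision(service: DecisionService, application: dict,
                   legacy_decision: str) -> str:
    """The legacy decision stays in force until the service is live."""
    output = service.respond(application)
    if service.mode is Mode.LIVE:
        return output["decision"]
    return legacy_decision
```

Running the service in test mode first lets the recorded outputs be compared against the legacy decisions before any client is affected, which is the purpose of the intermediate steps in the list.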
The decision making service is run on Amazon Web Services (AWS), which was identified as the easiest way to try out the technology. AWS provided an Amazon Machine Image for KNIME Analytics Platform and Server instance with scalable performance, which was an important consideration due to a) the required processing capacity and b) the need to respond quickly to changes in the external environment. Furthermore, given the sensitivities of the data, AWS enables KNIME to be containerized and keeps data secure. This pre-packaged and pre-defined instance is a cost-effective solution.
In terms of specific KNIME functionality, the main system was integrated with the API service through POST requests. The KNIME Server Connector, KNIME Server Executor, and KNIME Python Scripting Extensions were used, as well as the Palladian, Logistic Regression, Gradient Boosting, PMML, and Workflow Control nodes, among others.
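From the calling system's side, such an integration amounts to sending the input vector as the body of a POST request and reading the output vector from the response. The following is a minimal sketch using only the Python standard library. The URL, the payload fields, and the function names are placeholders invented for illustration; the actual KNIME Server endpoint, authentication, and request format are not described in this article and are not shown here.

```python
import json
import urllib.request

# Placeholder address, not a real endpoint.
SERVICE_URL = "https://knime-server.example.com/decision-service"


def build_decision_request(application: dict,
                           url: str = SERVICE_URL) -> urllib.request.Request:
    """Wrap the input vector in a JSON POST request (nothing is sent yet)."""
    body = json.dumps(application).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def request_decision(application: dict, timeout_seconds: float = 10.0) -> dict:
    """Send the request and decode the JSON output vector from the response."""
    request = build_decision_request(application)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return json.loads(response.read().decode("utf-8"))
```

A caller would pass something like `request_decision({"application_id": "A-1", "amount": 10000})`, where both field names are again hypothetical. The 10-second timeout simply mirrors the execution time target mentioned earlier.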
Increasing the share of fully automated decisions from 70% to 85%
Previously, the time taken to implement new models was one to two months. Using KNIME Analytics Platform, this now takes one to seven days. The new service has also completely freed up one full-time software developer, who can now work on other important tasks and initiatives. It has reduced decision making time by 30% and increased the share of fully automated decisions from 70% to 85%. Currently, the developed service takes part in decision making on more than 95% of the application flow.
The initial plan was to build this service using Python. However, there were difficulties in transferring the developed models. Furthermore, data scientists were unable to work independently because software developers were always required, which consumed additional business resources. One of the strongest arguments for choosing KNIME was its openness: the ability to read in and work with many data types, integrate with other machine learning tools, and automate the decisions being made.
In terms of software capabilities, KNIME allows almost any type of machine learning model to be used. It can also perform almost any kind of data transformation needed for a given project. The visual drag and drop way of building a workflow not only speeds up the work on models and processes, but also enables non-coders to work freely and independently. The built-in Python code injection capabilities allow users, where necessary, to implement conversions that would be complex or non-obvious to build with standard KNIME nodes. The scalability of KNIME Server makes it possible to reach the desired load and processing speed while drawing only on the resources that are required. KNIME Server also allows the team to focus on machine learning related aspects, because they do not need to be occupied by performance or reliability concerns.
In terms of business considerations, KNIME enables employees from the risk team to implement decision making logic completely independently. They can develop and validate the models, as well as transfer these models to the service, and change the decision making rules. The models are transferred using PMML files, which ensures correct and fast migration. This approach is efficient and enables the team to quickly and easily put the developed functionality into production. When needed, the KNIME team provides support and guidance.
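To illustrate why a PMML file makes for an accurate transfer, the sketch below reads a small logistic regression from PMML text and scores one applicant with it. The model, its field names, and its coefficients are invented for illustration, and the parser handles only this simple case (numeric predictors with logit normalization). In the project itself this work is done by KNIME's PMML nodes, not by hand-written code like this.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical model; names and coefficients are made up for this example.
PMML_DOC = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="Illustrative scoring model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="prior_loans" optype="continuous" dataType="double"/>
    <DataField name="default_probability" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression" normalizationMethod="logit">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="prior_loans"/>
      <MiningField name="default_probability" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="-1.0">
      <NumericPredictor name="age" coefficient="0.02"/>
      <NumericPredictor name="prior_loans" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"p": "http://www.dmg.org/PMML-4_4"}


def load_logistic_model(pmml_text: str) -> tuple[float, dict[str, float]]:
    """Extract the intercept and numeric coefficients from the PMML text."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find(".//p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("p:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(features: dict[str, float], intercept: float,
          coefficients: dict[str, float]) -> float:
    """Apply the linear predictor, then the logit normalization."""
    z = intercept + sum(c * features[name] for name, c in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


intercept, coefficients = load_logistic_model(PMML_DOC)
probability = score({"age": 50.0, "prior_loans": 2.0}, intercept, coefficients)
print(f"predicted probability: {probability:.4f}")
```

Because the coefficients travel in the file rather than being retyped into another language, the development and production environments evaluate the same numbers, which removes the class of transcription errors described for the previous system.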