Tips and tricks using KNIME.

From Modeling to Scoring: Correcting Predicted Class Probabilities in Imbalanced Datasets

Mon, 10/07/2019 - 10:00 Maarit

Authors: Alfredo Roccato (Data Science Trainer and Consultant) and Maarit Widmann (KNIME)

Wheeling like a hamster in the data science cycle? Don’t know when to stop training your model?

Model evaluation is an important part of a data science project and it’s exactly this part that quantifies how good your model is, how much it has improved from the previous version, how much better it is than your colleague’s model, and how much room for improvement there still is.

In this series of blog posts, we review different scoring metrics: for classification, numeric prediction, unbalanced datasets, and other similar more or less challenging model evaluation problems.

Time Series Analysis: A Simple Example with KNIME and Spark

Mon, 09/23/2019 - 10:00 admin

The task: train and evaluate a simple time series model using a random forest of regression trees and the NYC Yellow taxi dataset

Authors: Andisa Dewi and Rosaria Silipo

I think we all agree that knowing what lies ahead in the future makes life much easier. This is true for life events as well as for prices of washing machines and refrigerators, or the demand for electrical energy in an entire city. Knowing how many bottles of olive oil customers will want tomorrow or next week allows for better restocking plans in the retail store. Knowing the likely increase in the price of gas or diesel allows a trucking company to better plan its finances. There are countless examples where this kind of knowledge can be of help.

Transfer Learning Made Easy with Deep Learning Keras Integration

Mon, 09/16/2019 - 10:00 Corey

Author: Corey Weisinger

You’ve always been able to fine-tune and modify your networks in KNIME Analytics Platform by using the Deep Learning Python nodes, such as the DL Python Network Editor or DL Python Learner. With recent updates to KNIME Analytics Platform and the KNIME Deep Learning Keras Integration, however, there are more tools available to do this without leaving the familiar KNIME GUI.

Accessing the HELM Monomer Library with KNIME

Mon, 09/09/2019 - 10:00 longoka

Author: Kenneth Longo

The cheminformatics world is replete with software tools and file formats for the design, manipulation and management of small molecules and libraries thereof. Those tools and formats are often specialized in analyzing small molecules of ~500 daltons, give or take a few, or those molecules that can reasonably be drawn and understood using classic ball-and-stick or molecular coordinate frameworks. Perhaps not coincidentally, this neatly envelops the needs of small molecule drug discovery, where it is not uncommon to find both public and privately held repositories of hundreds of thousands (to millions) of such molecules, for use in molecular or phenotypic screening assays. The small size and elemental simplicity of these molecules have resulted in a variety of storage file formats (e.g., mol, SMILES, sdf, etc.) and many supporting software packages (e.g., RDKit, CDK, ChemAxon, etc.) for visualization and manipulation. KNIME Analytics Platform provides easy access to those file formats and software packages.

KNIME Analytics Platform 4.0: Components are for Sharing

Thu, 06/27/2019 - 10:00 michael.berthold

With this release we are continuing our progress toward a community-oriented data science platform, adding lots of functionality that enables easier sharing with the KNIME Community. The most noticeable addition is, of course, the KNIME Hub itself, but there are also a number of changes in KNIME Analytics Platform that make sharing and collaborating with the community easier...

From Modeling to Scoring: Finding an Optimal Classification Threshold based on Cost and Profit

Mon, 06/17/2019 - 10:00 Maarit

Authors: Maarit Widmann (KNIME) and Alfredo Roccato (Data Science Trainer and Consultant)

Wheeling like a hamster in the data science cycle? Don’t know when to stop training your model?

Model evaluation is an important part of a data science project and it’s exactly this part that quantifies how good your model is, how much it has improved from the previous version, how much better it is than your colleague’s model, and how much room for improvement there still is.

In this series of blog posts, we review different scoring metrics: for classification, numeric prediction, unbalanced datasets, and other similar more or less challenging model evaluation problems.

From Modeling to Scoring: Confusion Matrix and Class Statistics

Mon, 05/27/2019 - 10:00 Maarit

Author: Maarit Widmann

Wheeling like a hamster in the data science cycle? Don’t know when to stop training your model?

Model evaluation is an important part of a data science project and it’s exactly this part that quantifies how good your model is, how much it has improved from the previous version, how much better it is than your colleague’s model, and how much room for improvement there still is.

In this series of blog posts, we review different scoring metrics: for classification, numeric prediction, unbalanced datasets, and other similar more or less challenging model evaluation problems.

Data Chef ETL Battles - Today, WebLog Data for Clickstream Analysis

Mon, 04/15/2019 - 10:00 heather.fyson

Authors: Maarit Widmann, Anna Martin, Rosaria Silipo

Do you remember the Iron Chef battles?

It was a televised series of cook-offs in which famous chefs rolled up their sleeves to compete in making the perfect dish. Based on a set theme, this involved using all their experience, creativity, and imagination to transform sometimes questionable ingredients into the ultimate meal.

JS show it! Today: Interactive Choropleth World Map using Google GeoChart visualization

Mon, 05/14/2018 - 10:26 admin

JavaScript Nuggets on Demand

KNIME Analytics Platform is extremely flexible. It offers not only a number of pre-packaged functionalities for prototyping or routine work, but also a number of integrations for the free coding days. One of these integrations imports the power of JavaScript code into the platform.

This blog post series aims at providing nuggets of JavaScript code to implement more creative drawings and plots than those already available with the pre-packaged nodes. Each nugget of JavaScript code proposed here implements only one functionality and is explained step by step so that everyone, even JavaScript beginners, can understand it.

Today: Interactive Choropleth World Map using Google GeoChart visualization

Authors: Rosaria Silipo & Paolo Tamagnini

Figure 1. A choropleth map is a geographical map where areas are colored, shaded, or patterned according to a corresponding calculated measure, in this case the logarithm of each country's 2013 population on a world map.

The Plot

Today we want to draw the choropleth map as shown above. So what do we need?

  • A map of the countries of the world and the corresponding numbers of population.
  • A short piece of JavaScript code to load the Google Charts library and draw the choropleth map based on the population numbers of each country.
  • A Generic JavaScript View node to execute such code within a KNIME workflow.

Our dataset is the CSV file population2013.csv and it contains a list of 214 world countries with their corresponding population numbers as of 2013.

We also have a Generic JavaScript View node. The smallest workflow would simply include a File Reader node to read the CSV file and a Generic JavaScript View node with the right JavaScript code nugget to draw the choropleth map. So let’s now have a look at this nugget of JavaScript code.
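
As a rough illustration of the ingredients listed above, here is a minimal sketch of what such a nugget might look like. It is not the authors' original code. The helper names buildGeoChartRows and drawChoropleth, the element id "map", the color values, and the sample figures are all assumptions made for this example; only the Google Charts calls (google.charts.load, google.charts.setOnLoadCallback, google.visualization.arrayToDataTable, google.visualization.GeoChart) come from the Google Charts library. The sketch also assumes that the Google Charts loader script has already been loaded in the view, and it leaves out how the rows are read from the KNIME input table.

```javascript
// Hypothetical helper: turn [country, population] pairs into the
// array-of-arrays layout accepted by google.visualization.arrayToDataTable.
// The population is log10-transformed, as in the figure caption.
function buildGeoChartRows(rows) {
  const table = [["Country", "Population (log10)"]];
  for (const [country, population] of rows) {
    const value = Number(population);
    // Skip rows with a missing country or a non-positive population,
    // because the logarithm is not defined for them.
    if (!country || !Number.isFinite(value) || value <= 0) continue;
    table.push([country, Math.log10(value)]);
  }
  return table;
}

// Browser-only part (hypothetical wrapper): assumes the Google Charts loader
// script is already available and that an element with the given id exists.
function drawChoropleth(rows, elementId) {
  google.charts.load("current", { packages: ["geochart"] });
  google.charts.setOnLoadCallback(function () {
    const data = google.visualization.arrayToDataTable(buildGeoChartRows(rows));
    const chart = new google.visualization.GeoChart(
      document.getElementById(elementId)
    );
    // Light-to-dark color gradient; the colors are an arbitrary choice.
    chart.draw(data, { colorAxis: { colors: ["#e0f3db", "#084081"] } });
  });
}

// Made-up sample figures for illustration, not taken from population2013.csv.
const sample = [["Italy", 60000000], ["Germany", 80000000], ["Nowhere", 0]];
console.log(buildGeoChartRows(sample));
```

Keeping the data preparation in its own function means it can be checked outside the browser, while drawChoropleth(sample, "map") would only be called inside a view where the Google Charts loader is present.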

Distributed executors in the next major version of KNIME Server

Mon, 08/07/2017 - 11:25 thor

If you are a KNIME Server customer you probably noticed that the changelog file for the KNIME Server 4.5 release was rather short compared to those of previous releases. This by no means implies that we were lazy! Together with introducing new features and improving existing ones, we also started working on the next generation of KNIME Servers. You can see a preview of what is to come in the so-called distributed executors. In this article I will explain what a distributed executor is and how it can be useful to you.
