18 Nov 2019 | Redfield

Anonymization is a hot topic of discussion. We are generating and collecting huge amounts of data, more than ever before. A lot of this data is personal and needs to be handled sensitively. In recent times, we’ve also seen the introduction of the GDPR stipulating that only anonymized data may be used extensively and without privacy restrictions.

Read more

14 Nov 2019 | admin

Author: Rosaria Silipo (KNIME)

As first published in Dataversity

Sometimes when you talk to data scientists, you get this vibe as if you’re talking to priests of an ancient religion. Obscure formulas, complex algorithms, a slang for the initiated, and on top of that, some new required script. If you get these vibes for all projects, you are probably talking to the wrong data scientists.

Read more

11 Nov 2019 | armingrudd

Author: Armin Ghassemi Rudd (Data Scientist & Consultant)

Are you trying to build an attractive CV? Maybe you’ve been searching the web for online CV builders? With these tools, you have to fill out a form and enter information such as your name, contact details, skills, experience, and so on. A few online CV builders ease the job by asking for permission to access your LinkedIn profile and read your information from there. They are great tools for sure, but they have downsides as well.

Read more

06 Nov 2019 | berthold

As first published in Harvard Data Science Review.


Given recent claims that data science can be fully automated or made accessible to non-data scientists through easy-to-use tools, I describe different types of data science roles within an organization. I then provide a view on the required skill sets of successful data scientists and how they can be obtained, concluding that data science requires both a profound understanding of the underlying methods and exhaustive experience gained from real-world data science projects. Despite some easy wins in specific areas using automation or easy-to-use tools, successful data science projects still require education and training.

Read more

04 Nov 2019 | admin

Authors: Rosaria Silipo and Mykhailo Lisovyi

Today’s style: Caravaggio or Picasso?

While surfing the internet a few months ago, we came across this study [1], which promises to train a neural network to alter any image according to your preferred painter’s style. These kinds of studies unleash your imagination (or at least ours).

Read more

31 Oct 2019 | berthold

Two decades into the AI revolution, deep learning is becoming a standard part of the analytics toolkit. Here’s what it means.

By Michael Berthold, KNIME

Pick up a magazine, scroll through the tech blogs, or simply chat with your peers at an industry conference. You’ll quickly notice that almost everything coming out of the technology world seems to have some element of artificial intelligence or machine learning to it. The way artificial intelligence is discussed, it’s starting to sound almost like propaganda. Here is the one true technology that can solve all of your needs! AI is here to save us all!

Read more

28 Oct 2019 | admin

Authors: Ana Vedoveli and Iris Adä (KNIME)

At the beginning of this year, we sent out a “Help us to Help you with KNIME” survey to the KNIME community. The idea behind the questionnaire was to listen to what the KNIME community wanted and incorporate some of those suggestions into the next releases. There were a few questions about how people are using KNIME Analytics Platform, and also questions designed to help us understand what kinds of new nodes and features people dream about. We additionally promised that we would select one dedicated node - the node most mentioned - and make sure that it would be part of our next major release.

In this post we present this "community node" and we've also put together five tips & tricks garnered from other answers given in the survey.

Read more

24 Oct 2019 | admin

Authors: Kathrin Melcher, Rosaria Silipo

    Key takeaways
    • Fraud detection techniques mostly stem from the anomaly detection branch of data science
    • If the dataset has a sufficient number of fraud examples, supervised machine learning algorithms for classification, such as random forest or logistic regression, can be used for fraud detection
    • If the dataset has no fraud examples, we can use either outlier detection with the isolation forest technique or anomaly detection with a neural autoencoder
    • After the machine learning model has been trained, it's evaluated on the test set using metrics such as sensitivity and specificity, or Cohen’s Kappa
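
The evaluation metrics named in the last takeaway can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' workflow: the function name `evaluate_binary` and the toy labels are hypothetical, and it assumes binary labels where `1` marks a fraudulent transaction.

```python
def evaluate_binary(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, Cohen's kappa) for binary labels.

    Hypothetical helper for illustration; `positive` is the label
    treated as the fraud class.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # fraud correctly flagged
        elif truth == positive:
            fn += 1  # fraud missed
        elif pred == positive:
            fp += 1  # legitimate transaction flagged as fraud
        else:
            tn += 1  # legitimate transaction correctly passed

    n = tp + fp + tn + fn
    if n == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one example of each true class")

    sensitivity = tp / (tp + fn)  # share of fraud cases that were caught
    specificity = tn / (tn + fp)  # share of legitimate cases left alone

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Toy test set: 4 fraudulent and 6 legitimate transactions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sensitivity, specificity, kappa = evaluate_binary(y_true, y_pred)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} kappa={kappa:.3f}")
```

Fraud datasets are usually heavily imbalanced, which is why sensitivity, specificity, and Cohen's Kappa tend to be more informative here than plain accuracy: a model that never flags fraud can still score a high accuracy while its sensitivity is zero.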

Read more

21 Oct 2019 | julian.bunzel

Author: Julian Bunzel

Predicting the Purpose of a Drug

Keeping track of the latest developments in research is becoming increasingly difficult with all the information published on the Internet. This is why Information Extraction (IE) tasks are gaining popularity in many different domains. Reading literature and retrieving information is extremely exhausting, so why not automate it? At least a bit. Using text processing approaches to retrieve information about drugs has been an important task over the last few years and is getting more and more important [1].

Read more

17 Oct 2019 | admin

Authors: Paolo Tamagnini and Rosaria Silipo

The ugly truth behind all that data

We are in the age of data. In recent years, many companies have already started collecting large amounts of data about their business. On the other hand, many companies are just starting now. If you are working in one of these companies, you might be wondering what can be done with all that data.

What about using the data to train a supervised machine learning (ML) algorithm? The ML algorithm could perform the same classification task a human would, just so much faster! It could reduce cost and inefficiencies. It could work on your blended data, like images, text documents, and just simple numbers. It could do all those things and even get you that edge over the competition.

Read more
