31 Oct 2019 berthold

Two decades into the AI revolution, deep learning is becoming a standard part of the analytics toolkit. Here’s what it means

By Michael Berthold, KNIME

Pick up a magazine, scroll through the tech blogs, or simply chat with your peers at an industry conference. You’ll quickly notice that almost everything coming out of the technology world seems to have some element of artificial intelligence or machine learning to it. The way artificial intelligence is discussed, it’s starting to sound almost like propaganda. Here is the one true technology that can solve all of your needs! AI is here to save us all!

Read more

28 Oct 2019 admin

Authors: Ana Vedoveli and Iris Adä (KNIME)

At the beginning of this year, we sent out a “Help us to Help you with KNIME” survey to the KNIME community. The idea behind the questionnaire was to listen to what the KNIME community wanted and incorporate some of those suggestions into the next releases. There were a few questions about how people are using KNIME Analytics Platform, and also questions designed to help us understand what kinds of new nodes and features people dream about. We additionally promised that we would select one dedicated node - the node most mentioned - and make sure that it would be part of our next major release.

In this post we present this "community node" and we've also put together five tips & tricks garnered from other answers given in the survey.

Read more

24 Oct 2019 admin

Authors: Kathrin Melcher, Rosaria Silipo

    Key takeaways
    • Fraud detection techniques mostly stem from the anomaly detection branch of data science
    • If the dataset has a sufficient number of fraud examples, supervised machine learning algorithms for classification, such as random forest or logistic regression, can be used for fraud detection
    • If the dataset has no fraud examples, we can use either outlier detection with the isolation forest technique or anomaly detection with a neural autoencoder
    • After the machine learning model has been trained, it's evaluated on the test set using metrics such as sensitivity and specificity, or Cohen’s Kappa
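The evaluation metrics named in the last takeaway can be computed directly from the confusion matrix. The following is a minimal sketch in plain Python, not the authors' workflow: the function names and the toy labels are hypothetical, fraud is assumed to be encoded as class 1, and both classes are assumed to appear in the test set.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth == positive and pred != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Share of actual fraud cases that the model flags (recall)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of legitimate cases that the model leaves alone."""
    return tn / (tn + fp)


def cohens_kappa(tp, fp, tn, fn):
    """Agreement between prediction and truth, corrected for chance."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Chance agreement: product of the marginal frequencies per class.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical test set: 3 fraud cases among 10 transactions.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    print(f"sensitivity:   {sensitivity(tp, fn):.3f}")
    print(f"specificity:   {specificity(tn, fp):.3f}")
    print(f"Cohen's kappa: {cohens_kappa(tp, fp, tn, fn):.3f}")
```

On strongly unbalanced data such as fraud datasets, plain accuracy can look high even when most fraud cases are missed, which is one reason to report sensitivity, specificity, or Cohen's Kappa instead.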

    Read more

    21 Oct 2019 julian.bunzel

    Author: Julian Bunzel

    Predicting the Purpose of a Drug

    Keeping track of the latest developments in research is becoming increasingly difficult with all the information published on the Internet. This is why Information Extraction (IE) tasks are gaining popularity in many different domains. Reading literature and retrieving information is extremely exhausting, so why not automate it? At least a bit. Using text processing approaches to retrieve information about drugs has been an important task over the last few years and is getting more and more important [1].

    Read more

    17 Oct 2019 admin

    Authors: Paolo Tamagnini and Rosaria Silipo

    The ugly truth behind all that data

    We are in the age of data. In recent years, many companies have already started collecting large amounts of data about their business. On the other hand, many companies are just starting now. If you are working in one of these companies, you might be wondering what can be done with all that data.

    What about using the data to train a supervised machine learning (ML) algorithm? The ML algorithm could perform the same classification task a human would, just so much faster! It could reduce cost and inefficiencies. It could work on your blended data, like images, text documents, and just simple numbers. It could do all those things and even get you that edge over the competition.

    Read more

    14 Oct 2019 admin

    Authors: Scott Fincher, Paolo Tamagnini, Maarit Widmann

    Guided Visualization and Exploration

    Whether we are experienced data scientists or business analysts, one of our daily tasks is to extract the relevant information from our data easily and smoothly, regardless of the kind of analysis we are facing.

    A good practice for this is to use data visualizations: charts and graphs that visually summarize the complexity in the data. The required expertise for data visualization can be divided into two main areas:

    • The ability to correctly prepare and select a subset of the dataset columns and visualize them in the right chart
    • The ability to interpret the visual results and make the right business decisions based on what is displayed

    Read more

    07 Oct 2019 Maarit

    Authors: Alfredo Roccato (Data Science Trainer and Consultant) and Maarit Widmann (KNIME)

    Wheeling like a hamster in the data science cycle? Don’t know when to stop training your model?

    Model evaluation is an important part of a data science project and it’s exactly this part that quantifies how good your model is, how much it has improved from the previous version, how much better it is than your colleague’s model, and how much room for improvement there still is.

    In this series of blog posts, we review different scoring metrics: for classification, numeric prediction, unbalanced datasets, and other similar more or less challenging model evaluation problems.

    Read more

    30 Sep 2019 admin

    Author: Angus Veitch

    KNIME: a gateway to computational social science and digital humanities

    I discovered KNIME by chance when I started my PhD in 2014. This discovery changed the course of my PhD and my career. Well, who knows: perhaps I would have eventually learned how to do things like text processing, topic modelling and named entity extraction in R or Python. But with no previous programming experience, I did not feel ready to take the plunge into those platforms. KNIME gave me the opportunity to learn a new skill set while still having time to think and write about what the results actually meant in the context of media studies and social science, which was the subject of my PhD research.

    KNIME is still my go-to tool for data analysis of all kinds, textual and otherwise. I use it not only to analyse contemporary text data from news and social media, but to analyse historical texts as well. In fact, I think the accessibility of KNIME makes it the perfect tool for scholars in the field known as the digital humanities, where computational methods are being applied to the study of history, literature and art.

    Read more

    23 Sep 2019 admin

    The task: train and evaluate a simple time series model using a random forest of regression trees and the NYC Yellow taxi dataset

    Authors: Andisa Dewi and Rosaria Silipo

    I think we all agree that knowing what lies ahead in the future makes life much easier. This is true for life events as well as for prices of washing machines and refrigerators, or the demand for electrical energy in an entire city. Knowing how many bottles of olive oil customers will want tomorrow or next week allows for better restocking plans in the retail store. Knowing the likely increase in the price of gas or diesel allows a trucking company to better plan its finances. There are countless examples where this kind of knowledge can be of help.

    Read more

    16 Sep 2019 Corey

    Author: Corey Weisinger

    You’ve always been able to fine-tune and modify your networks in KNIME Analytics Platform by using the Deep Learning Python nodes such as the DL Python Network Editor or DL Python Learner, but with recent updates to KNIME Analytics Platform and the KNIME Deep Learning Keras Integration, there are more tools available to do this without leaving the familiar KNIME GUI.

    Read more
