With KNIME 2.10, new distance nodes have been released that allow the application of various distance measures in combination with the clustering nodes k-Medoids and Hierarchical Clustering, the Similarity Search node, and the Distance Matrix Pair Extractor node. Besides numerical distances, such as p-norm distances (Euclidean, Manhattan, etc.) and cosine distance, string distances as well as byte vector and bit vector distances are provided. On top of that, distances can be aggregated. If you still can't find the distance function you are looking for, you can easily implement a customized distance with only one or two lines of Java code, using the Java Distance node. To get the nodes that make use of the new distances, install the "KNIME Distance Matrix" extension (KNIME & Extensions -> KNIME Distance Matrix).
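To give a feel for what such a customized distance might look like, here is a minimal sketch in plain Java. The method name, the weight parameter, and the standalone-class wrapping are illustrative assumptions for this post, not the exact snippet syntax of the Java Distance node; in the node itself you would typically write only the one or two lines in the method body.

```java
// Sketch of a custom distance of the kind you might write for the
// Java Distance node, wrapped in a standalone class so it can be run.
public class CustomDistance {

    // A weighted Manhattan distance: per-dimension |a - b|, each term
    // scaled by an (illustrative) weight, then summed.
    static double distance(double[] a, double[] b, double[] w) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += w[i] * Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {1.0, 2.0};
        double[] q = {4.0, 6.0};
        double[] w = {1.0, 0.5};
        // 1.0 * |1-4| + 0.5 * |2-6| = 3.0 + 2.0 = 5.0
        System.out.println(distance(p, q, w));
    }
}
```

The body of `distance` is the one-or-two-line part; everything else is boilerplate that the node provides for you.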
Familiar from common applications such as demographic analysis, visualizing election results, and mapping disease outbreaks, choropleths are an important visualization technique for aggregating spatial data. Today I want to discuss some techniques for generating such graphics in KNIME. So, to get us started, let's look at a very simple workflow to build such a visualization.
If you are using the KNIME Server with the KNIME WebPortal in your organization you can easily share published workflows with your colleagues by using the URL parameters of the WebPortal. Not only is this a great way of linking directly to a specific workflow, but it also gives you the possibility to embed the WebPortal somewhere else in your corporate environment.
This powerful feature was introduced with KNIME Server 3.7 and some enhancements were added with version 3.8.
The new KNIME Twitter nodes allow you to search for Tweets on Twitter, retrieve information about users, post Tweets via KNIME and much more.
Programming is fun. At least many aspects of programming. What's usually not considered funny is writing documentation and... testing. For the former I agree, but I will show you that the latter can also be fun. Even (or especially) with KNIME, and even more with some of the nice additions to the testing framework in 2.10.
KNIME has made a strong case for openness: deliver a complete platform that’s free and open source; then make money by adding functionality to make that platform easier and more efficient to use. But how does that work in reality? The just-released KNIME version 2.10, complemented with the next-generation commercial extensions to the KNIME Analytics Platform, gives us a taste of what they’re cooking up for us.
Just in time for summer, the new version KNIME 2.10 has been released!
This release marks a new era for KNIME. It features a series of commercial products to incorporate big data strategies into company analytics, enhance personal productivity, protect partners' intellectual property, and fit all collaboration and enterprise needs, from the simple requests of small companies to the challenging ones of major multinational enterprises.
PMML, Ensembles and KNIME are three hot topics, each worthwhile in its own right. However, when combined, these three pieces offer an even more powerful approach to data analytics than each one alone. We would like to take the opportunity in this post to tell you more about these three puzzle pieces and, more importantly, about how to put them together.
By Phil Winters
It is indeed a well-deserved honor that KNIME’s leadership has been re-confirmed in Gartner’s famous ‘Magic Quadrant’ for analytic platforms – thanks in large part to the KNIME customers who acted as references. But is this just another award or an indicator of something much more significant? I think the latter.
The Time Series Prediction Problem
Time series prediction means predicting a value at time t, x(t), given its past values, x(t-1), x(t-2), …, x(t-n). How do you implement a model for time series prediction in KNIME? All you need is a Lag Column node!
For example, I have a time series of daily data x(t) and I want to use the past 3 days x(t-1), x(t-2), x(t-3) to predict the current value x(t). This is an auto-prediction problem. Introducing exogenous variables, like y(t) and z(t), into the prediction model turns an auto-prediction problem into a multivariate prediction problem. Let's stick with auto-prediction. What we will build is easily extended to multivariate prediction.
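To make the lagging step concrete, here is a small Java sketch of the transformation the Lag Column node performs for this 3-lag setup: each training row pairs the target x(t) with its three past values. The class and method names are illustrative assumptions, not KNIME API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the lagging transformation behind a 3-lag auto-prediction setup:
// each output row holds [x(t-1), x(t-2), x(t-3), x(t)].
public class LagColumns {

    // Build one row per time point t >= nLags: the nLags past values
    // followed by the current value x(t), which is the prediction target.
    static List<double[]> lag(double[] series, int nLags) {
        List<double[]> rows = new ArrayList<>();
        for (int t = nLags; t < series.length; t++) {
            double[] row = new double[nLags + 1];
            for (int k = 1; k <= nLags; k++) {
                row[k - 1] = series[t - k]; // x(t-k)
            }
            row[nLags] = series[t]; // x(t), the value to predict
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        double[] daily = {10, 11, 12, 13, 14, 15};
        for (double[] row : lag(daily, 3)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The first printed row is [12.0, 11.0, 10.0, 13.0]: the three lagged inputs for t = 3, plus the target x(3) = 13. Feeding rows of this shape into any learner node gives you the auto-prediction model; adding lagged columns of y(t) and z(t) in the same way would turn it into the multivariate case.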