Sentiment analysis of free-text documents is a common task in the field of text mining. In sentiment analysis predefined sentiment labels, such as "positive" or "negative" are assigned to texts. Texts (here called documents) can be reviews about products or movies, articles, etc.
In this blog post we show an example of assigning predefined sentiment labels to documents, using the KNIME Text Processing extension in combination with traditional KNIME learner and predictor nodes.
A set of 2000 documents has been sampled from the trainings set of the Large Movie Review Dataset v1.0. The Large Movie Review Dataset v1.0 contains 50000 English movie reviews along with their associated sentiment labels "positive" and "negative". For details about the data set see http://ai.stanford.edu/~amaas/data/sentiment/. We sampled 1000 documents of the positive group and 1000 documents of the negative group. The goal here is to assign the correct sentiment label to each document.