Text Processing

Document clustering

This workflow shows how to import textual data, preprocess documents by filtering and stemming, transform documents into a bag of words and document vectors, and finally cluster the documents based on their numerical representation.
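The steps described above can be sketched in plain Python. This is a minimal illustration, not the workflow itself: the stop-word list, the suffix-stripping "stemmer", the sample documents, and the seed-based k-means are all assumptions chosen to keep the example small.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and suffix rules, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "on"}
SUFFIXES = ("ing", "ed", "s")


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and crudely stem each token."""
    stems = []
    for tok in re.findall(r"[a-z]+", text.lower()):
        if tok in STOP_WORDS:
            continue
        for suffix in SUFFIXES:
            # Only strip a suffix when at least three characters remain.
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return stems


def to_vectors(docs):
    """Turn documents into term-frequency vectors over a shared vocabulary."""
    bags = [Counter(preprocess(d)) for d in docs]
    vocab = sorted(set().union(*bags))
    return [[bag[term] for term in vocab] for bag in bags], vocab


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def kmeans(vectors, seeds, iterations=10):
    """Cluster vectors by cosine similarity, starting from the given seed indices."""
    centroids = [list(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its most similar centroid.
        labels = [max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Recompute each centroid as the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical sample documents.
docs = [
    "The cat chased the mice",
    "Cats and mice are playing",
    "Stocks fell on the market",
    "The market is trading stocks",
]
vectors, vocab = to_vectors(docs)
labels = kmeans(vectors, seeds=[0, 2])
```

With these sample documents, the two animal sentences and the two market sentences end up in separate clusters. A real stemmer and a weighting scheme such as TF-IDF would normally replace the crude choices made here.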

Sentiment Classification

This workflow shows how to import text from a CSV file, convert it to documents, preprocess the documents, and transform them into numerical document vectors. Finally, a predictive model is trained on the vectors to predict the sentiment class of the documents.
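As a rough illustration of the same pipeline outside the workflow, the sketch below reads a tiny CSV, turns each row into a bag of words, and trains a multinomial naive Bayes classifier. The CSV content, the column names `text` and `sentiment`, and the choice of naive Bayes are assumptions; the workflow's own learner is not specified here.

```python
import csv
import io
import math
import re
from collections import Counter, defaultdict

# Hypothetical CSV content; the column names "text" and "sentiment" are assumptions.
CSV_DATA = """text,sentiment
great movie loved it,positive
wonderful acting and great story,positive
terrible plot hated it,negative
boring and terrible acting,negative
"""


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(rows):
    """Count word occurrences per class for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in rows:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen during training
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)


rows = [(r["text"], r["sentiment"]) for r in csv.DictReader(io.StringIO(CSV_DATA))]
model = train(rows)
```

Calling `predict(model, "a great story")` classifies the new text using the word counts learned from the four training rows.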

Sentiment Classification with NGrams

This workflow shows how to import text from a CSV file, convert it to documents, preprocess the documents, and transform them into numerical document vectors consisting of single-word and 2-gram features.
Finally, two predictive models are trained on the vectors to predict the sentiment class of the documents. The two models are then compared via an ROC curve.
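To make the feature construction and the comparison concrete, here is a small sketch in plain Python. It counts single-word and 2-gram features for one tokenized document and summarizes each model's ROC curve by its area (AUC). The function names and the model scores are hypothetical and only serve the illustration.

```python
from collections import Counter


def ngram_features(tokens, max_n=2):
    """Count all n-grams of length 1..max_n; multi-word n-grams are space-joined."""
    features = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i:i + n])] += 1
    return features


def roc_auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive example gets the higher score (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


features = ngram_features("not good at all".split())

# Hypothetical scores from two models on the same four documents.
labels = [1, 1, 0, 0]
auc_a = roc_auc(labels, [0.9, 0.4, 0.6, 0.2])
auc_b = roc_auc(labels, [0.9, 0.8, 0.3, 0.1])
```

The 2-gram `"not good"` shows why the extra features can help: a single-word model sees only `"good"`, while the 2-gram keeps the negation attached to it.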

Fuzzy String Matching

This workflow demonstrates how to apply fuzzy matching to two strings. The string matcher was designed for exactly this task, but it is limited to the Levenshtein distance. You can edit the parameters of the Levenshtein distance in the configuration dialog.
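For readers unfamiliar with the Levenshtein distance, the following sketch computes it with the standard dynamic-programming recurrence. The cost parameters and the `best_match` helper are assumptions added for illustration; they are not taken from the string matcher's configuration dialog.

```python
def levenshtein(a, b, insert_cost=1, delete_cost=1, substitute_cost=1):
    """Weighted edit distance between strings a and b, computed row by row."""
    # Distance from the empty prefix of a to each prefix of b.
    previous = [j * insert_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        current = [i * delete_cost]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + delete_cost,       # delete ca from a
                current[j - 1] + insert_cost,    # insert cb into a
                previous[j - 1] + (0 if ca == cb else substitute_cost),
            ))
        previous = current
    return previous[-1]


def best_match(query, candidates):
    """Return the candidate with the smallest distance to the query."""
    return min(candidates, key=lambda c: levenshtein(query, c))
```

For example, `levenshtein("kitten", "sitting")` needs two substitutions and one insertion, and `best_match` picks the closest entry from a list of candidate strings.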

Discover Secret Ingredient

On one side, we have a list of cookie recipes saved in a Word document on the local machine. On the other side, we have a web page with another, new recipe, retrieved through web crawling. We compare the ingredient lists from both sides to discover the secret ingredient for the ultimate Christmas cookie. ... and yes! They blend.
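One simple way to picture the comparison is as a set difference between ingredient lists. The sketch below assumes the ingredients have already been extracted from both sources; the lists and function names are hypothetical, and the workflow itself may compare the recipes differently.

```python
def normalize(ingredients):
    """Lowercase and strip whitespace so the same ingredient compares equal."""
    return {item.strip().lower() for item in ingredients}


def unique_ingredients(known_recipes, new_recipe):
    """Ingredients in the new recipe that appear in none of the known recipes."""
    known = set().union(*(normalize(r) for r in known_recipes))
    return normalize(new_recipe) - known


# Hypothetical ingredient lists, for illustration only.
known_recipes = [
    ["Flour", "Sugar", "Butter", "Eggs"],
    ["flour", "sugar", "butter", "cinnamon"],
]
new_recipe = ["Flour", "Sugar", "Butter", "Cardamom "]
secret = unique_ingredients(known_recipes, new_recipe)
```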
