RSS Feed Reader
This workflow downloads the most recent New York Times news feeds, extracts the titles and text, recognizes named entities in the news, and visualizes these named entities as a Tag Cloud.
The workflow builds a text stream visualization of a story, showing how the frequency of each character's mentions fluctuates as the story progresses. A Stacked Area Chart is used for the visualization, and the story is Little Red Riding Hood, one of the most popular tales written by the Brothers Grimm.
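The counting behind such a text stream can be sketched in a few lines of Python. This is only an illustration, not the workflow's implementation: the function name `mention_counts`, the equal-sized segmentation, and the toy story are all assumptions, and only single-word names without attached punctuation are handled.

```python
def mention_counts(text, characters, n_segments):
    """Count how often each character name occurs in consecutive story segments."""
    words = text.lower().split()
    # Ceiling division so that all words fall into at most n_segments chunks.
    size = max(1, -(-len(words) // n_segments))
    segments = [words[i:i + size] for i in range(0, len(words), size)]
    return {name: [segment.count(name) for segment in segments]
            for name in characters}


# Hypothetical toy "story" of 12 words, split into 3 segments of 4 words each.
story = "wolf met girl girl ran wolf wolf ate grandmother hunter came hunter"
counts = mention_counts(story, ["wolf", "girl", "hunter"], n_segments=3)
```

Each list in `counts` is one series of the stacked area chart, with one value per story segment.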
The workflow shows how to use a Document Vector Adapter node in order to adjust the feature space of a second set of documents to make it identical to the feature space of a first, reference set of documents.
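The idea of adapting a feature space can be illustrated outside KNIME with plain Python. The sketch below is an assumption-laden stand-in, not the node's implementation: `document_vectors` is a hypothetical helper that uses whitespace tokenization and raw term counts, and the example sentences are invented.

```python
from collections import Counter


def document_vectors(docs, vocabulary=None):
    """Return (vocabulary, vectors) of term counts for the given documents.

    If a vocabulary is passed in, it is reused as the feature space:
    unknown terms are dropped and missing terms get a count of 0.
    """
    tokenized = [doc.lower().split() for doc in docs]
    if vocabulary is None:
        vocabulary = sorted({term for tokens in tokenized for term in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


reference_docs = ["the wolf ate grandmother", "the hunter found the wolf"]
second_docs = ["the wolf met a girl"]

# The reference set defines the feature space ...
reference_vocab, reference_vectors = document_vectors(reference_docs)
# ... and the second set is vectorized against that same vocabulary.
_, adapted_vectors = document_vectors(second_docs, vocabulary=reference_vocab)
```

After adaptation, both sets have the same columns in the same order, so a model trained on the reference vectors can be applied to the second set.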
This workflow shows a simple example of how to lemmatize terms in documents using the Stanford Lemmatizer node. It also shows what exactly the Lemmatizer does to the input document terms in comparison to other preprocessing nodes, such as the Snowball Stemmer.
Here we use word embedding instead of one-hot encoding, using a Word2Vec Learner node. The hidden layer size is set to 10, which produces an embedding with very small dimensionality. The output of the Word2Vec Learner node is a model. The Vocabulary Extractor node extracts the words from the model vocabulary and provides their embeddings in the form of a collection. The collection items are isolated using a Split Collection Column node, and the distances between the word embedding vectors are calculated.
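The last step, computing distances between embedding vectors, can be sketched as follows. The description does not say which distance measure the workflow uses; cosine distance is assumed here, and the three-dimensional vectors are invented toy values rather than output of a trained model.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical embeddings; a real model with hidden layer size 10
# would yield 10 values per word.
embeddings = {
    "wolf": [1.0, 0.0, 0.0],
    "hunter": [0.0, 1.0, 0.0],
    "wolves": [2.0, 0.0, 0.0],
}

# Pairwise distances between all distinct word pairs.
words = sorted(embeddings)
distances = {
    (a, b): cosine_distance(embeddings[a], embeddings[b])
    for i, a in enumerate(words)
    for b in words[i + 1:]
}
```

Vectors pointing in the same direction get a distance close to 0, and orthogonal vectors get a distance of 1.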
This workflow demonstrates how to apply fuzzy matching to two strings. The String Matcher node was designed for exactly this task, but it is limited to the Levenshtein distance. You can edit the parameters of the Levenshtein distance in the configuration dialog.
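For readers unfamiliar with the Levenshtein distance, here is a minimal Python sketch of the classic dynamic-programming formulation. It assumes unit costs for insertion, deletion, and substitution; `best_match` is a hypothetical helper added for illustration, and neither function is meant to reproduce the node's configurable parameters.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def best_match(query, candidates):
    """Return the candidate with the smallest Levenshtein distance to query."""
    return min(candidates, key=lambda candidate: levenshtein(query, candidate))
```

For example, `best_match("Rumpelstilzkin", ["Rumpelstiltskin", "Rapunzel"])` picks the candidate that needs the fewest edits.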
This workflow shows how to extract topics from text documents using the Topic Extractor node, and how to determine an optimal number of topics using the Elbow method.
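One common way to locate an elbow programmatically is to pick the point that lies farthest from the straight line joining the first and last points of the score curve. The sketch below assumes that criterion; the description does not state which score or elbow rule the workflow applies, and the scores are made-up values for illustration only.

```python
def elbow_index(ks, scores):
    """Return the index of the point farthest from the line through the endpoints."""
    x1, y1 = ks[0], scores[0]
    x2, y2 = ks[-1], scores[-1]

    def line_distance(x, y):
        # Numerator of the point-to-line distance; the denominator is the same
        # for every point, so it does not affect which point is farthest.
        return abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1)

    return max(range(len(ks)), key=lambda i: line_distance(ks[i], scores[i]))


# Hypothetical error scores for 1 to 6 topics (lower is better).
ks = [1, 2, 3, 4, 5, 6]
scores = [100, 60, 35, 30, 27, 25]
best_k = ks[elbow_index(ks, scores)]
```

With these values the curve flattens after three topics, so `best_k` is the point where adding more topics stops paying off.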
This workflow provides two dictionaries that are used for tagging documents. The documents are fairy tales by the Brothers Grimm, and the dictionaries contain names such as "little red riding hood" or "Rumpelstiltskin" to find and tag in the documents.
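Dictionary-based tagging can be approximated in Python with a case-insensitive regular expression. This is a rough sketch under stated assumptions, not the tagger's actual matching logic: `tag_names` is a hypothetical function, and whole-word, case-insensitive matching is an assumption.

```python
import re


def tag_names(text, dictionary):
    """Return (start_offset, matched_text) for every dictionary name found in text."""
    if not dictionary:
        return []
    # Try longer names first so multi-word entries win over shorter overlaps.
    names = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    return [(match.start(), match.group(0)) for match in pattern.finditer(text)]


dictionary = ["little red riding hood", "Rumpelstiltskin"]
text = "Little Red Riding Hood met the wolf; Rumpelstiltskin did not."
tags = tag_names(text, dictionary)
```

Each tuple gives the character offset and the exact text that matched, which is enough to attach a tag to that span of the document.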
This workflow downloads the most recent New York Times news feeds, extracts the titles and text, recognizes named entities in the news, and visualizes these named entities as a Tag Cloud.
This workflow shows how to import textual data, preprocess documents by filtering and stemming, transform documents into a bag of words, and finally visualize them using a Tag Cloud.
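The preprocessing chain can be mimicked in plain Python to show what ends up in the bag of words. Everything here is an illustrative assumption: the stop-word list is a tiny made-up sample, and `naive_stem` is a crude suffix stripper standing in for a real stemmer, not the algorithm the workflow uses.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, far from complete).
STOP_WORDS = {"the", "a", "and", "of", "to", "in"}


def naive_stem(term):
    """Strip one common English suffix; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[:-len(suffix)]
    return term


def bag_of_words(text):
    """Lowercase, tokenize, drop stop words, stem, and count the remaining terms."""
    terms = re.findall(r"[a-z]+", text.lower())
    kept = [naive_stem(term) for term in terms if term not in STOP_WORDS]
    return Counter(kept)


bow = bag_of_words("The wolf walked and the hunter was walking in the woods")
```

The resulting counts are the kind of term weights a tag cloud uses to size each word.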