Schools Wikipedia



This example demonstrates the use of the network mining plug-in on a directed network created from the 2008/9 Wikipedia Selection for schools (Schools Wikipedia), a free, hand-checked, non-commercial selection of the English Wikipedia funded by SOS Children's Villages. It was created to provide a child-safe encyclopedia. It has about 5,500 articles and is about the size of a twenty-volume encyclopedia (34,000 images and 20 million words). The encyclopedia contains 154 subjects, which are grouped into 16 main subjects such as countries, religion and science.

The network was created from the October 2008 version of Schools Wikipedia. Each article is represented by a node and each subject by a partition. Every article is assigned to one or more partitions according to its subjects. Each hyperlink is represented by a directed link, with the article that contains the hyperlink as the source node and the referenced article as the target node.
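The mapping described above can be sketched in a few lines of plain Python. This is only an illustration of the data model, not the way the plug-in builds the network: the toy `articles` dictionary, its contents and the function name `build_network` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy data: article title -> (subjects, titles it links to).
articles = {
    "Moon": ({"Space"}, ["Earth", "Sun"]),
    "Earth": ({"Space", "Geography"}, ["Moon", "Sun"]),
    "Sun": ({"Space"}, []),
}


def build_network(articles):
    """Return (nodes, edges, partitions) for a directed article network."""
    nodes = set(articles)           # one node per article
    edges = set()                   # (source, target) pairs, one per hyperlink
    partitions = defaultdict(set)   # subject -> set of member articles
    for title, (subjects, links) in articles.items():
        # An article joins one partition per assigned subject.
        for subject in subjects:
            partitions[subject].add(title)
        # The linking article is the source, the referenced one the target.
        for target in links:
            if target in nodes:     # ignore links that leave the selection
                edges.add((title, target))
    return nodes, edges, dict(partitions)


nodes, edges, partitions = build_network(articles)
```

Here "Earth" belongs to two partitions because it has two subjects, which mirrors the statement that an article can be assigned to more than one partition.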

Notice: The network consists of more than 5,000 nodes and 200,000 edges. To process the workflows efficiently, increase the memory available to KNIME to 1 GB. For instructions on increasing the Java heap space, see the FAQ section.


Network Analysis (01201001_networkAnalysisSchoolsWiki)


Network Analysis Workflow

This workflow demonstrates how to perform some basic network analysis tasks, such as finding the largest connected components, finding the top 50 nodes with the highest in-degree or out-degree, and finding the shortest path between two nodes using the Shortest Path node.
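For readers who want to see what these three tasks compute, the following is a minimal standard-library sketch. It is not the implementation used by the KNIME nodes: the toy edge list and the function names are hypothetical, components are treated as weakly connected (edge direction ignored), and the shortest path is an unweighted breadth-first search that follows edge direction. Both of those choices are assumptions.

```python
from collections import Counter, deque

# Hypothetical toy edge list of (source, target) pairs.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("E", "F")]


def weak_components(edges):
    """Connected components ignoring edge direction, largest first."""
    adj = {}
    for s, t in edges:
        adj.setdefault(s, set()).add(t)
        adj.setdefault(t, set()).add(s)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    comp.add(nb)
                    queue.append(nb)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)


def top_degree(edges, k, direction="in"):
    """The k nodes with the highest in-degree or out-degree."""
    idx = 1 if direction == "in" else 0
    return Counter(edge[idx] for edge in edges).most_common(k)


def shortest_path(edges, source, target):
    """Unweighted shortest directed path via BFS, or None if unreachable."""
    succ = {}
    for s, t in edges:
        succ.setdefault(s, []).append(t)
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:   # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in succ.get(node, []):
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None
```

On the real network, `top_degree(edges, 50, "in")` would correspond to the "top 50" task. Because links are directed, a path from one article to another does not imply a path back.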


Partition Analysis (01201002_partitionAnalysisSchoolsWiki)


Partition Analysis Workflow

This workflow demonstrates how to perform some basic partition analysis tasks, such as filtering nodes that belong to a specific partition or analyzing the connectivity between partitions of the Schools Wikipedia graph using the Partition Graph Creator node.
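The two partition tasks can be illustrated with a short plain-Python sketch. It does not reproduce the behavior of the Partition Graph Creator node: the toy data, the function names and the weighting rule (count every article link whose endpoints lie in two different partitions) are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical toy data: partition name -> member articles, plus article links.
partitions = {
    "Space": {"Moon", "Earth", "Sun"},
    "Geography": {"Earth", "Nile"},
}
edges = [("Moon", "Earth"), ("Earth", "Sun"), ("Nile", "Earth"), ("Earth", "Nile")]


def filter_by_partition(edges, members):
    """Keep only the links whose source and target are both in the partition."""
    return [(s, t) for s, t in edges if s in members and t in members]


def partition_graph(edges, partitions):
    """Count article links between each ordered pair of distinct partitions."""
    node_parts = {}
    for name, members in partitions.items():
        for node in members:
            node_parts.setdefault(node, set()).add(name)
    weights = Counter()
    for s, t in edges:
        # An article in several partitions contributes to each of them.
        for ps in node_parts.get(s, ()):
            for pt in node_parts.get(t, ()):
                if ps != pt:
                    weights[(ps, pt)] += 1
    return weights


space_links = filter_by_partition(edges, partitions["Space"])
weights = partition_graph(edges, partitions)
```

The resulting `weights` counter can be read as a small directed graph whose nodes are subjects and whose edge weights indicate how strongly one subject links to another.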