This workflow computes summary measures for the forum, such as the number of users, the number of posts, the number and percentage of answered vs. unanswered posts, the number and percentage of KNIME answers vs. community answers, and more.
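As a rough illustration of these measures outside KNIME, here is a minimal Python sketch. The `Post` record, its field names, and the `"knime"` / `"community"` labels are hypothetical and not taken from the workflow. The sketch also assumes that answered/unanswered percentages are relative to all posts and that KNIME/community percentages are relative to answered posts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    author: str
    # None means unanswered; "knime" or "community" are hypothetical labels.
    answered_by: Optional[str]


def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def forum_measures(posts):
    """Compute simple counts and percentages over a list of posts."""
    total = len(posts)
    answered = sum(1 for p in posts if p.answered_by is not None)
    knime = sum(1 for p in posts if p.answered_by == "knime")
    community = sum(1 for p in posts if p.answered_by == "community")
    return {
        "users": len({p.author for p in posts}),
        "posts": total,
        "answered": answered,
        "unanswered": total - answered,
        "pct_answered": percent(answered, total),
        "pct_unanswered": percent(total - answered, total),
        "knime_answers": knime,
        "community_answers": community,
        # Assumption: shares of answers are taken over answered posts only.
        "pct_knime_answers": percent(knime, answered),
        "pct_community_answers": percent(community, answered),
    }
```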
This workflow performs supervised topic classification on the forum posts. The training set consists of the description files of the KNIME nodes. The topic classes are the nodes' top-level categories in the Node Repository (IO, Data Manipulation, etc.) from KNIME versions prior to 3.0. A model is trained on this set and then applied to the forum posts. For each post, the three topics with the highest predicted probability are chosen as its topic classes. A Tree Ensemble is used as the classification model.
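The top-three selection step can be sketched in Python as follows. This is only an illustration of the ranking, not the workflow's implementation: the function name `top_k_topics` is hypothetical, the probabilities are made up, and any category names beyond IO and Data Manipulation are illustrative placeholders.

```python
def top_k_topics(class_probabilities, k=3):
    """Return the k class names with the highest predicted probability."""
    ranked = sorted(
        class_probabilities.items(),
        key=lambda item: item[1],  # sort by probability
        reverse=True,              # highest first
    )
    return [topic for topic, _ in ranked[:k]]


# Made-up class probabilities for a single forum post.
example = {
    "IO": 0.10,
    "Data Manipulation": 0.40,
    "Mining": 0.30,
    "Views": 0.15,
    "Flow Control": 0.05,
}
print(top_k_topics(example))  # ['Data Manipulation', 'Mining', 'Views']
```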
This workflow demonstrates how to parse the KNIME Forum. It works in stages. First, we read the list of topics from the front page of the forum. Next, we visit each category separately. In each category, we search for all topics that are newer than 9 days. This limit exists mainly to speed up the workflow. If a next page is available, its topics are also taken into consideration. After parsing the different thread pages, we read all information from the individual topics.
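The paging and age filter described above can be sketched in Python. This is a simplified model under stated assumptions: the page structure (a dict with `topics` and `next` keys), the `fetch_page` callback, and all names are hypothetical, and no real forum is contacted. Pages are supplied from memory so that the logic can be followed in isolation.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=9)  # the 9-day limit mentioned in the description


def recent_topics(first_page, fetch_page, now):
    """Collect titles of topics newer than MAX_AGE, following next-page links.

    fetch_page is a hypothetical callback that returns a dict of the form
    {"topics": [{"title": str, "created": datetime}, ...], "next": id or None}.
    """
    cutoff = now - MAX_AGE
    found = []
    page_id = first_page
    while page_id is not None:
        page = fetch_page(page_id)
        found.extend(
            topic["title"] for topic in page["topics"] if topic["created"] > cutoff
        )
        page_id = page.get("next")  # None ends the loop on the last page
    return found


# Made-up in-memory pages standing in for one forum category.
pages = {
    "p1": {
        "topics": [
            {"title": "A", "created": datetime(2015, 1, 19)},
            {"title": "B", "created": datetime(2015, 1, 5)},
        ],
        "next": "p2",
    },
    "p2": {
        "topics": [{"title": "C", "created": datetime(2015, 1, 12)}],
        "next": None,
    },
}
```

Calling `recent_topics("p1", pages.__getitem__, datetime(2015, 1, 20))` walks both pages and keeps only the topics created after the cutoff.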
This workflow draws the network of contacts in the forum and the word cloud of the posts for each forum section.