This workflow prepares a data set using the Local Big Data Environment for the Data Chefs Battle: Chemistry vs Biology. It collects the results of biological experiments from the PubChem database API and cleans them up.
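As a rough illustration of the collect-and-clean step outside the workflow tool, here is a minimal Python sketch. It is not the workflow's actual logic. The URL pattern follows the PubChem PUG REST assay endpoint as we understand it, the column names `PUBCHEM_CID` and `PUBCHEM_ACTIVITY_OUTCOME` are assumed from PubChem's assay CSV exports, and the cleaning rules (drop incomplete rows, drop duplicates, lowercase the outcome) are hypothetical choices made for the example. The sample data is invented and no network call is made.

```python
import csv
import io

# Assumed base URL of the PubChem PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def assay_csv_url(aid: int) -> str:
    """Build the (assumed) URL for downloading one assay's results as CSV."""
    return f"{PUG_REST}/assay/aid/{aid}/CSV"


def clean_assay_rows(csv_text: str) -> list:
    """Keep rows with a numeric compound ID and a non-empty outcome, once each."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    cleaned = []
    for row in reader:
        cid = (row.get("PUBCHEM_CID") or "").strip()
        outcome = (row.get("PUBCHEM_ACTIVITY_OUTCOME") or "").strip()
        if not cid.isdigit() or not outcome:
            continue  # incomplete record: no compound ID or no outcome
        key = (cid, outcome.lower())
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        cleaned.append({"cid": int(cid), "outcome": outcome.lower()})
    return cleaned


# Invented sample in the assumed column layout, used instead of a live download.
SAMPLE = """PUBCHEM_CID,PUBCHEM_ACTIVITY_OUTCOME
2244,Active
2244,Active
,Inactive
3672,
5090,Inactive
"""

if __name__ == "__main__":
    print(assay_csv_url(1000))
    print(clean_assay_rows(SAMPLE))
```

In a real run, the text passed to `clean_assay_rows` would come from fetching `assay_csv_url(aid)`, and the cleaning rules would need to match whatever the downstream analysis expects.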
This flow demonstrates the functionality of the MoSS node for substructure search in molecules.
Automated access to disease information is an important goal of information extraction and text mining efforts. Here, we want to create a model that learns to recognize disease names in a set of documents from biomedical literature. We automatically retrieve literature from PubMed and use these documents to train our model on an initial set of disease names (the dictionary). We then score the resulting model and check whether it can extract new information by comparing the detected disease names to our initial set.
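The dictionary-based labeling and the final comparison step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow's implementation: the function names `tag_with_dictionary` and `compare_to_dictionary` are hypothetical, matching is a simple case-insensitive whole-word search, and the detected names passed to the comparison would in practice come from the trained model.

```python
import re


def tag_with_dictionary(text, dictionary):
    """Return the dictionary terms that occur in text as whole words."""
    found = set()
    for term in dictionary:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(term.lower())
    return found


def compare_to_dictionary(detected, dictionary):
    """Split detected names into those already known and those that are new."""
    known = {term.lower() for term in dictionary}
    names = {name.lower() for name in detected}
    return {"known": sorted(names & known), "new": sorted(names - known)}


if __name__ == "__main__":
    # Invented example data; a real run would use the initial disease dictionary.
    dictionary = ["asthma", "type 2 diabetes", "malaria"]
    sentence = "Patients with asthma and Type 2 Diabetes were enrolled."
    print(tag_with_dictionary(sentence, dictionary))

    # Hypothetical model output: one known name and one not in the dictionary.
    print(compare_to_dictionary(["Asthma", "sarcoidosis"], dictionary))
```

Names that land in the `new` list are candidates for new information. They still need manual review, since a name missing from the dictionary may be a false positive rather than a genuine disease mention.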