The IO category contains parser nodes that read texts from various formats, such as DML, SDML, PubMed (XML format), PDF, Word, and flat files. Parsing and reading the data into KNIME is the first step that has to be accomplished. The output of every parser node is a data table consisting of one column of DocumentCells, each of which contains one document. This list of documents can then be used as input by all nodes of the enrichment category. DML and SDML are XML-based formats for representing texts in a structured way. Texts available in other XML-based formats can easily be transformed into SDML using the XML nodes provided by the KNIME XML plugin.
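To illustrate the "one document per row" output described above, here is a minimal Python sketch that is independent of KNIME. It is not the implementation of any parser node. It assumes a small inline sample that uses element names from the public PubMed XML format (PubmedArticle, ArticleTitle, AbstractText). The function name parse_documents and the dictionary keys are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny inline sample; a real PubMed export contains many more fields.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <ArticleTitle>First title</ArticleTitle>
        <Abstract><AbstractText>First abstract.</AbstractText></Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>2</PMID>
      <Article>
        <ArticleTitle>Second title</ArticleTitle>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def parse_documents(xml_text):
    """Return one record per article, analogous to one document per table row.

    Hypothetical helper for illustration only; not part of KNIME.
    """
    root = ET.fromstring(xml_text)
    docs = []
    for article in root.iter("PubmedArticle"):
        # findtext returns the default when the element is missing.
        title = article.findtext(".//ArticleTitle", default="")
        # An abstract may consist of several AbstractText sections.
        abstract = " ".join(
            (node.text or "").strip() for node in article.iter("AbstractText")
        )
        docs.append({"title": title, "abstract": abstract})
    return docs


if __name__ == "__main__":
    for row, doc in enumerate(parse_documents(SAMPLE)):
        print(row, doc["title"], "|", doc["abstract"])
```

Each returned record plays the role of one document in the single-column table. Inside KNIME, the parser nodes handle this step and produce DocumentCells directly.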


dml.dtd (1.63 KB)
sdml.dtd (1.3 KB)

