Indexing & Searching
This feature provides high-performance indexing and searching of large and complex KNIME data tables. It is based on Apache Lucene, an open-source search engine library that is used in many projects.
Indexing
The indexing infrastructure indexes KNIME data tables and supports various data types. Besides basic types such as strings, integers and floating-point numbers, it also supports more complex types such as collections, networks and documents. Support for collection cells makes it possible to search for collection cells that contain specific values. Network indexing makes it possible to search for a particular node or edge across all networks in a KNIME data table. For documents, the full text is indexed, so that a KNIME table can be searched for documents containing certain terms, and the document metadata is indexed as well. Using the metadata, the search results can be further filtered by author, journal and publication date.
Searching
Once the index has been created, it can be queried using a powerful query language. The query syntax is based on the Lucene query syntax and supports, among others, phrase queries, wildcard queries, fuzzy queries, proximity queries, range queries, term boosting, grouping and boolean operators. Each data column of a KNIME data table can be searched separately, or several columns can be combined using boolean operators. This allows complex queries to be expressed that span all columns of the indexed KNIME data table.
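To make the query types listed above more concrete, the following sketch collects one example string per type, written in the classic Lucene query syntax. The column names `title`, `author` and `year` are hypothetical and stand for columns of an indexed table. Whether a range query on a numeric column behaves as shown depends on how that column was indexed, so treat these strings as illustrations rather than as guaranteed input for a particular node.

```python
# Illustrative query strings in classic Lucene query syntax.
# The field names (title, author, year) are hypothetical column names.
example_queries = {
    "phrase": 'title:"text mining"',            # exact phrase in one column
    "wildcard": "title:min*",                   # terms starting with "min"
    "fuzzy": "author:smith~",                   # terms similar to "smith"
    "proximity": 'title:"text mining"~5',       # both words within 5 positions
    "range": "year:[2005 TO 2010]",             # inclusive range
    "boosting": "title:mining^2 author:smith",  # weight the title match higher
    "grouping and boolean": "(title:mining OR title:search) AND NOT author:smith",
}

for kind, query in example_queries.items():
    print(f"{kind:22} {query}")
```

The last entry shows how clauses on different columns can be combined into a single query using grouping and boolean operators.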
Table Indexer
The Table Indexer node creates an index from a KNIME data table. For each table row it creates a document in the index, with index fields representing the columns. This makes it possible to search for values in a specific table column. The node can also store the original data in the index document, so that the original row can be recreated from a result document when the index is queried.
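The row-to-document mapping described above can be sketched with a small toy index. This is not the KNIME or Lucene implementation: the functions `build_index` and `search`, the whitespace tokenization and the sample rows are all assumptions made purely for illustration. The sketch only shows the idea of one document per row, one field per column, and optional storage of the original row.

```python
from collections import defaultdict

def build_index(rows, store_original=True):
    """Toy index: one document per table row, one field per column (hypothetical helper)."""
    postings = defaultdict(set)  # (column, term) -> ids of documents containing the term
    stored = {}                  # document id -> copy of the original row (optional)
    for doc_id, row in enumerate(rows):
        for column, value in row.items():
            # Assumption: simple lowercase whitespace tokenization
            for term in str(value).lower().split():
                postings[(column, term)].add(doc_id)
        if store_original:
            stored[doc_id] = dict(row)
    return postings, stored

def search(postings, stored, column, term):
    """Return the stored rows whose given column contains the term."""
    hits = sorted(postings.get((column, term.lower()), ()))
    return [stored[doc_id] for doc_id in hits if doc_id in stored]

# Made-up sample table with two columns
rows = [
    {"label": "first sample row", "count": 3},
    {"label": "second example entry", "count": 7},
]
postings, stored = build_index(rows)
result = search(postings, stored, "label", "example")
print(result)
```

Because each term is recorded together with its column, the same term can be searched for in one column without matching rows where it occurs only in another. If `store_original` is set to `False`, matches can still be found, but the original rows cannot be rebuilt from the index alone.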
Index Query
The Index Query node queries a given index. The query syntax is based on the Lucene query syntax, which supports, among others, wildcard queries, fuzzy queries, proximity queries, range queries and boolean operators, so that advanced queries can be assembled.