Once this has been completed and implemented, the actual algorithm for equidistant binning can be written. The algorithm operating on the data must be placed in the execute method. In this example only one column is appended to the original data. For this purpose the so-called ColumnRearranger is used. It requires a CellFactory, which returns the appended cells for a given row.
...
// instantiate the cell factory
CellFactory cellFactory = new NumericBinnerCellFactory(
        createOutputColumnSpec(), splitPoints, colIndex);
// create the column rearranger
ColumnRearranger outputTable = new ColumnRearranger(
        inData[IN_PORT].getDataTableSpec());
// append the new column
outputTable.append(cellFactory);
...
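The snippet above passes splitPoints to the cell factory without showing where they come from. As a rough sketch only, and assuming the minimum and maximum of the selected column are already known, equally wide bins could be described by their upper bounds as follows. The class EquidistantBins and the method upperBounds are hypothetical names chosen for illustration; they are not part of the KNIME API, and the original node may compute its split points differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the KNIME API. */
final class EquidistantBins {

    private EquidistantBins() {
        // static helper only
    }

    /**
     * Returns the upper bound of each of numBins equally wide bins
     * that together cover the interval [min, max].
     */
    static List<Double> upperBounds(double min, double max, int numBins) {
        if (numBins < 1) {
            throw new IllegalArgumentException("numBins must be at least 1");
        }
        if (!(min < max)) {
            throw new IllegalArgumentException("min must be smaller than max");
        }
        double width = (max - min) / numBins;
        List<Double> bounds = new ArrayList<>(numBins);
        for (int i = 1; i < numBins; i++) {
            bounds.add(min + i * width);
        }
        // use max itself as the last bound so that rounding errors
        // cannot push the largest value out of the final bin
        bounds.add(max);
        return bounds;
    }
}
```

Calling EquidistantBins.upperBounds(0.0, 10.0, 5) gives five bounds, one per bin, with the last bound equal to the maximum.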
Once the ColumnRearranger has been created, it is passed together with the input table to the ExecutionContext to create a BufferedDataTable, which is returned by the execute method, i.e. provided at the outport. Each node buffers its data in a BufferedDataTable. The ColumnRearranger is used to avoid buffering the same data redundantly: only the appended column is buffered in our node. That is why the BufferedDataTable has to be retrieved from the ExecutionContext:
...
// and create the actual output table
BufferedDataTable bufferedOutput = exec.createColumnRearrangeTable(
inData[IN_PORT], outputTable, exec);
// return it
return new BufferedDataTable[]{bufferedOutput};
...
The CellFactory required here is implemented as a NumericBinnerCellFactory. It extends the SingleCellFactory and only implements the getCell method. The value of the selected column in the passed row is compared against the upper bounds of the bins to find the bin that contains it, and the number of that bin is returned as a DataCell. A missing value, or a value above the last upper bound, yields a missing cell.
/**
 * @see org.knime.core.data.container.SingleCellFactory#getCell(
 * org.knime.core.data.DataRow)
 */
@Override
public DataCell getCell(DataRow row) {
    DataCell currCell = row.getCell(m_colIndex);
    // check the cell for missing value
    if (currCell.isMissing()) {
        return DataType.getMissingCell();
    }
    double currValue = ((DoubleValue)currCell).getDoubleValue();
    int binNr = 0;
    for (Double intervalBound : m_intervalUpperBounds) {
        if (currValue <= intervalBound) {
            return new IntCell(binNr);
        }
        binNr++;
    }
    // the value lies above the last upper bound
    return DataType.getMissingCell();
}
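To try out the lookup logic of getCell without a running KNIME installation, the same loop can be sketched in plain Java. This is an illustration under assumptions: BinLookup and binOf are hypothetical names, and an empty OptionalInt stands in for the missing cell that the real factory returns.

```java
import java.util.List;
import java.util.OptionalInt;

/** Hypothetical stand-alone version of the bin lookup; not KNIME code. */
final class BinLookup {

    private BinLookup() {
        // static helper only
    }

    /**
     * Returns the index of the first bin whose upper bound is not smaller
     * than the value, or an empty result if the value exceeds all bounds.
     * The bounds are expected in ascending order.
     */
    static OptionalInt binOf(double value, List<Double> upperBounds) {
        int binNr = 0;
        for (double bound : upperBounds) {
            if (value <= bound) {
                return OptionalInt.of(binNr);
            }
            binNr++;
        }
        // corresponds to the missing cell in the original method
        return OptionalInt.empty();
    }
}
```

Because only upper bounds are compared, a value that lies exactly on a bound falls into the lower of the two adjacent bins, and any value below the first bound is assigned to bin 0.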