Column Expressions for Data Manipulation

The Column Expressions node is useful because it can perform multiple data manipulation tasks at once. You can replace combinations of other data manipulation nodes, such as the String Manipulation, Math Formula, and Rule Engine nodes, with this single node.

Here you will learn how to configure the Column Expressions node by writing one or more expressions for data manipulation.

You can find the reference workflow on the KNIME Hub.

Exercise

Read the sales.csv dataset.

  • Write an expression that extracts the first three letters of country names and converts them to upper case letters. Append a new column and name it “Country_Code”.
  • Write an expression that multiplies the sales amount by the conversion rate. Replace the “amount” column, but change its type to double.
  • Write an expression that assigns the value “N” to the missing values in the “card” column. Replace the “card” column.
Solution

Extract the first three letters of the country names using this expression:

substr(column("country"), 0, 3)

Use the upperCase() function to convert the substring to upper case letters.

Disable the “Replace Column” field and write “Country_Code” in the “Output_Column” field.

Multiply “amount” and “conversion” columns using this expression:

column("amount")*column("conversion")

Select “Number (double)” in the menu in the “Type” field. 

Assign the value “N” to missing values in the “card” column using this expression:

a = column("card")

if(isMissing(column("card")))

a = "N"

You can download the solution workflow here.


* The link will open the workflow directly in KNIME Analytics Platform (requirements: Windows; KNIME Analytics Platform must be installed with the Installer version 3.2.0 or higher)

LinkedInTwitterShare