replace function in the Factory


The replace function replaces occurrences of a given string within the values of a column with a new string.


Parameters

replace(column, oldvalue, newvalue, ntimes)

If you are using continuous attributes, check the Flow Execution Parameters.

Parameters

Description

column

The nominal attribute containing the strings to be replaced. If it is not nominal, it will be cast to nominal when the function is computed. The column parameter is mandatory.

oldvalue

The value to be replaced within the column. The oldvalue parameter is mandatory.

newvalue

The new value which replaces the oldvalue. The newvalue parameter is mandatory.

The newvalue parameter is case sensitive.

ntimes

If 0, or not specified, all the occurrences of the oldvalue within each of the attribute’s values will be replaced.

If a positive number is specified, it indicates how many occurrences of the oldvalue are replaced by the newvalue in each value of the attribute, counting from the beginning of the value.

If a negative number is specified, it indicates how many occurrences of the oldvalue are replaced by the newvalue in each value of the attribute, counting from the end of the value.
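
The ntimes rules above can be illustrated with a small Python sketch. This is not the Factory implementation: the function name replace_value is hypothetical, it works on a single value rather than a whole column, and it assumes that oldvalue is a non-empty string and that matching is case sensitive.

```python
def replace_value(value, oldvalue, newvalue, ntimes=0):
    """Emulate the documented ntimes behaviour on one value (illustrative only)."""
    text = str(value)  # non-nominal values are treated as strings (assumption)
    if ntimes == 0:
        # 0 or not specified: replace every occurrence
        return text.replace(oldvalue, newvalue)
    if ntimes > 0:
        # positive: replace the first ntimes occurrences, from the beginning
        return text.replace(oldvalue, newvalue, ntimes)
    # negative: split on the last |ntimes| occurrences and rejoin with newvalue
    return newvalue.join(text.rsplit(oldvalue, -ntimes))


print(replace_value("United-States", "t", "@"))      # all occurrences
print(replace_value("United-States", "t", "@", 2))   # first two occurrences
print(replace_value("United-States", "t", "@", -1))  # last occurrence only
```

Splitting from the right with rsplit is one simple way to model the "from the end" case; the real function may differ in how it handles overlapping matches.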


Example

The following example uses the Adult dataset.

Description

Screenshot

In this example, we want to replace the United-States values of the native-country attribute with USA.

Add a new attribute, called replace, and type the following formula first:

replace($"native-country",'United-States','USA')

As we didn’t specify the ntimes parameter, all the occurrences of United-States in the column are replaced.

Then, select the replace attribute again and delete the formula in the formula bar.

Then, type: replace($"native-country",'t','@',2)

As you can see, in each 'United-States' value the first two occurrences of 't' are replaced by @.
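
To make the expected results of the two formulas concrete, here is a rough Python equivalent applied to a few sample native-country values. Python's built-in str.replace stands in for the Factory function here, which is an assumption about equivalent behaviour; the three sample values and the variable names are chosen only for illustration.

```python
# A few sample values of the native-country attribute (illustrative subset)
native_country = ["United-States", "Cuba", "Puerto-Rico"]

# First formula: replace($"native-country",'United-States','USA')
first_formula = [v.replace("United-States", "USA") for v in native_country]

# Second formula: replace($"native-country",'t','@',2)
# Only the first two occurrences of 't' in each value are replaced.
second_formula = [v.replace("t", "@", 2) for v in native_country]

print(first_formula)   # ['USA', 'Cuba', 'Puerto-Rico']
print(second_formula)  # ['Uni@ed-S@ates', 'Cuba', 'Puer@o-Rico']
```

Note that the second formula affects every value containing 't', not only United-States.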