textExtract function in the Factory

The textExtract function returns the substring that runs from a specified starting position to a specified ending position.


Parameters

textExtract(column,startpos,endpos)

If you are using continuous attributes, check the Flow Execution Parameters.

Parameter

Description

column

The nominal attribute from which the substring is extracted. The column parameter is mandatory.

If the attribute is not nominal, it is cast to nominal when the function is computed.

startpos

The position of the character where the extraction starts (the first character is at position 1). It can be either a fixed value or an integer attribute. The startpos parameter is mandatory.

endpos

The position of the character where the extraction ends. It can be either a fixed value or an integer attribute. The endpos parameter is mandatory.


Example

The following example uses the Adult dataset.

In this example, we want to retrieve a substring of the values contained in the fnlwgt attribute.

To do this, we use the following formula:

textExtract($"fnlwgt",3,5)

Here, 3 is the starting position of the substring we need, and 5 is the ending position.
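To make the position arithmetic concrete, the following Python sketch emulates the behaviour described on this page. It is not the Factory implementation. The helper name text_extract and the sample value 77516 are only illustrative. The sketch assumes that positions are 1-based (as stated for startpos) and that the character at endpos is included in the result, which this page does not state explicitly. It also rejects a startpos below 1 or an endpos below startpos, because the behaviour in those cases is not documented here.

```python
def text_extract(value, startpos, endpos):
    """Illustrative emulation of textExtract (assumed semantics, see above)."""
    # Non-nominal values are cast to text first, mirroring the cast to nominal.
    text = str(value)
    # Undocumented edge cases are rejected in this sketch rather than guessed.
    if startpos < 1 or endpos < startpos:
        raise ValueError("expected 1 <= startpos <= endpos")
    # Convert 1-based inclusive positions to Python's 0-based half-open slice.
    return text[startpos - 1:endpos]


# Equivalent of textExtract($"fnlwgt",3,5) applied to one sample value.
print(text_extract(77516, 3, 5))  # prints 516
```

Under these assumptions, characters 3, 4 and 5 of the value are returned, so a five-digit fnlwgt value yields its last three digits.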