catNames Function in the Factory

The catNames function searches for values in the specified attributes and returns the headers of the attributes where the values were found. If more than one attribute matches, the corresponding headers are concatenated into a single string, divided by the separator.

This function is useful for finding values in very large datasets.


Parameters

catNames(indatt, values, separator, negate)

Parameter

Description

indatt

The attributes whose values we want to search in. Multiple attributes must be included in square brackets, such as ["hours-per-week","education-num"]. The indatt parameter is mandatory.

The dollar sign is not required before the names of attributes.

values

The values which will be searched for in the attributes indicated in the indatt parameter. The values parameter is mandatory.

separator

The separator placed between the returned headers. Symbols or strings can be used, for example separator = "and".

The default value is “-”.

negate

If set to True, the function returns the headers of the attributes where the specified value is not present, instead of the headers of the attributes where it is present.

The default value is False.
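
The behavior described by these parameters can be sketched in Python for a single row. This is only an illustration, not the Factory implementation: the function name cat_names, the row dictionary, the support for a list of values, and the empty string returned when nothing is selected are all assumptions made for the example.

```python
def cat_names(row, indatt, values, separator="-", negate=False):
    """Hypothetical Python stand-in for catNames, applied to one row (a dict)."""
    # Accept a single attribute name or a list of names.
    attributes = [indatt] if isinstance(indatt, str) else list(indatt)
    # Assumption: 'values' may be a single value or a collection of values.
    wanted = list(values) if isinstance(values, (list, tuple, set)) else [values]

    selected = []
    for name in attributes:
        found = row[name] in wanted
        # Keep the header when the value is found, or when it is
        # not found and negate is True.
        if found != negate:
            selected.append(name)

    # Assumption: an empty string is returned when no header is selected.
    return separator.join(selected)


# Made-up row for illustration only.
row = {"hours-per-week": 40, "education-num": 13}
print(cat_names(row, ["hours-per-week", "education-num"], 13))
print(cat_names(row, ["hours-per-week", "education-num"], 13, negate=True))
```

With this row, the first call selects education-num and the second, negated call selects hours-per-week.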


Example

The following example uses the adult dataset.

Description


In the adult dataset, we are looking for the value “13” in the two attributes hours-per-week and education-num.

We have added a new attribute, called Concatenate, to contain the results.

The formula is therefore: catNames(["hours-per-week","education-num"],13)

We then change the separator between the returned headers from the default “-” to “,”.

The formula will now be: catNames(["hours-per-week","education-num"],13, separator=",")

In this last example, we want to return the headers of the attributes where the value 13 is not present.

The formula will now be: catNames(["hours-per-week","education-num"],13, separator=",", negate=True)
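
To make the three formulas concrete, the following Python sketch applies an equivalent rule to four made-up rows. The rows are not taken from the adult dataset, the helper headers_for is hypothetical, and returning an empty string when no header is selected is an assumption about the behavior.

```python
ATTRS = ["hours-per-week", "education-num"]


def headers_for(row, value, separator="-", negate=False):
    # Hypothetical helper mirroring the catNames formulas in this example.
    # A header is kept when (value present) differs from negate.
    return separator.join(
        name for name in ATTRS if (row[name] == value) != negate
    )


# Made-up rows covering every combination of match and no match.
rows = [
    {"hours-per-week": 13, "education-num": 13},
    {"hours-per-week": 40, "education-num": 13},
    {"hours-per-week": 13, "education-num": 9},
    {"hours-per-week": 40, "education-num": 9},
]

# One tuple per row: default separator, "," separator, "," with negate.
results = [
    (
        headers_for(r, 13),
        headers_for(r, 13, separator=","),
        headers_for(r, 13, separator=",", negate=True),
    )
    for r in rows
]

for r, res in zip(rows, results):
    print(r, res)
```

In the first row both attributes match, so the default separator would produce hours-per-week-education-num. Because the attribute names themselves contain hyphens, this result is hard to read, which is one reason to switch the separator to “,” as in the second formula.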