Finding and Replacing Values in Datasets
You may sometimes need to find and replace values that lead to a specific undesired outcome.
For example, imagine you have a dataset with information on parcels that need to be posted, and any parcels that weigh more than 4kg must be sent by courier. If this rule is not respected, an "overweight, cannot be sent" outcome is produced. When applied to a new dataset, the Rule Based Control task corrects these errors in the dataset, consequently modifying the way the parcel will be sent.
The task performs this operation according to the weights specified for each outcome class. It modifies only the attributes that have been specified as modifiable, using the selected correction methods, for example by calculating the overall mean or median, or the mean or median for specified key attributes.
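The difference between an overall correction and a correction computed for key attributes can be sketched in plain Python. This is only an illustration under stated assumptions, not the way the Rule Based Control task is implemented: the records, the attribute names (`destination`, `weight_kg`, `method`), and the helper functions are hypothetical, and the weight is treated as modifiable purely to show a median-based correction.

```python
from statistics import median

MAX_POST_WEIGHT_KG = 4.0  # threshold taken from the parcel example

# Hypothetical parcel records; attribute names are illustrative only.
parcels = [
    {"id": 1, "destination": "domestic", "weight_kg": 2.0, "method": "post"},
    {"id": 2, "destination": "domestic", "weight_kg": 3.0, "method": "post"},
    {"id": 3, "destination": "domestic", "weight_kg": 6.5, "method": "post"},
    {"id": 4, "destination": "abroad", "weight_kg": 1.0, "method": "post"},
    {"id": 5, "destination": "abroad", "weight_kg": 3.5, "method": "post"},
    {"id": 6, "destination": "abroad", "weight_kg": 9.0, "method": "post"},
]


def is_undesired(row):
    """Hypothetical rule: an overweight parcel that is not sent by courier."""
    return row["weight_kg"] > MAX_POST_WEIGHT_KG and row["method"] != "courier"


def correct_with_median(rows, attribute, key=None):
    """Return corrected copies of the rows.

    Rows that trigger the undesired outcome get `attribute` replaced by the
    median of the acceptable rows, either overall (key=None) or restricted to
    rows that share the same value of the `key` attribute.
    """
    acceptable = [r for r in rows if not is_undesired(r)]
    corrected = []
    for row in rows:
        new_row = dict(row)  # never mutate the input records
        if is_undesired(row):
            pool = [
                a[attribute]
                for a in acceptable
                if key is None or a[key] == row[key]
            ]
            if pool:  # leave the value untouched if there is no reference data
                new_row[attribute] = median(pool)
        corrected.append(new_row)
    return corrected


overall = correct_with_median(parcels, "weight_kg")
by_key = correct_with_median(parcels, "weight_kg", key="destination")

for before, o, k in zip(parcels, overall, by_key):
    print(before["id"], before["weight_kg"], o["weight_kg"], k["weight_kg"])
```

With an overall median, every corrected parcel receives the same value. With `destination` acting as a key attribute, each parcel is corrected using only the acceptable parcels that share its destination.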
Prerequisites
you must have created a flow;
the required datasets must have been imported into the flow;
the ruleset and a dataset on which the rules can be applied must come before the Rule Based Control task in the flow. The ruleset can also be imported from a different Rulex flow.
Procedure
Drag the Rule Based Control task onto the stage.
Connect the task that contains the ruleset and the data you want to modify to the new task.
Double-click the Rule Based Control task. The left-hand pane displays a list of all the available attributes in the dataset, which can be ordered and searched as required.
Configure the options described in the table below.
Save and compute the task.
Rule Based Control options | |
Parameter Name | Description |
---|---|
Key attributes | Drag and drop here the attributes for which calculations will always be performed. Instead of dragging and dropping attributes manually, you can define them via a filtered list. |
Attributes that cannot be modified | Drag and drop here any attributes that must not be changed. Instead of dragging and dropping attributes manually, you can define them via a filtered list. Attributes specified here are never modified, even if they are also added to the Attributes to be corrected list. Any other attribute included in the ruleset can be modified. This list can include attributes that are NOT covered in the ruleset. |
Attributes to be corrected | Drag and drop here any attributes that can be modified. Any attributes included in the ruleset can be modified, unless they are included in the Attributes that cannot be modified list. This list can include attributes that are NOT covered in the ruleset. Instead of dragging and dropping attributes manually, you can define them via a filtered list. |
Undesired output weight | Specify a weight for each outcome class. Assign a higher value to the outcome you most want to avoid. |
Correct values for ORDERED attributes with | Select how you want to correct ordered values that have caused an undesired outcome class. Calculations can be performed on all values or, if attributes have been dragged and dropped into the Key attributes list, on the values for those attributes. Possible values are: Mean, Median, Mode, Minimum, Maximum, Minimum change (ordered attributes only). Instead of dragging and dropping attributes manually, you can define them via a filtered list. |
Correct values for TEMPORAL attributes with | Select how you want to correct temporal values that have caused an undesired outcome class. Calculations can be performed on all values or, if attributes have been dragged and dropped into the Key attributes list, on the values for those attributes. Possible values are: Mean, Median, Mode, Minimum, Maximum, Minimum change (ordered attributes only). Instead of dragging and dropping attributes manually, you can define them via a filtered list. |
Append results | If selected, the results of this computation will be added to the dataset. Otherwise they replace the results of previous computations. |
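
To give an intuition for how the Minimum change option in the table above might differ from a statistic such as Mean, the sketch below compares the two on a single overweight parcel. It rests on an assumed reading of the option name, namely that the value is moved by the smallest amount needed for the undesired rule condition to stop applying. The actual behaviour of the task may differ. The 4 kg limit comes from the parcel example, while the sample weights and the function names are hypothetical.

```python
from statistics import mean

UPPER_LIMIT_KG = 4.0  # the undesired outcome is assumed to fire when weight > 4 kg

# Hypothetical weights of parcels that do NOT trigger the undesired outcome.
acceptable_weights = [2.0, 3.0, 1.0, 3.5]
overweight = 6.5  # hypothetical value that triggers the undesired outcome


def minimum_change(value, upper_limit):
    """Assumed reading of 'Minimum change': the closest value to `value`
    for which the condition `value > upper_limit` no longer holds."""
    return min(value, upper_limit)


def mean_correction(reference_values):
    """Replace the offending value with the mean of the acceptable values."""
    return mean(reference_values)


min_fix = minimum_change(overweight, UPPER_LIMIT_KG)
mean_fix = mean_correction(acceptable_weights)

print("minimum change:", min_fix, "shift:", abs(overweight - min_fix))
print("mean:", mean_fix, "shift:", abs(overweight - mean_fix))
```

Under this assumption, the minimum-change correction stays as close as possible to the original value, whereas the mean pulls the value toward the centre of the acceptable data.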