Using Similar Items Detector to solve Association Problems
Rulex generates description-based and sales-based replacement rules with the Similar Items Detector task.
This task uses description-based matching, which can be used with newly introduced items and helps solve cold start problems.
Prerequisites
you must have created a flow;
the required datasets must have been imported into the flow;
the data used for the analysis must have been well prepared;
a Frequent Itemsets Mining task must be present in the flow and provide input data for the Similar Items Detector.
The results of the task are displayed in two separate tabs:
The Replacement rules tab displays the generated item sets, where:
Rule Replacement ID: the sequential ID number for replacement rules.
Replaced item ID: the IDs of the replaced items.
Replacing item ID: the IDs of the replacing items.
The Results tab displays details on the execution of the analysis, where:
Task Identifier: the ID code for the task, internally used by the Rulex engine.
Task Name: simply the name of the task.
Elapsed time (sec): the time required for the latest computation (in seconds).
Number of generated replacement rules: the number of replacement rules generated by the task.
Drag the Similar Items Detector task onto the stage.
Connect a task that contains frequent itemsets to the new task.
Double-click the Similar Items Detector task. The left-hand pane displays a list of all the available attributes in the dataset, which can be ordered and searched as required.
To generate description-based replacement rules, click on the Text based matching tab and configure the options as described in the table below.
To generate sales-based replacement rules, click on the Sales based matching tab and configure the options as described in the table below.
Save and compute the task.
Similar Items Detector options
Text based matching options
Select the attribute that represents the category from the drop-down list. This can be used to match only descriptions that belong to the same category.
Select the attribute that represents the description from the drop-down list, which will be used for text matching.
Select how words are separated from one of the following possibilities:
Minimum word length
Words that are shorter than the value entered here are not used for text matching. This helps to eliminate short, common words such as "the", "a", "one" and "at".
Minimum unadjusted similarity cosine
The minimum similarity of pure text matching, without considering Preferential requirements attributes.
Entering 1 means the text must be identical; entering 0 means no match is required.
Case sensitive matching
If selected, upper and lower case are taken into consideration when matching text.
Item key attributes
Drag and drop the nominal attributes that uniquely identify the item from the Attributes list. Instead of dragging and dropping attributes manually, you can define them via a filtered list.
Preferential requirements attributes
Drag and drop the attributes that influence the similarity score. When these attributes match, a weight is added to the similarity score. This weight is defined in the Preferential requirements weights.
These attributes could, for example, define brand, packaging or size.
Instead of dragging and dropping attributes manually, you can define them via a filtered list.
Ignored char list
Select the characters you want to eliminate from text matching.
Preferential requirements weights
The weight awarded to matching Preferential requirements attributes.
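The way the text based matching options interact can be illustrated with a minimal Python sketch. It is not the Rulex implementation: the function names, the bag-of-words cosine and the rule that one weight is added per matching preferential attribute are all assumptions made for illustration. Items are assumed to be plain dictionaries with a "description" key.

```python
import math
from collections import Counter


def tokenize(text, min_word_length=3, case_sensitive=False, ignored_chars=".,;:-/"):
    """Split a description into words (hypothetical helper, whitespace separator assumed)."""
    if not case_sensitive:
        text = text.lower()
    # Replace each ignored character with a space so it never glues two words together.
    for ch in ignored_chars:
        text = text.replace(ch, " ")
    # Drop words shorter than the minimum word length.
    return [w for w in text.split() if len(w) >= min_word_length]


def cosine_similarity(words_a, words_b):
    """Cosine between two bag-of-words count vectors; 0.0 if either is empty."""
    a, b = Counter(words_a), Counter(words_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def adjusted_similarity(item_a, item_b, preferential_attrs, weight, min_cosine, **tokenize_options):
    """Return None when the unadjusted cosine is below min_cosine.

    Otherwise return the cosine plus `weight` for every preferential attribute
    with the same value in both items (assumed scoring rule).
    """
    words_a = tokenize(item_a["description"], **tokenize_options)
    words_b = tokenize(item_b["description"], **tokenize_options)
    base = cosine_similarity(words_a, words_b)
    if base < min_cosine:
        return None
    matches = sum(
        1
        for attr in preferential_attrs
        if item_a.get(attr) is not None and item_a.get(attr) == item_b.get(attr)
    )
    return base + weight * matches


# Example with made-up items: two of three words are shared and the brand matches.
a = {"description": "Sparkling Cola bottle", "brand": "Acme"}
b = {"description": "Cola bottle, glass", "brand": "Acme"}
score = adjusted_similarity(a, b, ["brand"], weight=0.1, min_cosine=0.5)
print(score)
```

In this sketch the minimum cosine is checked before any preferential weight is added, which mirrors the description of the Minimum unadjusted similarity cosine as the similarity of pure text matching.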
Sales based matching options
Takes also sales data into account
Select this option to include sales data in the task execution.
Minimum alternativeness coefficient
The degree of alternativeness between the purchase of two items:
If a pair of items does not reach the Minimum alternativeness coefficient, the corresponding replacement rule is discarded.
Minimum volume replacement score
The minimum percentage of orders in which a replaced item is expected to be replaceable by the replacing item. Replacement rules that do not satisfy this minimum threshold are discarded.
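The effect of the Minimum volume replacement score threshold can be sketched as a simple filter. The snippet below assumes that a score has already been computed for each candidate rule and that it is expressed as a percentage between 0 and 100. How Rulex computes the score is not described here, so the ReplacementRule class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReplacementRule:
    replaced_item_id: str
    replacing_item_id: str
    # Hypothetical precomputed score, assumed to be a percentage (0-100).
    volume_replacement_score: float


def filter_by_volume_score(rules, min_volume_replacement_score):
    """Keep only the rules whose score reaches the minimum threshold."""
    return [r for r in rules if r.volume_replacement_score >= min_volume_replacement_score]


# Made-up candidate rules for illustration.
candidates = [
    ReplacementRule("A100", "A200", 72.0),
    ReplacementRule("A100", "A300", 35.5),
]
kept = filter_by_volume_score(candidates, min_volume_replacement_score=50.0)
print([(r.replaced_item_id, r.replacing_item_id) for r in kept])
```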