Analyzing Rules in the Rule Manager

The Rule Manager allows you to inspect, manipulate and optimize a set of rules.

It displays the rules and their conditions in a spreadsheet layout.

It can be used with any of the following LLM tasks that generate rulesets:

The task is divided into two tabs, the Rules tab and the History tab:

  1. In the Rules tab, you can visualize the rules, add new ones and sort or filter them.

  2. In the History tab, you can visualize all the operations performed and cancel them.


Prerequisites


Procedure

  1. Drag and drop the Rule Manager task onto the stage.

  2. Connect the task that contains the ruleset you want to analyze to the new task.

  3. Double-click the Rule Manager task. The left-hand pane displays how many rules have been generated and the percentage of these rules that is currently displayed in the ruleset. This percentage may change if modifications are made, such as applying filters or displaying only the rules that contain selected attributes.

  4. Filter results as described in the table below.

  5. Save and compute the task.


Rule Manager Options (Rules tab)

The Rules tab is divided into three columns, which will be analyzed starting from the left:

Parameter

Description

Filtering/Sort metadata

Select the parameters you want to filter by dragging them onto the Pre-filter area. The Pre-filter area works in the same way as in the Data Manager. For details of all the available filter options, see the Pre-filtering Data in the Data Manager page.

Rules attributes

Visualize the input and output attributes.

Order by

Sort attributes by attribute (as in the dataset), name, type, ignored or role.

Pre-filter

Filter the ruleset that will be used in the output by:

  • #Conditions - specify the number of conditions each rule can contain.

  • Covering - specify the covering percentage that filtered rules must respect.

  • Error - specify the error percentage that filtered rules must respect.

  • Attribute - specify the input attribute(s) by which to filter the conditions. Only the conditions containing the specified attribute(s) will be displayed.

  • Outputs - specify the output attribute(s) by which to filter the outputs. Only the outputs containing the specified attribute(s) will be displayed.

  • Rsample

  • Wsample

Filter conditions

Filter conditions so that any attributes that have not been selected in the Select rules containing lists are removed.

Sort conditions by

Sort conditions according to their attributes, covering or error values.

Clear

Click Clear to empty the Operations table. This removes all the pre-filter operations created since Make persistent was last selected, together with their effects on the dataset.

Make persistent

Click Make persistent to apply all the query operations and permanently change the dataset.
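
The Pre-filter criteria in the table above can be pictured with a minimal Python sketch. It is not the product's implementation: the Rule record, its field names and the prefilter function are hypothetical, and it assumes that the covering value acts as a minimum and the error value as a maximum, which this page does not state explicitly. It also shows how the percentage of displayed rules changes once a filter is applied.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    # Hypothetical record; field names are illustrative only.
    conditions: list   # one string per condition, e.g. "age > 30"
    output: str        # output value of the rule
    covering: float    # covering percentage (0-100)
    error: float       # error percentage (0-100)


def prefilter(rules, max_conditions=None, min_covering=0.0, max_error=100.0):
    """Keep only the rules that satisfy every supplied criterion.

    Assumption: covering is a lower bound and error is an upper bound.
    """
    kept = []
    for rule in rules:
        if max_conditions is not None and len(rule.conditions) > max_conditions:
            continue
        if rule.covering < min_covering or rule.error > max_error:
            continue
        kept.append(rule)
    return kept


def displayed_percentage(kept, rules):
    """Share of the generated rules that is still displayed after filtering."""
    return 100.0 * len(kept) / len(rules) if rules else 0.0


# Made-up rules, for illustration only.
rules = [
    Rule(["age > 30", "hours <= 40"], "female", 40.6, 2.0),
    Rule(["age <= 30"], "male", 10.0, 8.0),
    Rule(["a > 1", "b > 2", "c > 3"], "male", 50.0, 1.0),
]
kept = prefilter(rules, max_conditions=2, min_covering=20.0, max_error=5.0)
print(len(kept), f"{displayed_percentage(kept, rules):.1f}%")
```

With these made-up rules only the first one passes all three criteria, so one rule out of three remains displayed.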


Results

The rule analysis is divided into three separate spreadsheets, located in the Rules tab:

 

Rules spreadsheet

The Rules spreadsheet, which shows the generated ruleset, contains the following columns:

  • #Cond: the number of conditions in the rule.

  • Output: the output if that rule is matched.

  • Cond n: the n-th condition.

In this spreadsheet you can modify rules and conditions:

  • To delete a rule, select its row in the table and either click the minus icon above the spreadsheet or right-click and select Delete selected rules.

  • To add a new rule, either click the plus icon above the spreadsheet or right-click within the spreadsheet and select Create rule. Once you have specified the output value for your rule, a blank rule is added to the table, to which you then need to add/append conditions.

  • To delete a condition, select it in the spreadsheet, then right-click and select Delete selected condition or simply press Delete. The conditions to be removed can belong to different rules.

  • To edit a condition, either select it in the spreadsheet, right-click and select Edit condition, or simply double-click it. The changes you can make depend on whether the condition is an ordered or nominal condition.

  • To add or append a new condition to a rule, double-click an empty cell in the row, or right-click and select Append condition, then select an attribute from the dialog box.

Covering spreadsheet

The Covering spreadsheet provides additional information on the covering of each rule in the following columns:

  • #Patt.: the number of patterns in the training set with the same output class of the rule.

  • Covering: the percentage of patterns (with the output class of the rule) matched by this rule in relation to the total number of patterns of that class.

  • w/o Cond n: the covering gain that would be obtained for the specific rule by removing the n-th condition.
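
The definitions above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: patterns are represented as dictionaries, conditions as predicate functions, and the names covering and covering_gain_without are hypothetical rather than part of the product. The data are invented toy rows, not the Adult dataset.

```python
def covering(patterns, conditions, output, target):
    """Percentage of the patterns of class `output` matched by all conditions."""
    same_class = [p for p in patterns if p[target] == output]
    if not same_class:
        return 0.0
    matched = sum(all(cond(p) for cond in conditions) for p in same_class)
    return 100.0 * matched / len(same_class)


def covering_gain_without(patterns, conditions, output, target, n):
    """Covering gained by dropping the condition at index n (0-based)."""
    reduced = conditions[:n] + conditions[n + 1:]
    return (covering(patterns, reduced, output, target)
            - covering(patterns, conditions, output, target))


# Toy patterns, invented for illustration.
patterns = [
    {"age": 25, "hours": 30, "sex": "female"},
    {"age": 45, "hours": 30, "sex": "female"},
    {"age": 25, "hours": 50, "sex": "female"},
    {"age": 45, "hours": 50, "sex": "female"},
    {"age": 25, "hours": 30, "sex": "male"},
]
# Hypothetical rule: IF age < 40 AND hours <= 40 THEN sex = female
conditions = [lambda p: p["age"] < 40, lambda p: p["hours"] <= 40]

base = covering(patterns, conditions, "female", "sex")
gain_1 = covering_gain_without(patterns, conditions, "female", "sex", 0)
gain_2 = covering_gain_without(patterns, conditions, "female", "sex", 1)
print(base, gain_1, gain_2)
```

In this toy set the rule matches one of the four female patterns, and dropping either condition lets it match two of them, so each removal gains 25 percentage points of covering.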

Error spreadsheet

The Error spreadsheet provides additional information on the error scored by each rule in the following columns:

  • #Patt.: the number of patterns in the training set with the output class different from the output class of the rule.

  • Error: the percentage of patterns (with the output class different from that of the rule) matched by this rule in relation to the total number of patterns where the output class is different from the class of the rule.

  • w/o Cond n: the error increase that would be obtained for the specific rule by removing the n-th condition.
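
As a hedged sketch of the Error columns described above, the following Python snippet computes the error of a rule and the increase caused by dropping one condition. The representation (dictionaries for patterns, predicate functions for conditions) and the names error_rate and error_increase_without are assumptions made for illustration, and the rows are invented toy data.

```python
def error_rate(patterns, conditions, output, target):
    """Percentage of the patterns NOT of class `output` matched by the rule."""
    other_class = [p for p in patterns if p[target] != output]
    if not other_class:
        return 0.0
    matched = sum(all(cond(p) for cond in conditions) for p in other_class)
    return 100.0 * matched / len(other_class)


def error_increase_without(patterns, conditions, output, target, n):
    """Extra error incurred by dropping the condition at index n (0-based)."""
    reduced = conditions[:n] + conditions[n + 1:]
    return (error_rate(patterns, reduced, output, target)
            - error_rate(patterns, conditions, output, target))


# Toy patterns, invented for illustration.
toy_patterns = [
    {"age": 25, "hours": 30, "sex": "male"},
    {"age": 25, "hours": 50, "sex": "male"},
    {"age": 45, "hours": 30, "sex": "male"},
    {"age": 45, "hours": 50, "sex": "male"},
    {"age": 25, "hours": 30, "sex": "female"},
]
# Hypothetical rule: IF age < 40 AND hours <= 40 THEN sex = female
toy_rule = [lambda p: p["age"] < 40, lambda p: p["hours"] <= 40]

rule_error = error_rate(toy_patterns, toy_rule, "female", "sex")
increase_1 = error_increase_without(toy_patterns, toy_rule, "female", "sex", 0)
print(rule_error, increase_1)
```

Here the rule wrongly matches one of the four male patterns, and removing the first condition makes it match two of them, so the error rises by 25 percentage points.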


How to read the Covering and the Error spreadsheets - example

The following example uses the Adult dataset.

For each generated rule, the spreadsheet shows the number of patterns (that is, the number of rows with the same output as the rule) and the associated covering value. The following columns (w/o Cond 1, w/o Cond 2 and so on) indicate the covering gain obtained if the corresponding condition is removed. Each time a condition is removed, the other covering values must be calculated again, because they refer to the rule with its initial conditions.

In this example, Rule 1 has a covering of 40.6%, meaning that the rule matches 40.6% of the rows with the output sex=female. The w/o Cond 1 value is 2.53%, while the w/o Cond 2 value is 54.34%.

If we remove Cond 1, the covering becomes 40.6% + 2.53% = 43.13%, or roughly 43%. The w/o Cond 2 value is then recalculated as 100% - 43% = 57% (because only one condition is left after Cond 1 is removed).

If we remove Cond 2, the covering becomes 40.6% + 54.34% = 94.94%, or roughly 95%. The w/o Cond 1 value is then recalculated as 100% - 95% = 5% (because only one condition is left after Cond 2 is removed).

The logic explained above is valid for rules containing two or more conditions.

In row 2, which corresponds to Rule 2, there is only one condition, so the sum of the covering and the w/o Cond 1 value is 100%.
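
The arithmetic for Rule 1 can be checked with a few lines of Python. This is only a sketch of the reasoning in this example for a rule with two conditions; the function name recompute_after_removal is hypothetical, and the input values are the ones quoted above.

```python
def recompute_after_removal(covering, gain):
    """For a two-condition rule, return the covering after one condition is
    removed and the recalculated w/o value of the single condition left."""
    new_covering = covering + gain
    # With one condition left, covering plus its w/o value adds up to 100%.
    remaining_gain = 100.0 - new_covering
    return new_covering, remaining_gain


rule_1_covering = 40.6
rule_1_gains = {"Cond 1": 2.53, "Cond 2": 54.34}

for removed, gain in rule_1_gains.items():
    new_cov, remaining = recompute_after_removal(rule_1_covering, gain)
    print(f"remove {removed}: covering {new_cov:.2f}%, "
          f"remaining w/o value {remaining:.2f}%")
# remove Cond 1: covering 43.13%, remaining w/o value 56.87%
# remove Cond 2: covering 94.94%, remaining w/o value 5.06%
```

The unrounded results differ slightly from the rounded percentages used in the text, because the text rounds the new covering to a whole number before subtracting it from 100%.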

This logic can be applied to the Error spreadsheet as well.