Ranking Rule Features and Values

The Feature Ranking task provides a graphical visualization of the importance of attributes within a class (attribute ranking) and of the values within specific attributes (value ranking).

The task can be used with any task that generates rulesets, such as the Logic Learning Machine (classification) task.


Prerequisites


Procedure

  1. Drag the Feature Ranking task onto the stage.

  2. Connect a task, which contains the ruleset you want to analyze, to the new task.

  3. Double-click the Feature Ranking task.

  4. Configure the options described in the Feature Ranking options table below.

Feature Ranking options

Parameter Name

Description

Attributes

The attributes present in the rules for each class, ordered according to the Order attributes by option.

The attribute selected here will determine which attribute is displayed in the Value Ranking plot.

Displayed relevances

You can choose whether the plots refer to:

  • all possible output values (Absolute), or

  • a single class (Relative).

This option is only available for nominal output values.

Enable multi-plot

If checked, a plot is displayed for each relevance selected in the Displayed relevances option.

This option is only available for nominal output values.

Interval for output

You can select an interval of output values to be included in the Attribute Ranking plot.

This option is only available for ordered output values.

Order attributes by

You can select the criterion for sorting the list of attributes. Possible choices are by:

  • Relevance (default)

  • Attribute Index

  • Name (alphabetical order)

  • Type (Nominal attributes first, then Discrete and Continuous).

This option is applied to the Attribute Ranking plot.

Order values by

You can select the criterion for sorting the values of each attribute. Possible choices are by:

  • Relevance

  • Value (ascending order for numerical attributes, or alphabetical for nominal attributes).

This option is applied to the Value Ranking plot.

Number of displayed attributes

Select the number of attributes you want to include in the Attribute Ranking plot.

Number of displayed values

Select the number of values you want to include in the Value Ranking plot.

Order by absolute values

If selected, relevances are ordered according to their absolute value. This is meaningful if you have decided to display relative relevances, which may also have negative values.

Order relevances using absolute values

Selected by default, this option displays the relevance of attributes ordered by their absolute values.

This option is applied to the Attribute Ranking plot.
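
The effect of the ordering options can be illustrated with a minimal Python sketch. It is not the implementation used by the task: the Attribute type, the rank_attributes function, the attribute types and the relevance values are all hypothetical and invented for illustration, and listing the most relevant attribute first is an assumption. The sketch only shows how sorting by signed relevance differs from sorting by absolute relevance when relative relevances are negative, and how the number of displayed attributes truncates the list.

```python
from typing import NamedTuple


class Attribute(NamedTuple):
    index: int        # position of the attribute in the dataset
    name: str
    kind: str         # "Nominal", "Discrete" or "Continuous"
    relevance: float  # hypothetical relative relevance, may be negative


# Order in which the Type criterion lists attribute kinds.
TYPE_ORDER = {"Nominal": 0, "Discrete": 1, "Continuous": 2}


def rank_attributes(attributes, order_by="relevance", use_absolute=True, limit=None):
    """Return the attributes in the order a ranking plot might list them."""
    if order_by == "relevance":
        # Assumption: the most relevant attribute comes first.
        if use_absolute:
            ranked = sorted(attributes, key=lambda a: abs(a.relevance), reverse=True)
        else:
            ranked = sorted(attributes, key=lambda a: a.relevance, reverse=True)
    elif order_by == "index":
        ranked = sorted(attributes, key=lambda a: a.index)
    elif order_by == "name":
        ranked = sorted(attributes, key=lambda a: a.name.lower())
    elif order_by == "type":
        ranked = sorted(attributes, key=lambda a: TYPE_ORDER[a.kind])
    else:
        raise ValueError(f"unknown ordering criterion: {order_by}")
    # "Number of displayed attributes" keeps only the first entries.
    return ranked if limit is None else ranked[:limit]


# Made-up types and relevances, for illustration only.
sample = [
    Attribute(0, "age", "Continuous", 0.20),
    Attribute(1, "workclass", "Nominal", -0.05),
    Attribute(2, "education", "Nominal", -0.35),
    Attribute(3, "hours-per-week", "Discrete", 0.10),
]

by_absolute = [a.name for a in rank_attributes(sample)]
by_signed = [a.name for a in rank_attributes(sample, use_absolute=False)]
top_two = [a.name for a in rank_attributes(sample, limit=2)]
by_type = [a.name for a in rank_attributes(sample, order_by="type")]
```

With these invented values, education has the largest absolute relevance and is listed first when absolute values are used, but it is listed last when the signed relevance is used, because its relevance is negative.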


Example

The following example uses the Adult dataset.

Description

Screenshot

Import the dataset with an Import from Text File task and split it into training (70%) and test (30%) sets with a Split Data task. Then add a Logic Learning Machine (classification) task to the flow and define the attribute Income as output.

Save and compute the task, then add a Feature Ranking task and open it.

We have included 14 attributes in the plot.

From the Attribute Ranking plot we can easily see that the education variable is the most important attribute in determining the output.

If we display only the relevances related to the output value <=50K, the plot changes noticeably and also contains negative values. A negative value indicates that the attribute is inversely correlated with that output value.

If the Order by absolute values option is selected, attributes are sorted according to the absolute value of relevance.

Click the Value Ranking tab to view the relevance of each interval for the selected attribute.

In the example, the relevances are displayed for the age attribute, in increasing order of importance.