Pre-filtering Data in the Data Manager

Setting Pre-filters allows you to select a subset of cases that satisfy a condition, before grouping, sorting or applying a formula to them.

There are several icons displayed in the pre-filter, which are useful to understand before applying filters:

  • The filter type icon: located at the left side of the Pre-filter area, it varies according to the filter type chosen. Possible options are:

    • And: used in filter conditions to specify that all filter conditions must be satisfied.

    • Or: used in filter conditions to specify that at least one filter condition must be satisfied.

  • Make persistent: located on the right side of the query panel. Click on it to effectively apply all the query operations and permanently change the dataset.

  • Clear: located on the right side of the query panel. Click on it to clear the query table, removing all the created query operations, performed since the last time Make persistent was selected, and their effects to the dataset.


Prerequisites

  • You must have created a flow;

  • You must have linked the Data Manager to a task which contains the data to work on.


Procedure

  1. Select whether you want to satisfy all the conditions simultaneously (and) or only one of them (or). This choice can be made also after having set the attribute’s filters by clicking on the and/or button in the Pre-filter box.

  2. Drag your attributes onto the Pre-filter area.

  3. Set your pre-filter options, as described in the filter options table below.

  4. Click Apply.

  5. Click either:

    1. Make persistent to save your results and make all the query operations permanent.
      The pre-filter cell is automatically cleared from the query operations performed, but the query operations are still applied to the table.

    2. Clear to remove query operations from the query panel: it cancels all the query operations that haven’t yet been made persistent.

  6. Save and Compute the task.

Filter by:

Details

Values

This option allows you to set conditions on the values of the selected attribute.
Depending on the attribute type (nominal or ordinal) different filter options are displayed.

  • In ordinal attributes, you must start defining your filter by selecting a Right Operator and then fill the Right Hand Side box by manually entering a value or choosing an average value. If necessary, perform the same operations in the Left Operator and Left Hand Side box.

  • In nominal attributes, the filter panel has a different layout, which fits better the attribute’s characteristics:

    • you can choose an Operator (in if you want to keep the selected values or not in if you want to remove the selected values);

    • you can speed up filtering operations by using the Check all (to select all the values), Uncheck all (to remove all selections) and Revert (to invert the selection) buttons.

    • through the Search text box, you can look for specific values, which must be selected or unselected in the list. The search can also be Case Sensitive, by selecting the checkbox. This operation is very useful if we have a lot of values and we need to perform filter operations only in specific ones.

Attributes

This option allows you to filter your values based on its relationship with another attribute.

You can filter values by selecting the operator you want to use to filter your data, then select the attribute from the drop-down list. For ordinal values you can also select a range by also specifying the left-hand operator.

Code

This option allows you to filter your values via a custom formula or via variables. First, you need to select the required operator to filter your data.

When filtering via a formula, enter the formula and open a bracket, following the syntax below:

  • When specifying the attributes, the syntax is $"attributename"

  • When specifying an attribute value, the syntax is 'value'

To specify a range of ordinal values, successively select a left-side operator and specify the required values in the same way.

You can decide whether to consider the None values or not by clicking on the corresponding checkbox.

To know which formulas can be used in the code filter go to the Formulas and Functions page.

Remember that it is mandatory to follow the formula’s syntax (including capital letters), otherwise an error is returned.

You can also use variables to filter your data, by writing a @ followed by the variable’s name.

  • To view a configured filter, right-click the attribute in the pre-filter area and select Open.

  • To disable right-click the attribute in the pre-filter area and select Disable. If the filter is disabled, the attribute will result in a blurred image and the dataset won’t be filtered.
    To re-enable the filter, right-click the attribute in the pre-filter area and select Enable - this is useful if you want to enable only the selected querying operation.

  • To delete a filter, right-click the attribute in the pre-filter area and select Delete.

  • To build complex filters with the Indent Selection feature see the page Building complex pre-filters & post-filters in the Factory.


Operators available for ordered attributes

Operator

Meaning

==

Equal to

!=

Not equal to

<

Less than

<=

Less than or equal to

>

Greater than

>=

Greater than or equal to


Example 1 - Pre-filter - Values

The following example uses the Supermarket_sales dataset.

Step

Screenshot

  • Drag the attribute onto the pre-filter area.

  • In the example here, we decided to filter the Product line attribute.

  • Set your filter by checking the values you need and helping yourself with the Check all, Uncheck all, Revert options that are available by hovering over the plus button.

  • In the example, we want to filter the Home and lifestyle value, None Values excluded.

This is the result after you have pre-filtered the dataset. All the values containing Home and lifestyle are shown.


Example 2 - Pre-filter - Attributes

The following example uses the E-commerce shipping data dataset.

Step

Screenshot

  • Drag the attribute on the pre-filter area. Select Attributes and set the filtering parameters. Then, click Apply.

  • In the example here, we decided to filter the Customer_care_calls attribute and display only the rows where they are more than the Prior_purchases.

This is the final result after you have pre-filtered the Customer_care_calls attribute.


Example 3 - Pre-filter - Code

This example uses the Adult dataset.

Step

Screenshot

  • Drag the attribute you want to pre filter onto the pre-filter area. Select Code and set the filtering parameters. Then, click Apply.

  • We decided to filter the age attribute’s values that are less than their mean. We chose < as right operator and then we wrote mean($"age") in the Right Hand Side text box.

This is the final result after you have pre-filtered the age attribute.