Applying Python Scripts in Rulex Flows

The Python Bridge task lets you run a Python script that performs statistical calculations on Rulex data. The output can either overwrite the original dataset or be used to create a new dataset (such as clusters or advanced association structures).

This type of task may be useful when you already have statistical algorithms in Python, and want to use them in Rulex without having to rewrite any of the logic.

The Python script can either be entered directly in the task, or referenced through an external script file.

Python dictionary format

In Python, the data is represented as a dictionary that maps each column name (key) to the list of that column's values (value), for example:

{"age": [39, 45, ...], "workclass": ["Private", "State-gov", ...]}
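To make this layout concrete, the following minimal sketch builds a dictionary in the column-oriented format using only the standard library. The column names and the two values per column are taken from the shortened example above; the helper names column_mean and to_rows are hypothetical and are not part of Rulex.

```python
# Illustrative stand-in for a dataset in the column-oriented dictionary format.
# Keys are column names; each value is the list of that column's values.
dataset = {
    "age": [39, 45],
    "workclass": ["Private", "State-gov"],
}


def column_mean(data, column):
    """Return the arithmetic mean of a numeric column."""
    values = data[column]
    return sum(values) / len(values)


def to_rows(data):
    """Turn the column-oriented dictionary into a list of row dictionaries."""
    names = list(data)
    return [dict(zip(names, row)) for row in zip(*data.values())]


print(column_mean(dataset, "age"))  # 42.0
print(to_rows(dataset)[0])          # {'age': 39, 'workclass': 'Private'}
```

Because every column is a plain list, values at the same index across the lists belong to the same row, which is what to_rows relies on.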


Prerequisites

  • You must have created a flow.

  • Python 3 must be installed on the machine where Rulex is running.

  • IPython for Python 3 must be installed on the machine where Rulex is running (via pip install ipython).

  • Miniconda must be installed on the machine where Rulex Factory is running.

We strongly recommend using Python 3.10. After installing Miniconda, check that Python 3.10 is installed on your machine. If it is not, install it manually.
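The prerequisites above can be checked with a short script such as the sketch below, run with the interpreter you intend to use on the machine where Rulex is running. It relies only on the Python standard library; the function name check_prerequisites and the report keys are hypothetical, and the script only inspects the interpreter that executes it and the current PATH, not Rulex itself.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(recommended=(3, 10)):
    """Collect facts about the interpreter running this script."""
    version = tuple(sys.version_info[:2])
    return {
        "python_version": version,
        "is_recommended_version": version == tuple(recommended),
        # True if the IPython package can be imported by this interpreter
        "ipython_installed": importlib.util.find_spec("IPython") is not None,
        # Full path of the running Python executable
        "python_executable": sys.executable,
        # Full path of conda if it is on PATH, otherwise None
        "conda_executable": shutil.which("conda"),
    }


report = check_prerequisites()
for name, value in report.items():
    print(f"{name}: {value}")
```

The two executable paths in the report may also help you locate the files to provide in the Executable file area of the task.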


Procedure

  1. Drag the Python Bridge task onto the stage.

  2. Connect the task that contains the dataset on which you want to run the Python script to the Python Bridge task.

  3. Double-click the Python Bridge task.

  4. Configure the script options as described in the table below.

  5. Save and compute the task.

Python Bridge options

Name

Description

Advanced Configuration options

Select executable type

Select the executable type you want to use in the task. The options available are:

  • Conda

  • Python

The Configuration tab changes according to the selected executable.

Executable file area

Drag the Python executable file onto this area if you have chosen Python as the executable type, or the Conda executable file if you have chosen Conda. Alternatively, click the Select button to browse your machine for the file.

Configuration tab

Use user environment

Select this checkbox if you want to use the environment created by Rulex. This option is available only if Conda has been chosen as the executable type.

Select Conda Environment

In this drop-down list, choose the Conda environment you want to use in the task. This option is available only if Conda has been chosen as the executable type.

Console tab (available for both Python and Conda executable files)

Connect Python Bridge

Click this button to open the Interactive Console, where you can write Python code.

Interactive Console

Here you can write the code which will be executed within the task.

The dataset is made available to the task as r_dataset, a variable holding a Python dataframe. Use this name whenever your code in the Console or Script tab refers to the dataset.

The flow variables are made available as r_vars.

Save History

Click this button to save the changes made to the code. The saved code then appears in the Script tab.

Clear History

Click on this button to delete all the changes made to the code.

Script tab (available for both Python and Conda executable files)

Python Editor

Here you can view and edit the Python code, just as in the Interactive Console in the Console tab.

Execute Code

Click this button to execute the code and view the results in the Last Execution Output area.

Save Code

This button saves the code written in the Script area.

Last Execution Output

Here you can view the output of the last code execution.
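As an illustration of how the r_dataset and r_vars names described under Interactive Console might be used in a script, consider the minimal sketch below. It is not a definitive implementation: it assumes that r_dataset is either a pandas-style dataframe (exposing to_dict) or a dictionary of column lists, and that r_vars behaves like a mapping from variable names to values. Neither assumption is confirmed by this page. The stand-in values, the flow variable name min_age, and the helper functions as_columns and filter_rows are hypothetical and exist only so that the sketch also runs outside Rulex.

```python
def as_columns(dataset):
    """Return the dataset as {column name: list of values}."""
    if hasattr(dataset, "to_dict"):
        # pandas-style dataframe: orient="list" yields column -> list of values
        return dataset.to_dict(orient="list")
    # Assumed fallback: the dataset is already a dictionary of columns
    return {name: list(values) for name, values in dataset.items()}


def filter_rows(columns, column, minimum):
    """Keep only the rows whose value in `column` is at least `minimum`."""
    keep = [i for i, value in enumerate(columns[column]) if value >= minimum]
    return {name: [values[i] for i in keep] for name, values in columns.items()}


# Outside Rulex the task-provided names do not exist, so define stand-ins.
try:
    r_dataset
except NameError:
    r_dataset = {"age": [39, 45], "workclass": ["Private", "State-gov"]}
try:
    r_vars
except NameError:
    r_vars = {"min_age": 40}  # hypothetical flow variable

columns = as_columns(r_dataset)
result = filter_rows(columns, "age", r_vars["min_age"])
print(result)  # with the stand-ins: {'age': [45], 'workclass': ['State-gov']}
```

Converting to plain column lists first keeps the rest of the logic independent of how the dataset is actually exposed, which is useful while that detail is still being verified in your own installation.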