leafDistance function in the Factory

The leafDistance function calculates the distance, measured in number of edges, from each node in the son attribute to its leaf. A leaf is the last node of a branch.

An edge is the link from one node to another.
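To make the definition concrete, the following Python sketch counts the fewest edges from each node to a leaf. It is an illustration of the concept only, not the Factory implementation: the helper name leaf_distance, the sample rows, and the assumption that the graph has no cycles are all hypothetical.

```python
from collections import defaultdict

def leaf_distance(edges):
    """Return {node: fewest edges from node to a leaf} for an acyclic directed graph."""
    children = defaultdict(list)
    for parent, son in edges:
        children[parent].append(son)

    cache = {}

    def dist(node):
        if node not in cache:
            kids = children.get(node, [])
            # A leaf has no outgoing edges, so its distance is 0.
            cache[node] = 0 if not kids else 1 + min(dist(k) for k in kids)
        return cache[node]

    nodes = sorted({n for edge in edges for n in edge})
    return {n: dist(n) for n in nodes}

# Hypothetical (parent, son) rows: Plant reaches Store directly and via Warehouse.
edges = [("Plant", "Warehouse"), ("Warehouse", "Store"), ("Plant", "Store")]
distances = leaf_distance(edges)
print(distances)  # Store is the leaf (0); Plant and Warehouse are one edge away (1)
```

Store has no outgoing edge, so it is the leaf. Plant has two routes to it, and the sketch keeps the shorter one.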


Parameters

leafDistance(parent, son, group, whichpath, separator, weights, operator)

Parameter

Description

parent

The attribute containing the parent nodes of a directed graph. The parent parameter is mandatory.

son

The attribute containing the son nodes of a directed graph. The son parameter is mandatory.

group

The attribute by which you want to further group results.

The group parameter can also be defined as a list: leafDistance($"parent",$"son",($"group1", $"group2"))

whichpath

When defined, it specifies which path is to be considered:

  • the shortest one - whichpath = "minimum",

  • the longest one - whichpath = "maximum", or

  • all the paths - whichpath = "all" - in this case, the distances are concatenated into a single string.

If whichpath is not specified, the leafDistance function applies the "minimum" value by default.

separator

It specifies the separator used to concatenate the different distances.

The default separator is "-".

This parameter can only be used when whichpath = "all".

weights

The attribute defining the length of the edge.

operator

It defines how the weights are combined along the path to the leaf.

If left unspecified, the default operator is sum, and the weights are added together. If the operator is prod, they are multiplied.

It can only be used if the weights parameter is specified.
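The interaction between whichpath, separator, weights and operator can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not the Factory implementation: the function signature, the sample rows, the ascending order of the concatenated distances, and the acyclic graph are all hypothetical.

```python
from collections import defaultdict
from math import prod

def leaf_distance(rows, node, whichpath="minimum", separator="-", operator="sum"):
    """rows: (parent, son, weight) tuples describing an acyclic directed graph."""
    children = defaultdict(list)
    for parent, son, weight in rows:
        children[parent].append((son, weight))

    def walk(current, weights_so_far):
        kids = children.get(current, [])
        if not kids:
            # Reached a leaf: emit the weights collected along this path.
            yield weights_so_far
        for son, weight in kids:
            yield from walk(son, weights_so_far + [weight])

    # Combine the weights of each path by summing or multiplying them.
    # Note: for a leaf itself the path is empty, so sum gives 0 and prod gives 1
    # in this sketch; the product may treat that case differently.
    combine = {"sum": sum, "prod": prod}[operator]
    lengths = [combine(path) for path in walk(node, [])]

    if whichpath == "minimum":
        return min(lengths)
    if whichpath == "maximum":
        return max(lengths)
    if whichpath == "all":
        # The order of the concatenated distances is an assumption (ascending).
        return separator.join(str(n) for n in sorted(lengths))
    raise ValueError(f"unknown whichpath: {whichpath!r}")

# Hypothetical weighted rows: two paths lead from Plant to the leaf Store.
rows = [("Plant", "Warehouse", 2), ("Warehouse", "Store", 3), ("Plant", "Store", 4)]

shortest = leaf_distance(rows, "Plant")                                   # 4
longest = leaf_distance(rows, "Plant", whichpath="maximum")               # 2 + 3 = 5
every = leaf_distance(rows, "Plant", whichpath="all", separator="|")      # "4|5"
product = leaf_distance(rows, "Plant", whichpath="maximum", operator="prod")  # 2 * 3 = 6
```

Setting every weight to 1 with the sum operator reduces this to a plain edge count, which corresponds to calling the function without the weights parameter.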


Example - leafDistance(parent,son)

The following example uses the Master data dataset.

Description

Screenshot

In this example, we have a supply chain dataset, and we want to calculate the distance from the location (LocFr attribute) to the final customer-facing distribution center (LocTo attribute).

We can use the leafDistance function, which calculates how many steps it takes to reach the leaf of a directed graph.

We need to define the parent and son parameters, which are respectively the LocFr and LocTo attributes.

So the formula would be: leafDistance($"LocFr",$"LocTo")


Example - leafDistance(parent, son, group)

The following example uses the Master data dataset.

Description

Screenshot

In this second example, we want to calculate, for each product, the distance from the location to the final customer-facing distribution center. If the location is itself a customer-facing distribution center, the result will be 0.

To do this, we use the leafDistance function, set the parent and son parameters to the LocFr and LocTo attributes respectively, and group the results by Product.

So the formula would be: leafDistance($"LocFr",$"LocTo",$"Product")

In this way, the leafDistance function calculates the distance in steps from LocFr to LocTo for each product.
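The effect of grouping can be sketched in Python by building a separate graph for each product before measuring distances. As before, this is only an illustration of the idea: the helper names, the sample rows and the acyclic graph are assumptions, and the Factory may compute the result differently.

```python
from collections import defaultdict

def leaf_distance(edges):
    """Return {node: fewest edges from node to a leaf} for an acyclic directed graph."""
    children = defaultdict(list)
    for parent, son in edges:
        children[parent].append(son)

    def dist(node):
        kids = children.get(node, [])
        # A node with no outgoing edges is a leaf, at distance 0.
        return 0 if not kids else 1 + min(dist(k) for k in kids)

    nodes = sorted({n for edge in edges for n in edge})
    return {n: dist(n) for n in nodes}

def leaf_distance_by_group(rows):
    """rows: (LocFr, LocTo, Product) tuples; each product's graph is handled separately."""
    by_group = defaultdict(list)
    for loc_fr, loc_to, product in rows:
        by_group[product].append((loc_fr, loc_to))
    return {product: leaf_distance(edges) for product, edges in by_group.items()}

# Hypothetical rows: product A travels on to Store, product B stops at Warehouse.
rows = [
    ("Plant", "Warehouse", "A"),
    ("Warehouse", "Store", "A"),
    ("Plant", "Warehouse", "B"),
]
result = leaf_distance_by_group(rows)
print(result)
```

In this sample, Warehouse is one step from the leaf for product A but is itself the leaf for product B, so its distance there is 0.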