quantile function

The quantile function returns the specified quantile of the column, optionally evaluated within the groups defined by the group parameter. A column of weights can also be specified.

Quantiles are cut points that divide the range of a probability distribution into intervals with equal probabilities.
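As a rough illustration of what "cut points" means, the sketch below uses Python's standard statistics module rather than the quantile function documented here. The sample values are made up, and the "inclusive" interpolation method is an assumption: this page does not state which method the quantile function applies.

```python
import statistics

# Made-up sample values, already sorted for readability.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# Three cut points that split the values into four intervals
# holding roughly the same share of the data (quartiles).
cuts = statistics.quantiles(values, n=4, method="inclusive")

# The middle cut point is the 0.5 quantile, which is the median.
median = statistics.median(values)

print(cuts)
print(median)
```

With four intervals, the second cut point coincides with the median, which is the value requested by a quant of 0.5.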


Function and parameters

quantile(column, quant, group, weights)

Parameter

Description

column

It identifies the column to which you want to apply the formula. The column parameter is mandatory.

quant

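It is the quantile to compute, expressed as a probability between 0 and 1. For example, 0.5 returns the median (the 50th percentile). The quant parameter is mandatory.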

group

It groups the results by the values of a given column. The group parameter is optional.

weights

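It identifies the column whose values are used as weights, so that rows with a larger weight count more in the computation. The weights parameter is optional.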
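The way the four parameters might interact can be sketched in plain Python. This is only an illustration under stated assumptions, not the implementation behind the quantile function: the helper names weighted_quantile and grouped_quantile are hypothetical, the sample data is made up, and the rule used here (return the smallest value whose cumulative share of the total weight reaches quant) is one common definition that may differ from the one actually applied.

```python
from collections import defaultdict


def weighted_quantile(values, quant, weights=None):
    """Hypothetical helper: smallest value whose cumulative weight
    share reaches `quant` (assumed rule, no interpolation)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= quant <= 1:
        raise ValueError("quant must be between 0 and 1")
    if weights is None:
        # Without weights, every row counts the same.
        weights = [1] * len(values)
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= quant * total:
            return value
    return pairs[-1][0]


def grouped_quantile(values, quant, group=None, weights=None):
    """Hypothetical helper: one quantile overall, or one per group."""
    if group is None:
        return weighted_quantile(values, quant, weights)
    if weights is None:
        weights = [1] * len(values)
    buckets = defaultdict(lambda: ([], []))
    for value, key, weight in zip(values, group, weights):
        buckets[key][0].append(value)
        buckets[key][1].append(weight)
    return {
        key: weighted_quantile(vals, quant, wts)
        for key, (vals, wts) in buckets.items()
    }


# Made-up sample data, not taken from any real dataset.
years = [1, 3, 5, 7, 10]
gender = ["F", "M", "F", "M", "F"]
row_weights = [1, 1, 1, 1, 6]

overall = grouped_quantile(years, 0.5)
by_gender = grouped_quantile(years, 0.5, group=gender)
weighted = grouped_quantile(years, 0.5, weights=row_weights)
```

In this sketch, omitting group returns a single value, supplying group returns one value per distinct group, and supplying weights shifts the result towards the rows with the largest weights.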


Example

The following example uses the HR-employee-attrition dataset.


  • In this example, we want to retrieve the median (the 0.5 quantile) of the Years At Company attribute.

  • We write the following formula: quantile($"YearsAtCompany",0.5)
    The 0.5 parameter requests the 0.5 quantile, that is, the median: the value below which half of the Years At Company values fall.

  • The 0.5 quantile (the median) of the Years At Company attribute is 5.

  • For a more precise analysis, we can group the results by the values of one attribute and use another attribute as weights.

  • For example, to group the values by the Gender attribute and use the Job Role attribute as weights, the formula becomes:
    quantile($"YearsAtCompany",0.5,group=$"Gender",weight=$"JobRole")

In this case, the results don’t change.