variance function in the Factory

The variance function returns the variance of the values in a column. If the optional group parameter is supplied, the variance is evaluated separately within each group it defines.

The variance is a measure of dispersion: it indicates how far a set of values spreads from their average value.
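
For reference, the two standard definitions of variance for values x_1, ..., x_n with average x̄ are sketched below. This page does not state whether the Factory divides by n (population variance) or by n - 1 (sample variance), so treat both as general background rather than as the function's exact definition.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2
```

In both forms, values that lie further from the average contribute more to the result, and a column of identical values has a variance of zero.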

The variance function is also available by:


Function and parameters

variance(column, group)

Parameter | Description

column | The column to which you want to apply the formula. This parameter is mandatory.

group | The column whose values define the groups. This parameter is optional; when it is supplied, the variance is evaluated separately within each group.


Example

The following example uses the Bike sales dataset.


  • In the example here, we want to retrieve the variance of the Profit attribute.

  • We add a new attribute and write the following formula: variance($"Profit")

  • The variance for the Profit attribute is 206013.811.

  • For a more specific analysis, we can add the group parameter to the formula. In this example, we want the results to be grouped by the Country attribute.

  • The formula becomes: variance($"Profit",$"Country")

  • The results are as follows:

    • The variance for Canada is 204225.686.

    • The variance for Australia is 185416.613.
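
The difference between the ungrouped and grouped calls above can be illustrated outside the Factory with a minimal Python sketch. Everything in it is an assumption made for illustration: the rows are invented and are not taken from the Bike sales dataset, `column_variance` is a hypothetical helper and not part of the Factory, and the sample variance (division by n - 1) from Python's standard `statistics` module is used only because this page does not say which definition the Factory applies.

```python
from collections import defaultdict
from statistics import variance  # sample variance: divides by n - 1

# Hypothetical rows for illustration only (not the Bike sales dataset).
rows = [
    {"Country": "Canada", "Profit": 100.0},
    {"Country": "Canada", "Profit": 300.0},
    {"Country": "Australia", "Profit": 50.0},
    {"Country": "Australia", "Profit": 150.0},
    {"Country": "Australia", "Profit": 250.0},
]


def column_variance(rows, column, group=None):
    """Hypothetical helper mirroring variance(column, group).

    Without a group, return a single variance for the whole column.
    With a group, return one variance per distinct group value.
    """
    if group is None:
        return variance([row[column] for row in rows])

    # Collect the column values separately for each group value.
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group]].append(row[column])
    return {key: variance(values) for key, values in buckets.items()}


overall = column_variance(rows, "Profit")
by_country = column_variance(rows, "Profit", "Country")

print(overall)     # 10750.0
print(by_country)  # {'Canada': 20000.0, 'Australia': 10000.0}
```

The sketch returns the grouped result as one value per country, which matches how the results are reported in the list above. How the Factory lays the grouped values out in the new attribute is not covered here.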