median function in the Factory

The median is the middle value of an attribute once its values have been sorted.

For example, the series 5 - 10 - 77 - 320 - 1 sorts to 1 - 5 - 10 - 77 - 320, so the median is 10.

The median is different from the mean: the mean is the arithmetic average of an attribute's values, whereas the median is the value in the middle position of the sorted values.
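
To make the difference concrete, here is a minimal sketch in Python, used purely for illustration outside the Factory. It relies on the standard library `statistics` module, and the variable names are hypothetical.

```python
from statistics import mean, median

# The series used in the example above.
values = [5, 10, 77, 320, 1]

# median() sorts the values internally, so the input order does not matter.
middle = median(values)   # middle of 1, 5, 10, 77, 320
average = mean(values)    # (5 + 10 + 77 + 320 + 1) / 5

print(middle)   # 10
print(average)  # 82.6
```

The single large value 320 pulls the mean up to 82.6, while the median stays at 10.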

The median function is also available by:


Function and parameters

median(column, group)

Parameter    Description

column       Identifies the column to which you want to apply the formula. The column parameter is mandatory.

group        Groups the results by the specified column. The group parameter is optional.
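
The behavior of the two parameters can be approximated outside the Factory with a short Python sketch. This is not the Factory's implementation: the helper `median_by` and the toy rows are hypothetical and invented for illustration. They are not taken from any real dataset.

```python
from collections import defaultdict
from statistics import median


def median_by(rows, column, group=None):
    """Hypothetical stand-in for median(column, group); not the Factory's code."""
    if group is None:
        # No group given: one median over the whole column.
        return median(row[column] for row in rows)
    # Group given: collect the column values per group, then take each median.
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group]].append(row[column])
    return {key: median(vals) for key, vals in buckets.items()}


# Toy rows invented for this sketch.
rows = [
    {"gender": "female", "math score": 60},
    {"gender": "female", "math score": 65},
    {"gender": "female", "math score": 70},
    {"gender": "male", "math score": 66},
    {"gender": "male", "math score": 72},
    {"gender": "male", "math score": 75},
]

overall = median_by(rows, "math score")
per_gender = median_by(rows, "math score", "gender")

# Python averages the two middle values (66 and 70) for an even count.
print(overall)     # 68.0
print(per_gender)  # {'female': 65, 'male': 72}
```

Omitting the second argument gives a single value for the whole column, while passing a grouping column gives one median per distinct value of that column.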


Example

The following example uses the Students Performance dataset.

  • In this example, we want to retrieve the median value of the math score attribute, so we type the following formula:

  • median($"math score")

The median of the math score attribute is 66.

  • Then, we want to group our results by the gender attribute, so the formula becomes:

  • median($"math score",$"gender")

  • The results are as follows:

    • The median value of the math score for female students is 65.

    • The median value of the math score for male students is 69.