median function in the Factory

The median is the middle value of an attribute's values once they have been sorted.

For example, sorting the series 5 - 10 - 77 - 320 - 1 gives 1 - 5 - 10 - 77 - 320, so the median is 10.

The median is different from the mean, which is the arithmetic average of an attribute's values. Unlike the mean, the median is barely affected by a few very large or very small values.
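The difference between the two measures can be checked outside the Factory. The following is a minimal Python sketch, using only the standard `statistics` module, that computes both for the series 5, 10, 77, 320, 1. It is an illustration of the concept, not of how the Factory computes the value internally.

```python
from statistics import mean, median

values = [5, 10, 77, 320, 1]

# median() sorts the values and returns the middle one.
# Sorted series: 1, 5, 10, 77, 320
middle = median(values)

# mean() is the arithmetic average: (5 + 10 + 77 + 320 + 1) / 5
average = mean(values)

print("median:", middle)   # 10
print("mean:", average)    # 82.6
```

The single large value 320 pulls the mean up to 82.6, while the median stays at 10.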

The median function is available also by:

Function and parameters

median(column, group)

  • column: identifies the column to which you want to apply the formula. This parameter is mandatory.

  • group: groups the results by the values of the specified column. This parameter is optional.


The following example uses the Students Performance dataset.



  • In this example, we want to retrieve the median value of the math score attribute, so we type the following formula:

  • median($"math score")

The median of the math score attribute is 66.

  • Then, we want to group our results by the gender attribute, so the formula will be:

  • median($"math score",$"gender")

  • The results are as follows:

    • The median value of the math score for female students is 65.

    • The median value of the math score for male students is 69.
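To make the role of the two parameters concrete, here is a rough Python sketch of the same idea. The helper `median_by` and the sample rows are hypothetical: the rows are made-up values, not taken from the Students Performance dataset, and the helper only mirrors the `median(column, group)` signature for illustration. When a group has an even number of values, Python's `statistics.median` returns the average of the two middle values. Whether the Factory follows the same rule is an assumption.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample rows, NOT real data from the Students Performance dataset.
rows = [
    {"gender": "female", "math score": 60},
    {"gender": "female", "math score": 72},
    {"gender": "female", "math score": 66},
    {"gender": "male", "math score": 55},
    {"gender": "male", "math score": 75},
]


def median_by(rows, column, group=None):
    """Hypothetical helper: median of `column`, optionally per value of `group`."""
    if group is None:
        # No grouping: one median over the whole column.
        return median(row[column] for row in rows)
    # Grouping: collect the column values for each distinct group value.
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group]].append(row[column])
    return {key: median(vals) for key, vals in buckets.items()}


overall = median_by(rows, "math score")                # like median($"math score")
per_gender = median_by(rows, "math score", "gender")   # like median($"math score",$"gender")

print(overall)      # 66
print(per_gender)   # {'female': 66, 'male': 65.0}
```

Without the group argument a single value is returned. With it, one median is returned for each distinct value of the grouping column.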