cumSum function in the Factory

The cumSum function returns the cumulative sum of the column, which is a sequence of partial sums of all the values in the rows before the current one, evaluated within groups defined by the group parameter if required.


Parameters

cumSum(column, group)

Parameter

Description

column

The attribute used to evaluate the cumulative sum. Its type must be numeral. The column parameter is mandatory.

group

It allows you to group the results by a certain column’s values.


Example

The following example uses the Bike Sales dataset.

Description

Screenshot

In this example, we want to calculate the cumulative sum of the Revenue attribute.

Add a new column and type the following formula: cumSum($"Revenue")

  • Moving on during our analysis, we would like to group the cumulative sum by the Country values, so the formula will become: cumSum($"Revenue",$"Country")

  • The results can be read as follows:

    • The cumulative sum of the Revenue in Canada up to row 1 is 950.

    • The cumulative sum of the Revenue in Canada up to row 2 is 1900.

    • The cumulative sum of the Revenue in Australia up to row 3 is 2401, the same value of Revenue, as it is the first occurrence for the Australia group.

    • The cumulative sum of the Revenue in Australia up to row 4 (second occurrence of the Australia group) is 4489.