fillDown function in the Factory

The fillDown function returns a copy of the specified column, filling all the missing values with the last valid value, optionally grouped by selected attributes. If there are rows which remain empty, as they do not have a previous valid value, a fillUp operation can then be performed, using the next valid value.


Parameters

fillDown(column, group, fillall)

Parameter

Description

column

The values of the specified column will be copied, and all missing values will be filled with the last valid value. Column is a mandatory parameter.

group

The last valid value can optionally be selected from a specified group. For example, the last valid cost for a specific product.

fillall

A Boolean parameter, which, if set to True enables the fillUp function to be performed at the end of the fillDown. This is used to fill in any remaining missing values who do not have a previous valid value in the dataset.

By default, fillall is False.


Example - fillDown(column)

The following examples use the Bike sales dataset, from which we have removed some values in the Day attribute.

Description

Screenshot

In the first example, we have a series of missing values in the Day attribute in our dataset, which we want to fill.

Here we have inserted a simple formula, whereby each missing value is filled with the previous value available in the Day attribute.

The formula to enter in this case is: fillDown($"Day").

The first row is empty because there is no previous value to use. This problem can be solved by setting the fillall parameter to True as we will do in a subsequent example.


Example - fillDown(column, group)

Description

Screenshot

In this second example, we want to fill missing values considering the Year value, so each value will be inserted with a previous value for the same year.

The formula to enter in this case is: fillDown($"Day", $"Year")

As you can see, there are numerous missing values still, as there are not previous values with corresponding years in the dataset.

We will solve this problem in the next step, by using the fillall parameter.


Example - fillDown(column, fillall)

Description

Screenshot

In this final example, we want to fill missing values considering the Year value, as before, and then fill the missing values with the fillUp function, which will then search for values with the same value for the year attribute from following rows.

The formula to enter in this case is: fillDown($"Day", $"Year", True)

As you can see, the missing values have now been filled.