fillUp function in the Factory
The fillUp function returns a copy of the specified column, filling all the missing values with the next valid value.
fillUp(column, group, fillall)
Parameters
Parameter | Description |
---|---|
column | The values of the specified column will be copied, and all missing values will be filled with the next valid value. Column is a mandatory parameter. |
group | The next valid value can optionally be selected from a specified group. For example, the next valid cost for a specific product. |
fillall | A Boolean parameter which, if set to True, enables the fillDown function to be performed at the end of the fillUp. This is used to fill in any remaining missing values that do not have a successive valid value in the dataset. By default, fillall is False. |
Example - fillUp(column)
The following examples use the Bike sales dataset.
Description | Screenshot |
---|---|
In the first example, we have a series of missing values in the Day attribute in our dataset, which we want to fill. | |
Here we have inserted a simple formula, whereby each missing value is filled with the next valid value available in the Day attribute. The formula to enter in this case takes the form fillUp(column), with Day as the column. The last row is empty because there is no successive value to use. This problem can be solved by setting the fillall parameter to True, as we will do in a subsequent example. |
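
The behavior described in this first example can be imitated outside the Factory with a short Python sketch. This is not the Factory's implementation: the helper name fill_up, the use of None for missing values, and the sample Day values are all assumptions made for illustration.

```python
def fill_up(column):
    """Return a copy of column where each None takes the next non-missing value.

    Hypothetical helper that imitates the documented fillUp(column) behavior.
    """
    filled = list(column)          # work on a copy, the input stays unchanged
    next_valid = None              # nearest valid value found below the current row
    for i in range(len(filled) - 1, -1, -1):   # scan from the last row upwards
        if filled[i] is None:
            filled[i] = next_valid
        else:
            next_valid = filled[i]
    return filled


# Illustrative values only, not taken from the Bike sales dataset.
days = ["Mon", None, None, "Thu", None, "Sat", None]
print(fill_up(days))
# ['Mon', 'Thu', 'Thu', 'Thu', 'Sat', 'Sat', None]
```

The last element stays missing because no valid value follows it, which mirrors the empty last row described above.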
Example - fillUp(column, group)
Description | Screenshot |
---|---|
In this second example, we want to fill missing values taking the Year value into account, so each missing value is filled with the next valid value from the same year. The formula to enter in this case takes the form fillUp(column, group), with Year as the group. As you can see, numerous missing values remain, because some rows have no successive value with the same year in the dataset. We will solve this problem in the next step by using the fillall parameter. |
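
As a rough illustration of the grouped behavior, the sketch below restricts the search for the next valid value to rows that share the same group key. The helper name fill_up_grouped, the None convention for missing values, and the sample data are assumptions, not part of the Factory.

```python
def fill_up_grouped(column, group):
    """Return a copy of column where each None takes the next non-missing
    value found in a later row with the same group key.

    Hypothetical helper that imitates the documented fillUp(column, group).
    """
    filled = list(column)
    next_valid = {}                # group key -> nearest valid value below the current row
    for i in range(len(filled) - 1, -1, -1):   # scan from the last row upwards
        key = group[i]
        if filled[i] is None:
            filled[i] = next_valid.get(key)    # stays None if the group has no later value
        else:
            next_valid[key] = filled[i]
    return filled


# Illustrative values only, not taken from the Bike sales dataset.
days = ["Mon", None, "Wed", None, None, "Fri", None]
years = [2020, 2020, 2020, 2020, 2021, 2021, 2021]
print(fill_up_grouped(days, years))
# ['Mon', 'Wed', 'Wed', None, 'Fri', 'Fri', None]
```

The fourth value is not filled with Fri, because Fri belongs to a different year. The values that close each year therefore remain missing.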
Example - fillUp(column, group, fillall)
Description | Screenshot |
---|---|
In this final example, we want to fill missing values taking the Year value into account, as before, and then fill any remaining missing values with the fillDown function, which searches previous rows for a valid value with the same Year. The formula to enter in this case passes Year as the group and sets fillall to True. As you can see, the missing values have now been filled. |
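
The two-pass behavior described here can be sketched in Python as a fill up followed by a fill down within the same group. As before, this is only an approximation under stated assumptions: the helper fill_up, its keyword arguments, the None convention for missing values, and the sample data are hypothetical and do not come from the Factory.

```python
def fill_up(column, group=None, fillall=False):
    """Imitate the documented fillUp(column, group, fillall) on plain lists.

    Hypothetical helper: None marks a missing value, and rows with the same
    entry in group are treated as belonging together.
    """
    n = len(column)
    keys = group if group is not None else [None] * n   # one shared group by default
    filled = list(column)

    # Pass 1 (fill up): scan upwards, carrying the nearest later value per group.
    nearest = {}
    for i in range(n - 1, -1, -1):
        if filled[i] is None:
            filled[i] = nearest.get(keys[i])
        else:
            nearest[keys[i]] = filled[i]

    # Pass 2 (fill down, only when fillall is True): scan downwards, carrying
    # the nearest earlier value per group into whatever is still missing.
    if fillall:
        nearest = {}
        for i in range(n):
            if filled[i] is None:
                filled[i] = nearest.get(keys[i])
            else:
                nearest[keys[i]] = filled[i]
    return filled


# Illustrative values only, not taken from the Bike sales dataset.
days = ["Mon", None, "Wed", None, None, "Fri", None]
years = [2020, 2020, 2020, 2020, 2021, 2021, 2021]
print(fill_up(days, group=years, fillall=True))
# ['Mon', 'Wed', 'Wed', 'Wed', 'Fri', 'Fri', 'Fri']
```

A value can still remain missing after both passes if its group contains no valid value at all.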