interquartile function
The interquartile function flags outliers: for each observation, it checks whether the value falls inside an interval built from the interquartile range.
It returns a column of binary True/False values: True when the value lies inside the interval, False when it lies outside.
The coefficient (coeff) is 1.5 by default. If you need a different coefficient, specify it as the second parameter of the function.
If the coefficient is:
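<1, it is considered very restrictive: the interval is narrower, so more values fall outside it and return False.
>1, it is considered less restrictive: the interval is wider, so fewer values fall outside it and return False.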
If the value of $"att" lies in the interval [Q1-coeff*(Q3-Q1), Q3+coeff*(Q3-Q1)], where Q1 and Q3 are the first and third quartiles, respectively, and coeff is the coefficient set by the user, inIqr returns True; otherwise it returns False.
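The interval test above can be sketched in Python. This is only an illustration, not the product's implementation: the helper name `in_iqr`, the sample values, and the quartile method (linear interpolation via `statistics.quantiles` with `method="inclusive"`) are all assumptions, and the quartiles computed by inIqr may differ slightly.

```python
from statistics import quantiles


def in_iqr(values, coeff=1.5):
    """Return one True/False flag per value (hypothetical helper).

    True means the value lies in
    [Q1 - coeff*(Q3 - Q1), Q3 + coeff*(Q3 - Q1)].
    """
    # Assumption: quartiles by linear interpolation over the observed data.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - coeff * iqr
    upper = q3 + coeff * iqr
    return [lower <= v <= upper for v in values]


# Made-up sample values, for illustration only.
sample = [10, 12, 13, 14, 15, 16, 18, 40]

# Default coefficient (1.5): only 40 falls outside the interval.
print(in_iqr(sample))
# -> [True, True, True, True, True, True, True, False]

# Smaller coefficient (0.5): the interval narrows, so 10 is flagged too.
print(in_iqr(sample, coeff=0.5))
# -> [False, True, True, True, True, True, True, False]
```

For this sample, Q1 = 12.75 and Q3 = 16.5, so the default interval is [7.125, 22.125] and the interval with a coefficient of 0.5 is [10.875, 18.375].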
Function and parameters
inIqr(column, coeff)
Parameter | Description |
---|---|
column | The column to which you want to apply the function. This parameter is mandatory. |
coeff | A factor set by the user to make the result more or less restrictive. If omitted, it defaults to 1.5. |
Example
The following example uses the Bike sales dataset.
Description | Screenshot |
---|---|
By default the coefficient is 1.5. | |
The results have changed according to the new coefficient. | |