Reshaping Datasets to Wide Format
The reshape to wide pivoting operation reshapes a dataset by:
transforming a key attribute into a new set of attributes for each row
creating a new column for each distinct value of the transformed key attribute.
Consequently the number of keys in the dataset will be decreased, which is often a prerequisite for merging datasets.
Prerequisites
You must have created a flow;
The required datasets must have been imported into the flow.
Procedure
Drag and drop the Reshape To Wide task onto the stage.
Connect a task that contains the attributes you want to transform to the Reshape To Wide task.
Double click the Reshape To Wide task. The left-hand pane displays a list of all the available attributes in the dataset, which can be ordered and searched as required. Select the attributes that will be displayed in the final table (othnames), by checking the corresponding box (by default all the attributes are included). Notice that, since the number of rows in decreased by the Reshape To Wide task, if you check an attribute without dragging and dropping it into the Widened attributes list, in the final table you will have only the value contained in the first line for each value of the Key Attributes.
Configure the task options as described in the table below.
Save and compute the task.
Reshape to Wide options | |
Name | Description |
---|---|
Key attributes | Drag and drop the attributes that will be used as a key to identify each group of records. A record/row will be created for each distinct set of values of the key attributes. Instead of manually dragging and dropping attributes, they can be defined via a filtered list. |
Long attributes | Drag and drop the attributes that will be become column headers in the wide format. Instead of manually dragging and dropping attributes, they can be defined via a filtered list. Note that the combination of Key attributes and Long attributes should be a unique key in the original table. If this is not the case, you have to group according to Key attributes and Long attributes in a Data Manager before applying Reshape To Wide. |
Widened attributes | Drag and drop the attributes that will become column values in the resulting data table. Instead of manually dragging and dropping attributes, they can be defined via a filtered list. |
Fill missing values | If selected, zeros will be inserted in every empty cell of the wide columns in the new reshaped table (empty cells in the final table correspond to the combinations of key/long attributes not present in the long dataset). |
Remove prefix from widened attribute name | If selected, the prefix "Value" is removed from widened attributes to avoid having the same initial attribute name repeated for every new column in the table. By default, the new columns are named Value(long_value) where long_value is a possible value for the long attribute. |
Example
The following example uses the Northwind_orderdetails dataset.
Description | Screenshot |
---|---|
| |
| |
|