Getting data in the right structure is always a challenge for a data analyst. Wisdom says that we spend around 80% of our effort in getting the data set in the proper structure before we start with our analysis. In my experience, I have always wished for data that is raw and not processed or summarized. But quite often what I have received data in the form of a summarized table, something like this:
How convenient is it to apply formulas for YTD, or month-over-month variation for the table below?
Download Sample File
Data table structure like above is convenient for reports and presentations, but not for calculation. For calculations, the following table is helpful:
Such transformation is also known as Unpivoting the table(restructuring data from columns to rows). As it is the opposite of creating a pivot table from the table in image 2
When to Unpivot?
When the data table is in “report” format, and it requires transformation into a regular data table format. In other words, when we require the headers and their corresponding values in two separate columns.
How to Unpivot a table?
Unpivoting is simple if we understand the required output.
Select all the columns which require unpivoting:
Alternatively, select the columns which do not require unpivoting > Right Click > Unpivot Other Columns
Benefits of Unpivoting
Applying filters, writing DAX queries, adding dimensions as slicers become an easier task.
Pivot Transformations
Sometimes, the raw data is also not helpful and requires transformation before we start the analysis. Let us consider the following example:
Such tables could be a system output and not very helpful. In the table above, four rows of data constitute one record.
We wish it in the following structure:
For such transformations, we use Pivot table feature in the Power Query (restructuring data from rows to columns)
Step 1: Grouping records
Since each record consists of four rows, the first objective is to group the records in a set of four.
Add an index column > Add Column > Index Column
Select the Index Column > Transform > Standard > Integer Divide
Since we need a group of four, we divide it by 4
Index column divided in a set of 5 groups (0, 1, 2, 3, 4)
Step 2: Pivot Transformation
Select the Column 1 (which we need as headers) > Transform > Pivot Column
Under Values Column > Select Column 2
Advanced Options: > Don’t Aggregate > Click Ok
Following is the output:
Now assign the appropriate data type to Date of Joining (as Date) and CTC (as Decimal)
We may also get rid of the Index column as it is not required.
You may watch the video for the Pivot exercise:
YouTube video
Also see: Table Transformation with Power Query
Kommentare