Summary: in this article, you will learn about slowly changing dimensions type 1, type 2, and type 3 and corresponding techniques to deal with each of them.
In dimensional modeling, it is essential to determine how the change of data in the source system reflects in dimension tables in the data warehouse system. This phenomenon is known as slowly changing dimensions. This term comes from the reason that dimensions accumulate their changes at a slow rate compared to facts in the fact table.
There are three most common slowly changing dimension types: slowly changing dimension type 1, type 2, and type 3. Let’s examine each type and technique to handle the situation in greater detail.
Slowly changing dimensions type 1
In slowly changing dimensions of type 1, when data in the source system changed, the corresponding dimension attribute is overwritten. As a result, the dimension table does not represent the history, and the historical context of existing facts is changed.
Slowly changing dimensions type 1 should be avoided as much as possible. Slowly changing dimensions type 1 should be used only in case there is a need for correcting data in the source systems that will reflect in the dimension table in the data warehouse system.
Slowly changing dimensions type 1 example
Let’s take a look at an example of slowly changing dimensions type 1 to get a better understanding.
In our sample star schema, we take DIM_PRODUCT dimension table as an example. Below is the sample data of DIM_PRODUCT dimension table.
For some reason, the product called Nikon Coolpix has a typo in the source system, and the incorrect value was loaded in the dimension table in the data warehouse system. When we apply slowly changing dimension type 1, the value of the source system changed, and the value in the data warehouse system gets overwritten.
Slowly changing dimensions type 2
In slowly changing dimension type 2, when data in the source system changed, a new version of the corresponding dimension row is created to preserve a version history and historical context of the facts. It is important to note that a new record is inserted to preserve history whenever a change is made. Therefore, we can preserve unlimited version history with slowly changing dimension type 2.
In the dimensional design, if you cannot decide what type of slowly changing dimension applies to a particular dimension, type 2 is the safest way to choose. When data in the source system changed, the dimension can use either type 1 or 2.
Slowly changing dimensions type 2 example
In figure 2 below, for example, if the product category of the product id 4 changed into Electronics. Instead of overwriting the product category, we create a new record with a single attribute with the new value. As a result, we have two records: one is to preserve the history, and one is for the current analysis.
Slowly changing dimensions type 3
In dimensional design, sometimes you will have requirements that allow business analysts to analyze facts recorded before and after the change occurred using the old value and new value. None of both types above is suitable to address these requirements. In this case, type 3 is used.
In the slowly changing dimension type 3, a pair of attributes is created to track current values and previous values. Whenever a change occurs, both attributes get updated, no new row is added. It is noticed that type 3 only keeps tracking the current value and the most recent values, not all historical values like type 2.
Slowly changing dimensions type 3 example
In the product dimension table, we created a pair of attributes called CAT_CURRENT and CAT_PREVIOUS to track the changes. As shown in figure 3 below, the product category of product Id 4 is kept track.
Slowly changing dimension summary
The following table demonstrates the action of each slowly changing dimension type and its effects on facts.
Slowly changing dimensions | Action | Effects on facts |
---|---|---|
Type 1 | Overwrite attribute value in dimension table | Restate history |
Type 2 | Insert new row in dimension table | Preserves unlimited history |
Type 3 | Create a pair of attributes to keep previous and current values. No new rows are added. | Ability to analyze facts recorded before and after change happened using old value and new value. |
In this article, you’ve learned about the most common slowly changing dimensions and corresponding techniques to deal with them in dimensional design.