About me: Kaushal Shah
I am a Microsoft® Certified Solutions Associate (MCSA) for SQL Server 2012,Currently working in Microsoft BI space.I always keen to learn new technology and new skills.I have worked within different sectors of business.I have worked in Wholesale Sport industry,Retail sector,E-commerce and lately in Healthcare.
In this post I am trying to explain commonly used Dimensions Types used in Data warehouse.I have put examples with this also to make it more clear.
What are the Dimensions in Data warehouse?
A dimension table consists of the attributes about the facts. Dimensions store the textual descriptions of the business attribute. Without the dimensions, we cannot measure the facts and facts are just disordered Numbers.In Business, Customer, Products, Buyers information can be different dimensions.
Let’s walk through commonly used types of dimensions.
Types of commonly used Dimensions in data warehouse Design
- Conformed Dimensions
- Junk Dimensions
- Role-playing Dimensions
- Slowly Changing Dimensions
- Degenerated Dimensions
A Dimension that is used in multiple locations is called conformed dimensions. A conformed dimension may be used with multiple fact tables in single database, or across multiple data marts or Data warehouses.
I.e. Above shown Customer and Product Dimensions are Conformed Dimensions as they are connected to Shipment Fact table, Sales Order Fact table, and Service Request Fact table.
A junk dimension is a collection of random transaction codes flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes.
I.e.: Assume that we have a gender dimension and marital status dimension. In the fact table we need to maintain two keys referring to these dimensions. Instead of that create a junk dimension which has all the combinations of gender and marital status (cross join gender and marital status table and create a junk table). Now we can maintain only one key in the fact table.
Role Playing Dimensions are the Dimensions which often used for multiple purposes within same database.Here same dimension key is associated with more than one foreign key in the fact table in the database for the different purposes.
I.e.: In Date dimensions, [FullDateAlternateKey] is associated with [Orderdate key], [Duedate key], and [Shipdate] key in the fact table to solve different purpose in Data warehouse.
Slowly Changing Dimensions
This is widely used Dimensions type. It is the dimensions where attribute values changes with time. There are various types of Slowly Changing Dimensions (SCD) based on how business manages this dimensions.
Types of SCD
TYPE 0: It is the dimensions where we do not change attribute values at all. They are rarely used. I.e. Employee birth date
TYPE 1: In this type, Old value of attribute is overwritten by new values of attribute and no history kept
I.e Customer City where company decided to show only current one.
In this case previous city name London is replaced by new city name Edinburgh.
TYPE 2: In this type we tracks historical data by creating multiple records for a given Natural key (business key) in the dimensional tables with separate surrogate key and/or different version numbers. Unlimited history is preserved for each insert.
I.e. Customer City where company decided to have historical data then we will have to add an extra row with column to identify the Current/Historical attributes value by start and end date columns.
TYPE 3: In this type, we tracks changes using separate columns and preserves limited history.it is limited to how many columns we want to add in dimension table.
I.e. Customer City where New columns “previous City” and “Current City” being added.
TYPE 4: In this type, we keep all or some historical data in separate table and current data stays in main Dimension table. Both historical and current dimension table joined to fact table with same surrogate key, this will enhance the query performance. This type used very rarely.
I.e. we create new table to store previous Customer City and Current Customer City in Historical table with Created date And Current Customer city in Current dimension table.
A degenerate dimension is a dimension which is derived from the fact table and doesn’t have its own dimension table.
In Data warehouse this Dimension often used to show drill through capability where in the report you can see how aggregated no came up.
I.e. Invoice no can be stored in the fact table and then used as separate dimensions for the drill through purpose to find out what invoices are part of total buying cost in report.
So Dimensions are one of the pillar of the data warehouse.Choosing right one can define future of the data warehouse.It always good to use right type
Hope this post gives some insight and information around data warehouse design.
Has anyone came across Rapidly Changing Dimensions??
Any suggestion and/or comment always welcome.
In the next post,I will walk through Different Types of Facts and Fact tables.
Happy Learning !!!