Dimensional model in SCD2 (part 1)

Dimensional model in SCD2 (part 1)

This post is dedicated to an issue I’ve been dealing with for a while now and until so far I haven’t found a solid solution yet. Since I’ve been explaining and asking this to several experts over and over again, I’ll dedicate a blog post to it.

I’m a big fan of using Kimball’s Slowly Changing Dimensions, especially since it keeps track of historical data and it doesn’t explode your database. But… mapping this data into an SSAS dimension is still an effort, and let me explain my challenge by using some example data:

First of all, my dimension table looks like this:

Let’s take John’s example. John started the company on January 1st 2009 in the sales department in Delft. Halfway through the year they realized that he is needed in the sales team in Amsterdam, therefore John is transferred to the Amsterdam office on June 1st. Since John is staying in Amsterdam longer than expected, he is assigned another manager in February 2010.

In my fact table, the sales activities of John look like this:

In wording, he won a deal of 3000 euros when he was in the Delft office, a deal of 1000 euros after he moved to Amsterdam and another deal of 6000 euros after he was assigned another manager.

With this data, what I would like to achieve first of all is an analysis of the sales activities per location with a drilldown of sales reps on that location. In other words, it should show that the Delft location has a 7000 euro turnover (record 1 and 2) and that the Amsterdam location also has 7000 (record 3 and 4). With the drilldown, I should be able to see that it was John who was responsible for these sales.

Into SSAS where my challenge already starts with the creation of the dimension. These are the steps that I take:

1. Creation of the dimension: I create the dimension according to my dimension table, and the dimension is created as usual. No problems here.

2. Adding a hierarchy: I add a hierarchy according to the following structure: Location –> Business Key –> Surrogate Key.

3. Attribute relationships: By creating the hierarchy in the previous step, I also have to set the attribute relationships accordingly.

4. Deploy and process the dimension.

Processing the dimension generates the error that a duplicate key attribute has been found on the field “BusinessKey”. I understand the error, because if the location is a parent of the business key in the hierarchy, it goes wrong because there are more than 1 locations for one business key.

Then I continue with a few settings:

5. I set the key value for the field “BusinessKey” to the fields “BusinessKey” (as it originally was) and the field “SurrogateKey”.

6. Deploy and process the dimension.

The processing goes well, but when I browse the dimension, the hierarchy is displayed as follows:

  • Delft
    • Erik Monchen
      • 1
    • Erik Monchen
      • 2
    • Erik Monchen
      • 3
    • John Smith
      • 4
  • Amsterdam
    • John Smith
      • 5
    • John Smith
      • 6

I also understand why this is the behaviour because the key value of the business key is set to the business key and the surrogate key, but this is the only way how I can use the dimension without having errors…

In the end, I would like to achieve a structure like this:

  • Delft
    • Erik Monchen
      • 1
      • 2
      • 3
    • John Smith
      • 4
  • Amsterdam
    • John Smith
      • 5
      • 6

This way, it’s easy to link the fact table to the surrogate key value and the sales totals are added up correctly per location. Also, when I browse the cube where I’ve linked this dimension, it will display the location first, then when I drill down, it displays the names over and over again because that’s the next level in the hierarchy. If there is a way to change this view, I’d be open to use that as well though and not change anything to the dimension! The whole point of this exercise is to avoid having a list of double names when I browse the cube…

Since the method of Slowly Changing Dimensions is widely used, I was wondering how other people have accomplished this. I’m sure I’m not the only one out there facing this challenge!

Thanks in advance for your feedback!

4 thoughts on “Dimensional model in SCD2 (part 1)

  1. This should be easy if you have build the relation as follows:-
    surrogate_key–>business_key+location_key–>Location_Key

    The name for each of the 3 would be
    surrogate_key–>Name–>Location

  2. Thanks for your quick reply. I haven’t tried it out yet but in this case I’m talking about the location, but there are many other attributes like gender, cost center, language etc. that I’m using in this dimension. Do I need to build the relation for each of the attributes separately then?

  3. Gender – that would be related to your sales rep so it shouldnt be an issue anyways.
    Cost Center – I believe that is something that is a parent of Location, so hence it should be fine too. If not, please explain
    Language – I am not sure where this fits, so if you could give the entire hierarchym I could comment further

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.