SCD Type 2 in ODI 11g

Continuing on my SCD series in ODI, I am implementing SCD Type 2 today. As compared to SCD Type 1, implementing Type 2 was much simpler. There are few things that one has to take care of. Apart from that, it was a walk in the park.

First things first, we need to setup the underlying table to support SCD Type 2 modifications. I’ll update the same SAMP_OFFICES_D table that I mentioned in my previous post. So, I am adding three columns to the table – START_DATE, END_DATE and ACTIVE_FLAG.

For SCD Type 2, we need to use IKM Slowly Changing Dimension and use database sequences for the surrogate key, just like SCD 1. Following the similar steps of adding the data stores and then reverse engineering the model, we move ahead to the important bit.

In the Definition tab of your data store, select Slowly Changing Dimension.

Figure 1: Setting the type of the data store to Slowly Changing Dimension

Define SCD behaviour for every column

Once we set the data store to Slowly Changing Dimension, we then need to set the SCD type of every column. (NOTE: every column).

Figure 2: Setting the SCD behaviour of all table columns

Setting the values

Set the following values in the target data store.

Figure 3: Target Data store mappings

The above mentioned steps are not that detailed but are the most important ones. Good luck SCDing.

Side Note

We can check the database table SNP_COL to check the SCD behaviour of all the columns (refer Figure 2). As you can see below, I have intentionally left COMPANY_KEY undefined.

Figure 4: Table in the work repository where we can check the SCD behaviour settings of individual columns

When we run the interface, we will get the following error as we have not defined the SCD behaviour for COMPANY_KEY. So, for all the 20 rows that the interface tries to insert into the dimension table, we get 20 errors as shown below in the error table E$_SAMP_OFFICES_D.

Figure 5: Not defining the SCD behaviour for any column will give an error at runtime

SCD Type 1 in ODI 11g

I have just started developing in ODI 11g and I really got frustrated implementing a simple SCD Type 1. I am using the samplesales schema that comes along with OBIEE and made some modifications. For example, I added a surrogate key named OFFICE_ID to the Offices dimension. I exported the table data to a CSV file and am trying to import that file into the SAMP_OFFICES_D dimension.

For Type 1, you can use the Incremental Update IKM and use database sequences for the surrogate key. For some reason, it wasn’t working. Duplicate rows were getting added even when no changes were made to the data (It should ideally do nothing, even if it was a SCD Type 2). The solution is pretty standard – you only need to know what you’re doing and setting the right options. Read on.

Step 1 – Create the database sequence object

Step 2 – Reverse-engineer your models

Step 3 – Configure the target data store

In the Definition tab of your data store, select Dimension as the OLAP Type. This is important because if you do not mention this as a Dimension and then not define an Update key (which you don’t to implement SCD), then we’ll get the following error.

Figure 1: Error when data store is not set as a Dimension and Update key is set to Undefined

We use the Update Key to define how rows are compared to see if they are the same and since in our case that will be the surrogate key, new rows will be added every time. To avoid that, we set the Update key to <Undefined>.

Side question – How do we then control the specific columns to match on? This can be achieved by setting the Update key to <Undefined> and setting the Key property on the specific columns of the target data store.

Step 4 – Design your interface

Following points should be kept in mind when designing the interface –

  1. Select the surrogate key column of the target data store
    1. Select Execute On as Target
    2. Make sure the Key checkbox is not selected (it gets automatically unchecked if Update key is set to <Undefined>)
    3. Clear the Check Not Null checkbox (as this is an auto-generated sequence)
    4. Clear the Update checkbox (as we will (/should) never update the surrogate key column)
    5. In the implementation box, click on the pencil to edit the expression and then select the sequence that you created. The sequence can either be a project specific sequence or a global sequence (created in Global objects)
  2. Select the natural key or unique key column of the target data store. In my case, it is going to be OFFICE_KEY (the unique key to identify an Office)
    1. Check the Key checkbox; this is the key that will be used by ODI to make comparisons during inserts/updates of the IKM
    2. Clear the Update checkbox (because in my case, the OFFICE_KEY will never be updated)
    3. If more than one column identify a row as unique then mark each one of them as Key columns in a similar fashion

That’s it! Execute the interface and your SCD Type 1 interface should work. I’ll talk about SCD Type 2 tomorrow.

Why my ODI sequence is not working

I am making a sample application in ODI using the samplesales schema that comes along with OBIEE. I truncated the SAMP_OFFICES_D dimension table and trying to load the data from a text file. I added a surrogate key to the table, and loading the value using an ODI sequence.

ODI sequences come in three flavours – standard, specific and native. Standard sequences are sequences whose current values are maintained in the repository. Specific sequence values are stored in a database table cell. ODI will read the value first, lock that row and then update that same row once the job is completed. Both standard and specific should be used when the underlying database does not have proper support for sequences (weird, can’t think of any DBMS w/o sequences). Native, as the name suggests, are the native sequences managed by the underlying database system.

I created a standard sequence and used :SEQ_OFFICE_ID_NEXTVAL for the surrogate key mapping. When I ran the interface, I got the unique constraint violation error for the primary key (surrogate key OFFICE_ID). When I read the documentation, I found that Standard and Specific constraints cannot be used in KMs or procedures that use a single INSERT/UPDATE… SELECT in the target command to load the data.

I was using the IKM SQL Incremental Update KM and as the screenshot above clearly says, it uses INSERT INTO .. SELECT FROM statement. In such cases, standard or specific sequences cannot be used and the only work-around is to use native sequences.