To go Logical or Not…That is the Question?!

 

Figure 1: Basic star schema for typical point of sale data-mart

In the current design, the database does not enforce the integrity of data with respect to the definition of the fact tables. The basic fact table should contain rows only for daily sales for a store for all the products sold in a day. If we try to insert an aggregated record in the basic fact table (for week, district and brand), the database would not reject the rows. If we like to separate the fact tables according to the grain, then we would have to rely on the ETL process to enforce the granularity of the fact table. Enforcing rules through ETL process has following drawbacks:

 

   

  

  • ETL processes need to persist the metadata for populating the fact table in the first figure and the aggregated fact from the 2nd figure. 
  • Multiple ETL processes may be responsible for populating the fact table. In such cases, we would need to duplicate the rules to each process. By maintaining rules (business logic) in the database layer, we can minimize maintenance headaches because the logic is now centrally located making change aforethought and less prone to mistakes.
  • ETL processes may not be efficient to enforce rules if the process relies on SQL to enforce the rules. Database integrity constraints are more efficient than SQL.
  • If the ETL process relies on a programming language, then it would be very difficult to maintain the meta data in the ETL process.

Let us see how the design would change when we partition the dimension table to enforce the granularity of the fact and aggregate table. If we partition the dimension table based on the level of the data in the hierarchy, then we can use these partitioned dimension tables to join with the appropriate fact table. Figure 2 illustrates how the data model will look in the partitioned dimension table approach.

Figure 2: Constraint Enforced Point of Sale Mart

In the new design, the basic fact table joins with the basic dimension tables and the aggregate fact table joins with the higher-level dimension tables. In this design, the granularity of the fact table is enforced by the relational constraint defined in the database. We can use SQL to create a view over all partitioned dimension tables using SQL union all to create a logical design that is similar to the basic POS data mart as described in Figure 1. We can also create views over the basic fact table and aggregated fact table to have one fact view with multiple grain, but this depends on how the data modeler wants to present the logical model.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s