In the current design, the database does not enforce the integrity of data with respect to the definition of the fact tables. The basic fact table should contain rows only for daily sales for a store for all the products sold in a day. If we try to insert an aggregated record in the basic fact table (for week, district and brand), the database would not reject the rows. If we like to separate the fact tables according to the grain, then we would have to rely on the ETL process to enforce the granularity of the fact table. Enforcing rules through ETL process has following drawbacks:
- ETL processes need to persist the metadata for populating the fact table in the first figure and the aggregated fact from the 2nd figure.
- Multiple ETL processes may be responsible for populating the fact table. In such cases, we would need to duplicate the rules to each process. By maintaining rules (business logic) in the database layer, we can minimize maintenance headaches because the logic is now centrally located making change aforethought and less prone to mistakes.
- ETL processes may not be efficient to enforce rules if the process relies on SQL to enforce the rules. Database integrity constraints are more efficient than SQL.
- If the ETL process relies on a programming language, then it would be very difficult to maintain the meta data in the ETL process.
Let us see how the design would change when we partition the dimension table to enforce the granularity of the fact and aggregate table. If we partition the dimension table based on the level of the data in the hierarchy, then we can use these partitioned dimension tables to join with the appropriate fact table. Figure 2 illustrates how the data model will look in the partitioned dimension table approach.
In the new design, the basic fact table joins with the basic dimension tables and the aggregate fact table joins with the higher-level dimension tables. In this design, the granularity of the fact table is enforced by the relational constraint defined in the database. We can use SQL to create a view over all partitioned dimension tables using SQL union all to create a logical design that is similar to the basic POS data mart as described in Figure 1. We can also create views over the basic fact table and aggregated fact table to have one fact view with multiple grain, but this depends on how the data modeler wants to present the logical model.