Slowly Changing Fact (SCF)
In the world of data warehousing and dimensional modeling, particularly in the Kimball approach, the concept of Slowly Changing Fact (SCF) plays a crucial role. SCFs help track fact table rows that change gradually over time, such as product prices, commission rates, or employee salaries.
🔑 Key Differences: SCF vs. SCD
- SCF: Focuses on fact tables and measures that change slowly.
- SCD: Focuses on dimension tables and attributes that change over time (e.g., customer addresses).
🛒 Use Case: E-commerce Example
Imagine an e-commerce platform where order statuses and related fields (e.g., shipping address, delivery date) evolve over time. SCFs allow us to track these changes for accurate historical analysis. For instance:
| OrderID | OrderStatus | ShippingAddress | EstimatedDeliveryDate | EffectiveDate | Active_Flag |
|---|---|---|---|---|---|
| 201 | Placed | 456 Elm St, NY | 2023-10-15 | 2023-10-01 | False |
| 201 | Placed | 123 Main St, NY | 2023-10-20 | 2023-10-03 | False |
| 201 | Shipped | 123 Main St, NY | 2023-10-20 | 2023-10-10 | False |
| 201 | Delivered | 123 Main St, NY | 2023-10-20 | 2023-10-20 | True |
In this case, the order status for OrderID 201 evolves from Placed to Shipped and finally to Delivered. Each change is recorded as a new row in the fact table. The EffectiveDate column indicates when the change occurred, and the Active_Flag column identifies the current state of the order. For example:
- On
2023-10-01, the order was placed with the shipping address456 Elm St, NY. - On
2023-10-03, the shipping address was updated to123 Main St, NY, while the status remainedPlaced. This impacted the estimated delivery date, which changed to2023-10-23. - On
2023-10-10, the order status changed toShipped. - On
2023-10-20, the order was marked asDelivered, and this row is marked as active (Active_Flag = True).
This approach ensures all historical changes are preserved for analyzing order trends and lifecycle.
⚠️ Limitations of SCF
- Increased storage requirements.
- Complex queries due to multiple rows for the same entity.
- Performance overhead during ETL processes.
✅ Best Practices for SCF
- Use effective start and end dates to track row validity.
- Partition fact tables by date for better performance.
- Archive older rows to manage storage.
- Document SCF implementation for team clarity.
By implementing SCFs effectively, organizations can unlock valuable insights into trends and changes over time while maintaining historical accuracy. 💡