Skip to content
IRI Logo
Solutions Products
  • Solutions
  • Products
  • Blog
  • BI
  • Big Data
  • DQ
  • ETL
  • IRI
    • IRI Business
    • IRI Workbench
  • Mask
  • MDM
    • Master Data Management
    • Metadata Management
  • Migrate
    • Data Migration
    • Sort Migration
  • Test Data
  • Transform
  • VLDB
  • VLOG
Leaves changing color

Introduction to Slowly Changing Dimensions (SCD)

  • by Susan Gegner

A dimension is a structure that categorizes a collection of information so that meaningful answers to questions regarding that information may be obtained. Dimensions in data management and data warehouses contain relatively static data; however, this dimensional data can change slowly over time and at unpredictable intervals. These types of dimensional data are known as Slowly Changing Dimensions (SCD).

In a data warehouse environment, a dimension table has a primary key that uniquely identifies each record and other pieces of information that are known as the dimensional data. The primary key will link to a fact table using this key. Therefore, it is important to have a model for updating and handling the current data and the historical data. There are recognized models that are known by their types:

  • Type 1 – This model involves overwriting the old current value with the new current value. No history is maintained.
  • Type 2 – The current and the historical records are kept and maintained in the same file or table.
  • Type 3 – The current data and historical data are kept in the same record. The user decides how much history is kept in the record.
  • Type 4 – In this model, the current data is maintained in two different tables; one for the current data and one that contains all the historical data.
  • Type 6 – This model is a hybrid of Type 1, Type 2, and Type3.

All the update methods for the different SCD models can be accomplished using the SortCL program in IRI CoSort or IRI Voracity. Voracity now also provides a single end-to-end job creation wizard to build these programs automatically; see below. Most types use a full outer join to match records from the master data source with records in the update source based on equating a unique key from each. Records with matches need to be updated in the master. Records in the update source that do not have a match need to be added to the master.

My articles on the different SCD types show the source or input data in files and shows the SortCL scripts that are used to update the master files. Tables can also be a source with these scripts.

In each example, I cover the steps to update a dimensional source that maintains product costs that change at irregular intervals. The key field or column is ProductCode. All source master files will contain the following data:

The master1.dat source contains:

ProductCodeCostStartDate
C123125.5020110228
F1122365.0020120101
G10119.2520110930
J245450.5020110430
S02298.7520110515

The update data will all have the same StartDate. The update source contains the following data:

ProductCodeCostStartDate
F1122425.0020120701
J245550.5020120701
M447101.7520120701
S022101.7520120701

SCD reporting is also discussed on the IRI web site here. See these articles specific to SCDs: Type 1, Type 2, Type 3, Type 4, and Type 6.

 

New SCD Graphical Wizard

Voracity users can now also use a purpose-built job creation wizard to report on SCD types 1, 2, 3, 4, and 6 in the IRI Workbench GUI, built on Eclipse. The wizard also supports the integration of sorting, expression evaluation, aggregation, new formatting, encrypting, and more.

SCD-all

After setting up the job and selecting the SCD type in the wizard, you specify the master data and update information. After that, you specify the mappings for the target, placing the fields in line with the matched source fields, and using the combo boxes as needed, work with current and historical field sets. After joining the specified information on the Join Sources page, the appropriate script is created and ready to diagram, execute, modify, or share.

The articles associated with the types above will be updated to reflect how the new SCD wizard in Voracity can auto-create these jobs for you.

Big Data Analytics in Use
The Importance of a Unified View of the Customer
categorical variable data management data warehouse dimensional data filtering grouping IRI CoSort IRI Voracity labeling SCD scripts slowly changing dimensions SortCL statistics

Related articles

Schema Data Class Search
Prepare and Protect Data for…
Connecting MariaDB and MySQL to…
Running IRI Software in a…
The IRI Platform
IRI Data Migration and Modernization
Using Ripcurrent for DB Structure…
Real-Time, Incremental Data Masking
Real-time Database Data Replication
Getting Started with IRI Ripcurrent
Mapping DB Data Types
1 COMMENT
  • Slowly Changing Dimension pada Data Warehouse’
    May 21, 2018 at 5:27 am
    Reply

    […] http://www.iri.com/blog/vldb-operations/introduction-to-slowly-changing-dimensions-scd/ […]

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Categories

  • Big Data 66
  • Business Intelligence (BI) 77
  • Data Masking/Protection 163
  • Data Quality (DQ) 41
  • Data Transformation 94
  • ETL 122
  • IRI 229
    • IRI Business 86
    • IRI Workbench 162
  • MDM 37
    • Master Data Management 12
    • Metadata Management 25
  • Migration 65
    • Data Migration 60
    • Sort Migration 6
  • Test Data 102
  • VLDB 78
  • VLOG 40

Tracking

© 2025 Innovative Routines International (IRI), Inc., All Rights Reserved | Contact