Integration Keys
The Enterprise-Wide Business Key (EWBK) is a core concept in Data Vault modeling. It allows the modeler to focus on defining entities and relationships based on the business process focused keys, rather than the technical keys from the operational (source) system. This allows easier cross-system integration in the Data Vault.
While defining an EWBK is the design goal for a Data Vault model, it is rare to find these directly available in existing source systems.
BimlFlex uses the concept Integration Key instead of Business Key for consistency across different Data Warehouse project types, including Dimensional Modeling.
The Integration Key is a core concept for BimlFlex. It is explained in more detail in the Concepts section.
The Integration Key concept is a pragmatic design feature in BimlFlex, which supports both the definition of a true Enterprise-Wide Business Key as well as making it easier to allow cross-system integration where this key cannot be easily defined.
Integration Keys prevents false-positive matches that may occur when relating data sets between systems, and allows for a later matching process using Same-As Links (SAL
).
Once Integration Keys are defined and available it is possible to match entities using rules or Master Data Management.
Watch the Webinar on building Integration Keys for Data Vault. The video contains an overview of the Business Key and introduces the Integration Key concept.
Surrogate Keys
By default, BimlFlex will 'hash' the Integration Key value to be used as a Surrogate Key column. The Surrogate Key will act as the Primary Key for a Hub and Link Objects, and be part of the Primary Key for Satellite Objects.
Because of this, the Surrogate Key will be used in Data Vault Foreign Key references if referential integrity is configured.
The Data Type and applied hashing algorithm are configurable through the BimlFlex Settings. The default used by BimlFlex is to use the SHA1
algorithm and store this value as Binary value.
Integration Key Settings for Data Vault
BimlFlex offers many Settings to manage how the integration key will be used in the solution. Choose a tab below to view relevant setting descriptions or examples for integration keys.
Select:
Choose a tab to view either setting descriptions or examples.
Description
Group | Section | Type | Setting | Description |
---|---|---|---|---|
Core | Defaults | Hash Null Value Replacement | The Null value replacement to be used when hashing | |
Core | Defaults | SSIS Hash Null Value Replacement | The Null value replacement to be used when hashing using the Varigence BimlFlex SSIS Custom component. Provides backwards compatibility when set to an empty string. For new implementations and SQL hash compatibility, set to the same value as used for HashNullValue | |
Core | Defaults | Integration Key To Upper | Should strings in the Integration Key be uppercased. This is recommended and allows the standard SQL Server case insensitive collation to ingest business keys from multiple sources using different casings to be added to Hubs and treated as the same key without issues | |
Core | Defaults | String Concatenator | The string value used in concatenating Integration Keys and Hash values. Defaults to " | |
Data Vault | Accelerator | Use Hash Keys | Should the Data Vault use Hash Keys or Natural Keys | |
Data Vault | Accelerator | Accelerate Link Satellite Keys | Should the accelerator add the Integration Key to the Link Satellites for effectiveness | |
Data Vault | Accelerator | Accelerate Hub Keys | Should the Accelerator add source key columns to the Hub in addition to the Integration Key | |
Data Vault | Accelerator | Accelerate Link Keys | Should the Accelerator add source key columns to the Link in addition to the Integration Key | |
Model | Settings | Infer Integration Key From | Where to infer the Integration Key from. Case sensitive options are "None", "PrimaryKey", "UniqueKey", "FirstColumn", "IdentityColumn" and "ColumnName::[NameOfColumn]". When specifying "ColumnName", a name needs to be added in the Import Metadata screen or specify "ColumnName::UID" to auto populate the column name field with "UID" | |
Model | Settings | Pad Integration Key | Number of Characters to pad the Integration Key to | |
Model | Settings | Append Integration Key | The string to append to Integration Keys | |
Model | Settings | Add Record Source To Integration Key | Import Metadata will add "@@rs" to Integration Keys if true | |
Model | Settings | Change References To Integration Key | Should Import Metadata add derived Integration Keys from source references or use source columns for references | |
Naming | Naming | Suffix Or Prefix Column | The "SuffixOrPrefixColumn" key defines the behavior when defining column names. Use Suffix or Prefix to define if the column identifiers are added after or before the column names in the solution. "S" for Suffix will generate "Entity_BK" "P" for Prefix Will generate "BK_Entity" |
Default Value
Group | Section | Type | Setting | Default Value |
---|---|---|---|---|
Core | Defaults | Hash Null Value Replacement | NVL | |
Core | Defaults | SSIS Hash Null Value Replacement | ||
Core | Defaults | Integration Key To Upper | true | |
Core | Defaults | String Concatenator | ~ | |
Data Vault | Accelerator | Use Hash Keys | true | |
Data Vault | Accelerator | Accelerate Link Satellite Keys | true | |
Data Vault | Accelerator | Accelerate Hub Keys | false | |
Data Vault | Accelerator | Accelerate Link Keys | false | |
Model | Settings | Infer Integration Key From | PrimaryKey | |
Model | Settings | Pad Integration Key | 100 | |
Model | Settings | Append Integration Key | BK | |
Model | Settings | Add Record Source To Integration Key | true | |
Model | Settings | Change References To Integration Key | true | |
Naming | Naming | Suffix Or Prefix Column | S |
The Model
Setting Group is only applied when first creating and Integration Key. These Settings do not already impact generated Integration Keys.
- The Accelerate Hub Keys setting will include the key parts in the Hub.
- BimlFlex provides an option to add the
@@rs
to all keys on import Add Record Source to Integration Key and on the Columns tab for the Objects in the app. - The column name is derived by combing the Object MODEL OVERRIDE NAME if specified otherwise the Object OBJECT NAME and the Append Integration Key setting either before or after the name depending on the Suffix or Prefix Column setting.
- Deriving the value of the concatenated key depends on two settings.
The Hash NULL Value Replacement that is used as a null replacement and the Integration Key To Upper* to specify if the derived value should be cast to
UPPER CASE
or left in its original case. When integrating case sensitive systems, this requires consideration. - As an example, the above will be implemented as follows
UPPER(COALESCE(CustomerID, 'NVL'))
- It is recommended to use a wide Unicode String datatype. Allow for new sources that might not adhere to the assumed datatype of the Hub Integration key.