Delta Vault

Delta Vault is a Data Vault model that runs on data lake infrastructure configured for Delta Lake.

What is Delta Vault?

Delta Vault combines Data Vault modeling principles (Hubs, Links, Satellites) with Delta Lake's ACID transaction support. Instead of loading into SQL Server tables, the Data Vault structures are materialized as Delta tables in a lakehouse (Databricks or Fabric Lakehouse).
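Hubs, Links, and Satellites in a Data Vault are conventionally keyed by hash keys derived from business keys, and that convention carries over unchanged when the structures are materialized as Delta tables. A minimal sketch of that hashing step in plain Python (the function name and the MD5 choice are illustrative assumptions, not a documented part of Delta Vault):

```python
import hashlib

def hub_hash_key(*business_keys: str) -> str:
    """Derive a deterministic hash key from one or more business keys,
    following the common Data Vault convention of trimming, upper-casing,
    and delimiting the keys before hashing."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# The same business key always yields the same hash key, so hub loads
# stay idempotent across incremental runs.
print(hub_hash_key("CUST-001"))
```

Because the key is computed, not assigned, parallel loaders on different Delta tables can derive the same hub key independently without coordination.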

Why Use Delta Vault?

| Benefit | Description |
| --- | --- |
| Scalability | Delta Lake handles petabyte-scale datasets that would strain traditional SQL Server Data Vaults |
| Cost | Object storage (Azure Data Lake Storage Gen2) is significantly cheaper per TB than SQL Server compute |
| ACID guarantees | Delta Lake provides transaction support, unlike raw Parquet, so Data Vault temporal patterns (satellite effectivity) work correctly |
| Schema evolution | Delta Lake supports schema evolution, making it easier to add satellite attributes over time |
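The ACID row above mentions satellite effectivity: when a new version of a satellite record arrives, the currently open row must be end-dated and the new row appended in one transaction. A sketch of that pattern in plain Python (the class and function names are illustrative; this models the logic only, independent of any Delta Vault API):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class SatelliteRow:
    hub_key: str
    attributes: dict
    load_date: date
    end_date: Optional[date] = None  # None = currently effective row

def apply_effectivity(rows: list, new_row: SatelliteRow) -> list:
    """End-date the open row for the same hub key, then append the new
    version. In Delta Lake both steps commit atomically, which is why the
    table above calls out ACID transactions as a prerequisite."""
    for row in rows:
        if row.hub_key == new_row.hub_key and row.end_date is None:
            row.end_date = new_row.load_date
    return rows + [new_row]

history = apply_effectivity([], SatelliteRow("h1", {"city": "Oslo"}, date(2024, 1, 1)))
history = apply_effectivity(history, SatelliteRow("h1", {"city": "Bergen"}, date(2024, 6, 1)))
```

After the second load, the first row is end-dated at `2024-06-01` and the second row is the open, current version.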

How to Configure

Delta Vault uses the Databricks integration template with connections configured for Delta Lake targets. The key configuration differences from a SQL Server Data Vault are:

  1. Target Connection System Type: set to Databricks instead of SQL Server
  2. Integration Template: use the DBR (Databricks) template on the project
  3. Delta Lake enabled: ensure the target connection has its Delta Lake settings configured
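Pulled together, the three settings above might look roughly like this in a project/connection configuration file. The layout and key names here are hypothetical, shown only to make the relationships concrete; consult the product's configuration reference for the actual schema:

```yaml
# Hypothetical sketch — key names are illustrative, not the product's actual schema.
project:
  integration_template: DBR          # 2. Databricks integration template
target_connection:
  system_type: Databricks            # 1. instead of SQL Server
  delta_lake:
    enabled: true                    # 3. Delta Lake settings on the target connection
    storage: abfss://lake@example.dfs.core.windows.net/deltavault  # example path
```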

Limitations

  • Delta Vault requires Databricks or Fabric Lakehouse runtime — it cannot target plain Parquet or CSV files
  • Some Data Vault patterns (Bridge tables, PIT tables) may have different performance characteristics on Delta Lake compared to SQL Server