Skip to main content

Defining the Data Flow in Biml

Why This Pattern Matters

Biml describes SSIS data flows declaratively. For straight-line flows the compiler can infer how to connect components, but the moment a Lookup or Conditional Split needs a specific output, the connections must be made explicit. This walkthrough shows how default outputs work, when they are not enough, and how to use 'InputPath' to wire a non-default path.

A Target Table

The examples send rows into a destination table named 'AccountDimStaging' in a database named 'StagingDB'. The Biml below describes the table and can be created using the table-creation pattern from the earlier walkthrough.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="SourceConn" ConnectionString="Provider=SQLNCLI11;Server=.;Initial Catalog=SourceWarehouse;Integrated Security=SSPI;" />
<OleDbConnection Name="TargetConn" ConnectionString="Provider=SQLNCLI11;Server=.;Initial Catalog=StagingDB;Integrated Security=SSPI;" />
</Connections>
<Tables>
<Table Name="AccountDimStaging" ConnectionName="TargetConn">
<Columns>
<Column Name="AccountKey" />
<Column Name="ParentAccountKey" IsNullable="true" />
<Column Name="AccountAltKey" IsNullable="true" />
<Column Name="ParentAccountAltKey" IsNullable="true" />
<Column Name="AccountDescription" DataType="String" Length="50" IsNullable="true" />
<Column Name="AccountType" DataType="String" Length="50" IsNullable="true" />
<Column Name="Operator" DataType="String" Length="50" IsNullable="true" />
<Column Name="CustomMembers" DataType="String" Length="300" IsNullable="true" />
<Column Name="ValueType" DataType="String" Length="50" IsNullable="true" />
<Column Name="CustomMemberOptions" DataType="String" Length="200" IsNullable="true" />
</Columns>
</Table>
</Tables>
</Biml>

A Simple Dataflow

For a one-source one-destination flow the compiler does the wiring. No data path elements are needed.

<Dataflow Name="Dataflow 1">
<Transformations>
<OleDbSource Name="Source" ConnectionName="SourceConn">
<DirectInput>SELECT * FROM dbo.AccountDim</DirectInput>
</OleDbSource>
<OleDbDestination Name="Target" ConnectionName="TargetConn">
<ExternalTableOutput Table="dbo.AccountDimStaging" />
</OleDbDestination>
</Transformations>
</Dataflow>

The compiler infers that the source's default output should be connected to the destination's input.

Default Outputs

Most components have a default output named 'Output'. When no other information is given, the compiler links the default output of one transformation to the input of the next transformation in the Biml. That is fine for simple flows, but breaks down once components have multiple meaningful outputs.

The flow below adds a Lookup and a Conditional Split. With no path overrides, the compiler picks the default output of each, which produces the wrong result: the Lookup's 'Match' output is used when the design wants the 'NoMatch' output, and the Conditional Split's default output is used when the design wants the 'High ID' output.

<Dataflow Name="Dataflow 2">
<Transformations>
<OleDbSource Name="Source" ConnectionName="SourceConn">
<DirectInput>SELECT * FROM dbo.AccountDim</DirectInput>
</OleDbSource>
<Lookup Name="Check For Existing" OleDbConnectionName="TargetConn" NoMatchBehavior="RedirectRowsToNoMatchOutput">
<DirectInput>SELECT AccountKey FROM dbo.AccountDimStaging</DirectInput>
<Inputs>
<Column SourceColumn="AccountKey" TargetColumn="AccountKey" />
</Inputs>
</Lookup>
<ConditionalSplit Name="Test ID Range">
<OutputPaths>
<OutputPath Name="High ID">
<Expression>AccountKey >= 100</Expression>
</OutputPath>
</OutputPaths>
</ConditionalSplit>
<OleDbDestination Name="Target" ConnectionName="TargetConn">
<ExternalTableOutput Table="dbo.AccountDimStaging" />
</OleDbDestination>
</Transformations>
</Dataflow>

Explicit InputPath

Adding an 'InputPath' element to a downstream component tells the compiler which output should feed it. The example below routes the Lookup's 'NoMatch' output into the Conditional Split, then routes the Conditional Split's 'High ID' output into the destination.

<Dataflow Name="Dataflow 3">
<Transformations>
<OleDbSource Name="Source" ConnectionName="SourceConn">
<DirectInput>SELECT * FROM dbo.AccountDim</DirectInput>
</OleDbSource>
<Lookup Name="Check For Existing" OleDbConnectionName="TargetConn" NoMatchBehavior="RedirectRowsToNoMatchOutput">
<DirectInput>SELECT AccountKey FROM dbo.AccountDimStaging</DirectInput>
<Inputs>
<Column SourceColumn="AccountKey" TargetColumn="AccountKey" />
</Inputs>
</Lookup>
<ConditionalSplit Name="Test ID Range">
<InputPath OutputPathName="Check For Existing.NoMatch" />
<OutputPaths>
<OutputPath Name="High ID">
<Expression>AccountKey >= 100</Expression>
</OutputPath>
</OutputPaths>
</ConditionalSplit>
<OleDbDestination Name="Target" ConnectionName="TargetConn">
<InputPath OutputPathName="Test ID Range.High ID" />
<ExternalTableOutput Table="dbo.AccountDimStaging" />
</OleDbDestination>
</Transformations>
</Dataflow>

Reference: Outputs by Component

A few more facts about data paths are worth keeping in mind:

  • Most components have a default output named 'Output' and an 'Error' output if errors are supported.
  • 'Multicast' has no default output, so the path mapping must always be explicit.
  • 'UnionAll', 'Merge', and 'MergeJoin' need explicit mappings because they accept multiple inputs.
  • The Slowly Changing Dimension transformation has multiple outputs: 'New' (the default), 'Unchanged', 'FixedAttribute', 'ChangingAttribute', 'HistoricalAttribute', and 'InferredMember'.
  • 'PercentageSampling' and 'RowSampling' have outputs named 'Selected' (the default) and 'Unselected'.

When in doubt, set 'InputPath' explicitly. The extra Biml is small, and the resulting dataflow leaves no ambiguity for the compiler or the next reader.