Export Data to Flat Files with Biml
Why Automate the Flat File Side
Generating one flat file per table by hand is repetitive: a format definition, a connection, and a data flow for every source. The same metadata that drives an SSIS staging load can drive the export side too. The pattern below pulls schema from a source database, builds one FlatFileFormat per table with annotations for the source SQL, then creates connections and an export package automatically.
Environment Definition
A static Biml file points the project at the source database. Other tiered files reuse this connection by name:
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="OpsSource" ConnectionString="Data Source=localhost;Initial Catalog=OperationsDB;Provider=SQLNCLI11;Integrated Security=SSPI" />
</Connections>
</Biml>
Building Flat File Formats from Metadata
The next file loops through every table returned by GetDatabaseSchema and produces a FlatFileFormat. Columns of type Binary or Object are skipped because flat files cannot represent them. Every remaining column is declared as a string of maximum length, since the export does not care about strong typing on the file. Three Annotations are attached to each format: the fully scoped source table name, a comma separated column list, and the list of primary key columns (used later when an ORDER BY is needed):
<#@ template language="VB" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<FileFormats>
<#
Dim sourceConnection As AstDbConnectionNode = RootNode.Connections("OpsSource")
Dim delimiter As String
Dim columnList As String
Dim keyColumnList As String
Dim sourceTables As ImportResults = sourceConnection.GetDatabaseSchema()
For Each tbl As AstTableNode In sourceTables.TableNodes
columnList = ""
keyColumnList = ""
#>
<FlatFileFormat Name="<#= tbl.Schema #>_<#= tbl.Name #>" ColumnNamesInFirstDataRow="true">
<Columns>
<#
For Each col As AstTableColumnBaseNode In tbl.Columns.Where(Function(c) Not (c.DataType = System.Data.DbType.Binary Or c.DataType = System.Data.DbType.Object))
If columnList <> "" Then columnList = columnList + ", "
columnList = columnList + "[" + col.Name + "]"
If col.IsUsedInPrimaryKey = True Then
If keyColumnList <> "" Then keyColumnList = keyColumnList + ", "
keyColumnList = keyColumnList + "[" + col.Name + "]"
End If
delimiter = ","
If col.Name = tbl.Columns.Last.Name Then delimiter = "CRLF"
#>
<Column Name="<#= col.Name #>" DataType="String" Length="-1" Delimiter="<#= delimiter #>" />
<#
Next
If keyColumnList = "" Then keyColumnList = "1"
#>
</Columns>
<Annotations>
<Annotation AnnotationType="Tag" Tag="SourceTable"><#= tbl.ScopedName #></Annotation>
<Annotation AnnotationType="Tag" Tag="SourceColumns"><#= columnList #></Annotation>
<Annotation AnnotationType="Tag" Tag="KeyColumns"><#= keyColumnList #></Annotation>
</Annotations>
</FlatFileFormat>
<# Next #>
</FileFormats>
</Biml>
The annotations are the bridge between metadata and the eventual export logic: any downstream Biml script can ask a flat file format for the source table or column list without recomputing it from schema.
Connections per Format
A separate tier loops through every flat file format defined so far and creates a matching connection with an output path. Each connection borrows the format name for both its own name and the file name:
<#@ template language="VB" tier="2" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<# For Each ff As AstFlatFileFormatNode In RootNode.FlatFileFormats #>
<FlatFileConnection Name="<#= ff.Name #>" FileFormat="<#= ff #>" FilePath="C:\Exports\<#= ff.Name #>.csv" />
<# Next #>
</Connections>
</Biml>
The Export Package
A final tier emits a single package that contains one container per format, each holding a data flow that reads from the source using the SourceColumns and SourceTable annotations and writes to the matching flat file destination. Because the containers sit at the same level and the package uses default parallel constraint mode, the exports run in parallel:
<#@ template language="VB" tier="3" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<Package Name="Export_FlatFiles">
<Tasks>
<# For Each ff As AstFlatFileFormatNode In RootNode.FlatFileFormats #>
<Container Name="Export <#= ff #>">
<Tasks>
<Dataflow Name="DF <#= ff #>">
<Transformations>
<OleDbSource Name="SRC <#= ff #>" ConnectionName="OpsSource">
<DirectInput>SELECT <#= ff.GetTag("SourceColumns") #> FROM <#= ff.GetTag("SourceTable") #></DirectInput>
</OleDbSource>
<FlatFileDestination ConnectionName="<#= ff #>" Name="<#= ff #>" />
</Transformations>
</Dataflow>
</Tasks>
</Container>
<# Next #>
</Tasks>
</Package>
</Packages>
</Biml>
Compiling these tiered files produces one CSV per table in the source database. The annotations stay attached to the file format nodes, so the same metadata can be reused by other scripts, including the companion pattern that splits large tables across multiple files.