Extracting All Tables to Raw Files
Why Generate the Biml From SQL
Extract is the boring part of an ETL project, especially when every table needs the same shape of dataflow into a raw file destination. Instead of writing the Biml by hand, T-SQL queries against 'INFORMATION_SCHEMA' can emit the connections, file formats, and dataflows directly. Pasting the query output into a Biml template produces a single package with one dataflow per table.
Target Shape of the Biml File
Each generated package has three sections that need to be filled in:
- A 'Connections' section with one 'FileConnection' per table, named after the schema and table
- A 'FileFormats' section with one 'RawFileFormat' per table, listing the columns
- A 'Packages' section with one 'Dataflow' per table that pairs an OLE DB source with a raw file destination
The skeleton looks like this:
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="Source" ConnectionString="Provider=SQLOLEDB;Data Source=localhost\SQL2019;Integrated Security=SSPI;Initial Catalog=OperationsDW" />
<!-- generated FileConnection list goes here -->
</Connections>
<FileFormats>
<!-- generated RawFileFormat list goes here -->
</FileFormats>
<Packages>
<Package Name="ExportAllTables" ConstraintMode="Parallel" ProtectionLevel="DontSaveSensitive">
<Tasks>
<!-- generated Dataflow list goes here -->
</Tasks>
</Package>
</Packages>
</Biml>
Generating the File Connections
A query against 'INFORMATION_SCHEMA.TABLES' produces one 'FileConnection' element per base table, with the file path built from the schema and table name:
DECLARE @output_directory nvarchar(50) = 'F:\\DB_DUMP';
SELECT
'<FileConnection Name="'
+ TABLE_SCHEMA + '_' + TABLE_NAME
+ '" FilePath="'
+ @output_directory + '\\'
+ TABLE_SCHEMA + '_' + TABLE_NAME + '.raw" />'
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE';
Paste the result into the 'Connections' node.
Generating the File Formats
Each 'RawFileFormat' needs the list of columns to write. The raw file destination cannot accept some types, including geography, xml, and unbounded varbinary or nvarchar, so those types are excluded from the column list. Otherwise the source columns map to the destination by name and the package fails validation.
The query below uses 'FOR XML PATH' nested twice to produce one 'RawFileFormat' per table with its column list as a child node:
SELECT
TABLE_SCHEMA + '_' + TABLE_NAME AS "@Name",
(
SELECT COLUMN_NAME AS "@Name"
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = t.TABLE_NAME
AND TABLE_SCHEMA = t.TABLE_SCHEMA
AND DATA_TYPE <> 'geography'
AND NOT (DATA_TYPE = 'nvarchar' AND CHARACTER_MAXIMUM_LENGTH = -1)
AND NOT (DATA_TYPE = 'varbinary' AND CHARACTER_MAXIMUM_LENGTH = -1)
AND DATA_TYPE <> 'xml'
FOR XML PATH('Column'), TYPE, ROOT('Columns')
)
FROM INFORMATION_SCHEMA.TABLES t
WHERE t.TABLE_TYPE = 'BASE TABLE'
FOR XML PATH('RawFileFormat'), TYPE, ROOT('FileFormats');
Click the result XML to open the full document and paste it over the 'FileFormats' node.
Generating the Dataflows
The final query emits one 'Dataflow' element per table. The 'FOR XML PATH' column aliases use forward-slash syntax to nest child elements under 'Transformations':
SELECT
'EXTRACT_' + TABLE_SCHEMA + '_' + TABLE_NAME AS "@Name",
'src_' + TABLE_SCHEMA + '_' + TABLE_NAME AS "Transformations/OleDbSource/@Name",
'Source' AS "Transformations/OleDbSource/@ConnectionName",
'Select * from ' + TABLE_SCHEMA + '.' + TABLE_NAME AS "Transformations/OleDbSource/DirectInput",
'dest_' + TABLE_SCHEMA + '_' + TABLE_NAME AS "Transformations/RawFileDestination/@Name",
TABLE_SCHEMA + '_' + TABLE_NAME AS "Transformations/RawFileDestination/@RawFileFormatName",
'CreateAlways' AS "Transformations/RawFileDestination/@WriteOption",
TABLE_SCHEMA + '_' + TABLE_NAME AS "Transformations/RawFileDestination/FileOutput/@ConnectionName"
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_SCHEMA, TABLE_NAME
FOR XML PATH('Dataflow'), ROOT('Tasks');
The result is the entire 'Tasks' node ready to paste into the package.
Shape of a Single Dataflow
Each generated dataflow is identical in structure:
<Dataflow Name="EXTRACT_Person_Address">
<Transformations>
<OleDbSource Name="src_Person_Address" ConnectionName="Source">
<DirectInput>Select * from Person.Address</DirectInput>
</OleDbSource>
<RawFileDestination Name="dest_Person_Address" RawFileFormatName="Person_Address" WriteOption="CreateAlways">
<FileOutput ConnectionName="Person_Address" />
</RawFileDestination>
</Transformations>
</Dataflow>
Putting It Together
Run the three queries, paste each result into the matching section of the Biml skeleton, then compile. The output is a single SSIS package with one dataflow per table, each writing to its own raw file. The same approach extends to other source platforms by changing the catalog views and the connection string; the Biml shape stays the same.