Skip to main content

SSAS Processing Patterns with Biml

Why Generate SSAS Processing Packages

When an Analysis Services database grows past the point where a single full process fits the maintenance window, processing must be broken apart. Dimensions are processed with ProcessUpdate so existing fact data is preserved, only the most recently changed fact partitions are processed with ProcessData, and cube indexes are rebuilt at the end. Writing those packages by hand for every measure group is tedious. A Biml script that reads the database structure through AMO can generate the entire processing package in seconds.

Method 1: One Processing Task per Object Type

The first pattern places one AnalysisServicesProcessing task for dimensions, one for partition data, and one for cube indexes, each configured to run its child operations in parallel. The script connects to the SSAS server through AMO, walks the dimensions, cubes, measure groups, and partitions, and emits the matching processing configurations.

<#@ assembly name="C:\Program Files (x86)\Microsoft SQL Server\110\SDK\Assemblies\Microsoft.AnalysisServices.dll" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="Microsoft.AnalysisServices" #>
<#@ template language="C#" #>

<#
String olapServer = @".\OlapInstance";
String olapDatabase = "RetailMart";
int recentPartitions = 2;
#>

<#+
Dictionary<string, string> DimensionKeys(string Server, string Database)
{
Dictionary<string, string> result = new Dictionary<string, string>();
Server amoServer = new Server();
amoServer.Connect(Server);
Database amoDb = amoServer.Databases[Database];
foreach (Dimension dim in amoDb.Dimensions)
result.Add(dim.ID, dim.Name);
amoServer.Disconnect();
amoServer.Dispose();
return result;
}

public static List<PartitionInfo> Partitions(string ServerName, string DatabaseName, string CubeId, string MeasureGroupId)
{
var list = new List<PartitionInfo>();
Server amoServer = new Server();
amoServer.Connect(ServerName);
Database amoDb = amoServer.Databases[DatabaseName];
Cube cube = amoDb.Cubes[CubeId];
MeasureGroup mg = cube.MeasureGroups[MeasureGroupId];
foreach (Partition p in mg.Partitions)
list.Add(new PartitionInfo(p.Name, p.ID, p.LastProcessed));
amoServer.Disconnect();
amoServer.Dispose();
list.Sort((y, x) => x.ProcessDate.CompareTo(y.ProcessDate));
return list;
}

public struct PartitionInfo
{
public readonly string PartitionName;
public readonly string PartitionId;
public readonly DateTime ProcessDate;
public PartitionInfo(string name, string id, DateTime processed)
{
PartitionName = name;
PartitionId = id;
ProcessDate = processed;
}
}
#>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<AnalysisServicesConnection Name="olap" ConnectionString="Data Source=<#= olapServer #>;PROVIDER=MSOLAP;Impersonation Level=Impersonate;" Server="<#= olapServer #>" />
</Connections>
<Packages>
<Package Name="OlapProcess_All" ConstraintMode="Linear" ProtectionLevel="EncryptAllWithUserKey">
<Tasks>
<AnalysisServicesProcessing Name="ProcessDimensionsUpdate" ConnectionName="olap" ProcessingOrder="Parallel">
<ProcessingConfigurations>
<#
Dictionary<string, string> dims = DimensionKeys("Data Source=" + olapServer, olapDatabase);
foreach (string dimId in dims.Keys) { #>
<DimensionProcessingConfiguration DatabaseId="<#= olapDatabase #>" ProcessingOption="ProcessUpdate" DimensionId="<#= dimId #>" />
<# } #>
</ProcessingConfigurations>
</AnalysisServicesProcessing>
<AnalysisServicesProcessing Name="ProcessPartitionData" ConnectionName="olap" ProcessingOrder="Parallel">
<ProcessingConfigurations>
<#
Server amoServer = new Server();
amoServer.Connect(olapServer);
Database amoDb = amoServer.Databases[olapDatabase];
foreach (Cube cube in amoDb.Cubes) {
foreach (MeasureGroup mg in cube.MeasureGroups) {
List<PartitionInfo> recent = Partitions(olapServer, olapDatabase, cube.ID, mg.ID);
for (int i = 0; i < recentPartitions && i < recent.Count; i++) { #>
<PartitionProcessingConfiguration DatabaseId="<#= amoDb.ID #>" CubeId="<#= cube.ID #>" MeasureGroupId="<#= mg.ID #>" PartitionId="<#= recent[i].PartitionId #>" ProcessingOption="ProcessData" />
<# }
}
} #>
</ProcessingConfigurations>
</AnalysisServicesProcessing>
<AnalysisServicesProcessing Name="ProcessCubeIndexes" ConnectionName="olap" ProcessingOrder="Parallel">
<ProcessingConfigurations>
<#
foreach (Cube cube in amoDb.Cubes) { #>
<CubeProcessingConfiguration ProcessingOption="ProcessIndexes" DatabaseID="<#= amoDb.ID #>" CubeId="<#= cube.ID #>" />
<# }
amoServer.Disconnect();
#>
</ProcessingConfigurations>
</AnalysisServicesProcessing>
</Tasks>
</Package>
</Packages>
</Biml>

A note of caution: the ProcessingOrder set to Parallel may not always materialize on every BIDS or Biml version, so verify the generated package before deploying it.

Method 2: Sequenced Containers per Measure Group

The second pattern wraps each measure group in its own container with linear constraints between partition tasks. This guarantees that recent partitions are processed in their original order so that when a new partition is added the oldest one drops off the recency window. SSIS containers also give parallelism per cube and per measure group.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<AnalysisServicesConnection Name="olap" ConnectionString="Data Source=<#= olapServer #>;PROVIDER=MSOLAP;Impersonation Level=Impersonate;" Server="<#= olapServer #>" />
</Connections>
<Packages>
<Package Name="OlapProcess_<#= olapDatabase #>" ConstraintMode="Linear">
<Tasks>
<AnalysisServicesProcessing Name="ProcessDimensionsUpdate" ConnectionName="olap" ProcessingOrder="Parallel">
<ProcessingConfigurations>
<#
Dictionary<string, string> dims = DimensionKeys("Data Source=" + olapServer, olapDatabase);
foreach (string dimId in dims.Keys) { #>
<DimensionProcessingConfiguration DatabaseId="<#= olapDatabase #>" ProcessingOption="ProcessUpdate" DimensionId="<#= dimId #>" />
<# } #>
</ProcessingConfigurations>
</AnalysisServicesProcessing>
<Container Name="FactProcessing" ConstraintMode="Parallel">
<Tasks>
<#
Server amoServer = new Server();
amoServer.Connect(olapServer);
Database amoDb = amoServer.Databases[olapDatabase];
foreach (Cube cube in amoDb.Cubes) { #>
<Container Name="Cube_<#= cube.Name #>" ConstraintMode="Parallel">
<Tasks>
<# foreach (MeasureGroup mg in cube.MeasureGroups) { #>
<Container Name="MG_<#= mg.Name #>" ConstraintMode="Linear">
<Tasks>
<#
List<PartitionInfo> recent = Partitions(olapServer, olapDatabase, cube.ID, mg.ID, recentPartitions);
foreach (PartitionInfo p in recent) { #>
<AnalysisServicesProcessing Name="Part_<#= p.PartitionName #>" ConnectionName="olap">
<ProcessingConfigurations>
<PartitionProcessingConfiguration CubeId="<#= cube.ID #>" DatabaseId="<#= amoDb.ID #>" MeasureGroupId="<#= mg.ID #>" PartitionId="<#= p.PartitionId #>" ProcessingOption="ProcessData" />
</ProcessingConfigurations>
</AnalysisServicesProcessing>
<# } #>
</Tasks>
</Container>
<# } #>
</Tasks>
</Container>
<# } #>
</Tasks>
</Container>
</Tasks>
</Package>
</Packages>
</Biml>

Wrap-Up

Building a processing schedule by hand once might be faster, but every additional environment, every renamed measure group, and every added partition pays back the upfront automation cost. Once the AMO walk and the Biml template are in place, regenerating an entire processing package is a ten-second task.