Data Flow Optimization Checklist


I put this together for an SSIS performance presentation I did a while ago, and thought it warranted its own post.

In addition to looking for appropriate Design Patterns, I have a checklist of things to look for when trying to optimize an SSIS Data Flow. This list is by no means extensive, but it contains some of the more common causes of performance issues in the packages I’ve worked with.

OLE DB command operates on a row by row basis. Stage the data and perform the operation in a batch mode using the Execute SQL Task instead.

Learn MERGE, and love it.

Sometimes you do. Most of the time you don’t. If you do need them (i.e. you are loading from a flat file, and need to do a merge), considering maxing out the DefaultBufferMaxRows setting so you can sort more data in memory (sort will keep at most 5 active buffers in memory before it starts swapping to disk).

  • Lookup Transforms
    • Can the lookup be avoided?
    • Are you using the right cache mode?
    • Should you use a Cache Connection Manager?

Lots of related lookup posts can be found here.


Have fun optimizing!

New SSIS Whitepapers from the SQLCAT Team

Two new SSIS SQL Server Technical Articles written by the SQLCAT team have been posted to the MSDN Library. They both focus on hybrid/cloud data movement, but they also contain design patterns and guidance that applies to on-premises data flows as well. The SSIS Operational and Tuning Guide has a great section on designing a package for restartability and performance tuning that all SSIS users will find useful.

SSIS Operational and Tuning Guide

SQL Server Integration Services (SSIS) can be used effectively as a tool for moving data to and from Windows Azure (WA) SQL Database, as part of the total extract, transform, and load (ETL) solution and as part of the data movement solution. SSIS can be used effectively to move data between sources and destinations in the cloud, and in a hybrid scenario between the cloud and on-premise. This paper outlines SSIS best practices for cloud sources and destinations, discusses project planning for SSIS projects whether the project is all in the cloud or involves hybrid data moves, and walks through an example of maximizing performance on a hybrid move by scaling out the data movement.


SSIS for Azure and Hybrid Data Movement

SQL Server Integration Services (SSIS) can be used effectively as a tool for moving data to and from Windows Azure SQL Database, as part of the total extract, transform, and load (ETL) solution and as part of the data movement solution. SSIS can be used effectively to move data between sources and destinations in the cloud, and in a hybrid scenario between the cloud and on-premise. This paper outlines best practices for using SSIS for cloud sources and destinations and for project planning for SSIS projects to be used with Azure or hybrid data moves, and gives an example of maximizing performance on a hybrid move by scaling out the data movement.


Did you know that the DQS team also published a whitepaper on using DQS with AlwaysOn? It’s another interesting read.

PASS Summit 2012 – SSIS Pre-Conference Event

Well, I guess Andy, Tim & I did a good job at PASS last year, as we’ve been invited back to present another SSIS pre-conference event at the 2012 PASS Summit. This time we’ll be joined by the co-authors of our SSIS Design Patterns book, Jessica Moss and Michelle Ufford.


SSIS Design Patterns
Speaker(s): Andy Leonard, Jessica Moss, Matt Masson, Michelle Ufford, Tim Mitchell
In this full-day session, the “SSIS Design Patterns” (Apress, 2012) author team will describe and demonstrate patterns for package execution, package logging, loading flat file and XML sources, loading the cloud, dynamic package generation, SSIS Frameworks, data warehouse ETL, and data flow performance.


We had a great audience and turnout last year (around 120 people if I recall correctly…). This year should be even better…

Should I use the SQL Server Destination?

Short answer:


Long answer:



I was told that some people would like an explanation instead of blindly trusting my advice (weird), so here goes:

In SQL 2005 RTM, the SQL Server Destination was the fastest way to load data into SQL Server with SSIS. Somewhere along the way (in the 2005 SP2 / 2008 timeframe), the performance of the OLE DB Destination (using Fast Load) surpassed the SQL Server Destination in many scenarios. This was due to a couple of different reasons:

  • Multiple improvements were made to the SQL Native Client OLE DB provider
  • Additional checks and constraints were added to the shared memory based loading mechanism used by the SQL Server Destination

I saw results from some performance tests we ran shortly after the 2008 release and was surprised to see that the SQL Server Destination could be 10% faster to 10% slower than loading the same data with the OLE DB Destination. The performance would vary based on the system the tests were run on, the data types involved, and the overall row width. While it does appear that the SQL Server Destination can be faster, it’s just as likely that it will be slower. In most cases, the performance is about the same.

My recommendation is that if you need every bit of performance (a 10% perf increase on a 10 hour load can be significant), try out the SQL Server Destination to see how it works for you. However – keep in mind the following limitations of the SQL Server Destination:

Given these limitations, I recommend using the OLE DB Destination even if you are seeing a performance increase with the SQL Server Destination.

SSIS Performance Design Patterns

Here are the slides (all 177 of them) for the SSIS Performance Design Patterns talk that I’ve delivered at three different pre-conference events over the past month. Each presentation was slightly different, but the core content remained the same. These events included:

The version posted here is from the SQL Bits event.The recording from this event be posted online in the next few weeks – be sure to check the site!


Data Quality Services Performance Best Practices Guide – now available!

The Data Quality Services (DQS) Performance Best Practices Guide (or DQSPDPG for short) is now available on the Microsoft Download Center. It covers hardware and setup considerations, how matching policies will impact your performance, and some best practices when using the DQS Cleansing transform in SSIS. I was also happy to see a section in there about the impact of using advanced DQS functionality, such as Composite Domains, Term-Based Relations, and Reference Data Services. A must read for all DQS users…

Resources for SSIS Performance Best Practices

Simple post today.

Here are a number of links to resources that I reference in my Performance Design Patterns presentation.


You can see me for SSIS Performance Design Pattern training at SQL Saturday #105 in Dublin, SQL Bits X in London, and SQL Saturday #111 in Atlanta.

Queries for Data Flow Component Performance in SQL 2012

The built-in SSIS Catalog reports in SSMS show a breakdown of task and component performance of a package execution when you run with a logging level of “Performance”.

SSIS Catalog package performance report

For a more detailed or customized view, you can query the catalog tables directly. The key table to get data flow performance information is [catalog].[execution_component_phases]. Here are two sample queries – the first returns the total time spent on each component (similar to what is displayed in the performance report), and the second returns the times broken down by phase.


Running these queries returns something like this:


Some notes:

  1. You need to run the package with a Performance logging level to capture this data
  2. You can set a default logging level for the Catalog
  3. The query above determines the execution_id for the latest run – you will need to change this value to look at the results for specific package executions

How Can I Improve the Performance of my Destination Component

Question: I’ve created a custom destination component, but I’m not happy with the performance. I tried using multiple threads, but the pipeline buffer doesn’t allow free traversal. What can I do?


I’d start by profiling the component to make sure you identify the correct bottlenecks. The most common source of problems is accessing the COM interfaces at runtime – make sure that you keep a reference to any object retrieved through IDTSxxx100 members to avoid repeated COM interop calls. It’s a best practice to cache everything you need in the pre-execute phase.

If you’ve determined that the bottleneck is at the destination, and that multiple threads will improve performance, then you’ll need to make a copy of the data in the buffer before passing it to your destination. I did something similar in the old Data Services Destination sample on codeplex (main/DataServices/Src/Destination/Component.cs). The logic was:

  1. Loop through the buffer
  2. Copy row data into local object
  3. Add batch of data to a work item threadpool for insertion (or use the new Task Parallel Library)
  4. When loop completes, wait for all threads/work items to complete before returning from ProcessInput

In the sample I linked to above, a “batch” was a single row (as the destination only supported RBAR at the time), but you can configure this to be whatever batch size makes sense for your destination.

Note that step #4 is important, as the buffer object that is passed into ProcessInput can be destroyed/modified once the function completes. Waiting allows you to redirect error rows if insertions fail at the destination. The downside to waiting is that your destination component can’t start processing a new buffer until you’re “done” with the current one (i.e. have returned from ProcessInput). You can choose not to wait and return right after you’ve read all of the rows, but you must ensure that:

  1. You make a copy (a real copy – not just an object reference) of any data you need from the PipelineBuffer object
  2. When you receive EndOfRowset, make sure you complete all outstanding work items before returning from ProcessInput

The Data Services sample also contains a really useful class called ThreadPoolWait, which makes it easier to wait for threads to complete.

Hope that helps!

Too Many Sources in a Data Flow

This was a recent customer issue that I wanted to share.

The customer was transferring a large amount of data between two tables, without performing any transformations on it. They wanted to run multiple transfer streams in parallel, and created a way to split up their source data into six partitions. In their original design, they had six Source -> Destination combinations in a single Data Flow Task.

six source components in a data flow

This package worked fine in their development environment, but would result in a deadlock when they ran it in production. Looking at the SSIS logs, they could see that the package was hanging in the Pre-Execute phase of the Data Flow. They ran a profiler trace and found that SQL was showing Async Network IO for five sessions, but the sixth was waiting for memory resources (wait type of RESOURCE_SEMAPHORE). Looking at sys.dm_exec_query_resource_semaphores  (shown below) confirmed this.


There were two factors contributing to the deadlock:

  1. The source server did not have enough memory to run six of these queries in parallel
  2. In SSIS, the Data Flow does not start executing until all components have completed the Pre-Execute phase

When the data flow runs, each source component prepares its query with the source server in the pre-execute phase. The server was able to grant enough memory for the first five queries (session_id 57 to 61), but the sixth query (session_id = 63) was told to wait. This prevents the pre-execute phase of the sixth OLE DB Source component from completing. The data flow won’t start running until the pre-execute phase completes for each component, and the source server can’t grant memory until the other queries complete, which results in a deadlocked package.

To resolve this issue, the customer changed the package design pattern to have a single Source -> Destination combination per data flow, with six parallel Data Flow Tasks. With this design, they were still occasionally getting RESOURCE_SEMAPHORE waits, but the waits didn’t last forever, as the other Data Flow tasks were able to complete their queries independently. (They were eventually able to remove the waits all together by optimizing their source query.)

An alternate design (which would be my preference) would have been to have a single Data Flow per package, parameterize the source query, and run six instances of the package in parallel.


Beware of doing too much in a single Data Flow! Unlike Tasks in the control flow, which run independently, the Data Flow can’t start until each component has finished its initialization in the Pre-Execute phase. This is especially important for components that can spend a long time in the Pre-Execute phase, such as a Lookup Transform with Full Cache mode. A Data Flow should only have a single source — unless the data flow is combining data from multiple sources. If your data flow has multiple data paths that do not overlap, then you should consider using separate Data Flow tasks (if they need to be run “together”, put them in the same Sequence Container).