Data Flow Optimization Checklist

image

I put this together for an SSIS performance presentation I did a while ago, and thought it warranted its own post.

In addition to looking for appropriate Design Patterns, I have a checklist of things to look for when trying to optimize an SSIS Data Flow. This list is by no means extensive, but it contains some of the more common causes of performance issues in the packages I’ve worked with.

OLE DB command operates on a row by row basis. Stage the data and perform the operation in a batch mode using the Execute SQL Task instead.

Learn MERGE, and love it.

Sometimes you do. Most of the time you don’t. If you do need them (i.e. you are loading from a flat file, and need to do a merge), considering maxing out the DefaultBufferMaxRows setting so you can sort more data in memory (sort will keep at most 5 active buffers in memory before it starts swapping to disk).

  • Lookup Transforms
    • Can the lookup be avoided?
    • Are you using the right cache mode?
    • Should you use a Cache Connection Manager?

Lots of related lookup posts can be found here.

 

Have fun optimizing!

1 thought on “Data Flow Optimization Checklist

  1. I have a Turbo Dataflow solution for loading a large .csv file (>= 10 GB).
    FlatFile Source -> Derived Column Transform -> Script Component Transform -> multiple OLE DB Destinations (will provide detail in another post)

Comments are closed.