Data Flow

Data Flow Optimization Checklist

Data Flow Optimization Checklist

I put this together for an SSIS performance presentation I did a while ago, and thought it warranted its own post. In addition to looking for appropriate Design Patterns, I have a checklist of things to look for when trying to optimize an SSIS Data Flow. This list is by no means extensive, but it … Continued

Generate Data with a Script Component

Generate Data with a Script Component

I use the following code to generate data with a script component. I’ve found this useful in the past to demonstrate how buffers are handled in the data flow, as well as perf/load testing a destination component. This script will automatically generate data based on the columns you’ve defined for the component – no additional … Continued

Automatically Select Input Columns in a Custom Data Flow Component

Automatically Select Input Columns in a Custom Data Flow Component

The following code snippet can be used in a custom data flow component to automatically select all input columns when you attach a path.

This is the equivalent to clicking the Select All box in the Advanced Editor.  

Queries for Data Flow Component Performance in SQL 2012

Queries for Data Flow Component Performance in SQL 2012

The built-in SSIS Catalog reports in SSMS show a breakdown of task and component performance of a package execution when you run with a logging level of “Performance”. For a more detailed or customized view, you can query the catalog tables directly. The key table to get data flow performance information is [catalog].[execution_component_phases]. Here are … Continued

How Can I Improve the Performance of my Destination Component

Question: I’ve created a custom destination component, but I’m not happy with the performance. I tried using multiple threads, but the pipeline buffer doesn’t allow free traversal. What can I do?   I’d start by profiling the component to make sure you identify the correct bottlenecks. The most common source of problems is accessing the … Continued

Too Many Sources in a Data Flow

Too Many Sources in a Data Flow

This was a recent customer issue that I wanted to share. The customer was transferring a large amount of data between two tables, without performing any transformations on it. They wanted to run multiple transfer streams in parallel, and created a way to split up their source data into six partitions. In their original design, they had … Continued