I put this together for an SSIS performance presentation I did a while ago, and thought it warranted its own post. In addition to looking for appropriate Design Patterns, I have a checklist of things to look for when trying to optimize an SSIS Data Flow. This list is by no means extensive, but it … Continued
Generate Data with a Script Component
I use the following code to generate data with a script component. I’ve found this useful in the past to demonstrate how buffers are handled in the data flow, as well as perf/load testing a destination component. This script will automatically generate data based on the columns you’ve defined for the component – no additional … Continued
Automatically Select Input Columns in a Custom Data Flow Component
The following code snippet can be used in a custom data flow component to automatically select all input columns when you attach a path.
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
public override void OnInputPathAttached(int inputID) { base.OnInputPathAttached(inputID); for (int i = 0; i < ComponentMetaData.InputCollection.Count; i++) { ComponentMetaData.InputCollection[i].InputColumnCollection.RemoveAll(); IDTSVirtualInput100 input = ComponentMetaData.InputCollection[i].GetVirtualInput(); foreach (IDTSVirtualInputColumn100 vcol in input.VirtualInputColumnCollection) { input.SetUsageType(vcol.LineageID, DTSUsageType.UT_READONLY); } } } |
This is the equivalent to clicking the Select All box in the Advanced Editor.
Queries for Data Flow Component Performance in SQL 2012
The built-in SSIS Catalog reports in SSMS show a breakdown of task and component performance of a package execution when you run with a logging level of “Performance”. For a more detailed or customized view, you can query the catalog tables directly. The key table to get data flow performance information is [catalog].[execution_component_phases]. Here are … Continued
How Can I Improve the Performance of my Destination Component
Question: I’ve created a custom destination component, but I’m not happy with the performance. I tried using multiple threads, but the pipeline buffer doesn’t allow free traversal. What can I do? I’d start by profiling the component to make sure you identify the correct bottlenecks. The most common source of problems is accessing the … Continued
Too Many Sources in a Data Flow
This was a recent customer issue that I wanted to share. The customer was transferring a large amount of data between two tables, without performing any transformations on it. They wanted to run multiple transfer streams in parallel, and created a way to split up their source data into six partitions. In their original design, they had … Continued




