I have recently come across a SQL Server Iintegration Services performance tuning paper here: http://www.microsoft.com/technet/prodtechnol/sql/2005/ssisperf.mspx that I highly recommend reading for SSIS practitioners. It covers important issues such as:
- Parallelism
- Buffer tuning
- Execution trees
- Isolating bottlenecks
Here are the pertinent points that I took away from reading it:
- I like the categorisation of transformations into row, partially blocking & blocking. I shall try and use that terminology more in this blog.
- A data-flow should be considered as a series of execution trees (of which there may only be one) as defined by the points in the data-flow where new buffers are created.
- Generally speaking, reducing the number of execution trees will cause data-flow performance to improve. (Update: Ashwani Nanda made a very good comment to this point in the comments section below).
- A data-flow buffer cannot exceed 100MB
- SSIS uses the estimated row size, DefaultMaxBufferRows and DefaultMaxBufferSize to derive the number of rows in a buffer at execution time.
- A MULTICAST component does NOT cause a new execution tree to be created
- Identify bottlenecks by investigating each part of your data-flow in isolation. Identify Source speed and Destination speed from which you can derive transformation speed.
- Use the Buffers Spooled performance counter to see if you are running out of physial memory resources.
That's a really useful paper. My thanks to the guys that wrote it!
-Jamie