An interesting discussion today on the SSIS Forum made me realise that perhaps not all SSIS developers are aware of the OnPipelineRowsSent event in SSIS and how valuable it can be, certainly when debugging data-flows. This isn't all that surprising because Books Online is very sparse in this area. Hence I thought it would make a useful blog topic.
OnPipelineRowsSent is a type of event in SSIS which means the information that it provides can be logged by whatever log provider you happen to be using. In order to understand the information that OnPipelineRowsSent provides it is critical to understand the buffer architecture of the SSIS data-flow.
To explain that VERY simply, a buffer is an area of memory that contains rows of data as they pass through the pipeline. Each data-path in the data-flow will contain one of more buffers and each of those buffers (except for the last one) will contain the same number of rows. If you want to understand more about the buffer architecture of the data-flow then Kirk Haselden's book has a chapter devoted wholly to it.
Here is what Books Online DOES say about OnPipelineRowsSent:
OnPipelineRowsSent | Reports the number of rows provided to a component input by a call to the ProcessInput method. The log entry includes the component name. |
In slightly plainer english that means that for every buffer of data that is consumed by a component, you get a log entry. So, if you have (e.g.) a Derived Column Component that receives 105000 rows and the size of the buffer is 10000 rows, you'll get 11 OnPipelineRowsSent log entries for that component (there will only be 5000 rows in the last buffer). Basically it enables you to know how many rows are passing through each data-path in the data-flow.
The available information is the same as you get for any eventhandler. i.e.
- event
- computer
- operator
- source
- sourceid
- executionid
- starttime
- endtime
- datacode
- databytes
- message
and here is a portion of a logfile containing those records:
OnPipelineRowsSent,CGOJATHOMSON,INT\Jamie.Thomson,Data Flow Task,{EB25C41A-EFE2-4902-9610-99C0309A3B59},{33B4926F-54A9-4E00-9B74-92720A2786E0},08/03/2007 20:49:56,08/03/2007 20:49:56,0,0x,Rows were provided to a data flow component as input. : : 1228 : Merge Join Output : 1225 : Union All 1 : 1226 : Union All Input 1 : 9936
OnPipelineRowsSent,CGOJATHOMSON,INT\Jamie.Thomson,Data Flow Task,{EB25C41A-EFE2-4902-9610-99C0309A3B59},{33B4926F-54A9-4E00-9B74-92720A2786E0},08/03/2007 20:49:56,08/03/2007 20:49:56,0,0x,Rows were provided to a data flow component as input. : : 1475 : Union All Output 1 : 1470 : Sort 3 : 1471 : Sort Input : 9936
OnPipelineRowsSent,CGOJATHOMSON,INT\Jamie.Thomson,Data Flow Task,{EB25C41A-EFE2-4902-9610-99C0309A3B59},{33B4926F-54A9-4E00-9B74-92720A2786E0},08/03/2007 20:49:56,08/03/2007 20:49:56,0,0x,Rows were provided to a data flow component as input. : : 1228 : Merge Join Output : 1225 : Union All 1 : 1226 : Union All Input 1 : 2085
OnPipelineRowsSent,CGOJATHOMSON,INT\Jamie.Thomson,Data Flow Task,{EB25C41A-EFE2-4902-9610-99C0309A3B59},{33B4926F-54A9-4E00-9B74-92720A2786E0},08/03/2007 20:49:56,08/03/2007 20:49:56,0,0x,Rows were provided to a data flow component as input. : : 1475 : Union All Output 1 : 1470 : Sort 3 : 1471 : Sort Input : 9936
OnPipelineRowsSent,CGOJATHOMSON,INT\Jamie.Thomson,Data Flow Task,{EB25C41A-EFE2-4902-9610-99C0309A3B59},{33B4926F-54A9-4E00-9B74-92720A2786E0},08/03/2007 20:49:56,08/03/2007 20:49:56,0,0x,Rows were provided to a data flow component as input. : : 1475 : Union All Output 1 : 1470 : Sort 3 : 1471 : Sort Input : 2085
OnPipelineRowsSent,CGOJATHOMSON,INT\Jamie.Thomson,Data Flow Task,{EB25C41A-EFE2-4902-9610-99C0309A3B59},{33B4926F-54A9-4E00-9B74-92720A2786E0},08/03/2007 20:49:56,08/03/2007 20:49:56,0,0x,Rows were provided to a data flow component as input. : : 2121 : Sort Output : 2113 : Merge Join 2 : 2114 : Merge Join Left Input : 9936
The important stuff is what comes in the last field, the message so let's break that down. Here is an example message:
Rows were provided to a data flow component as input. : : 1030 : OLE DB Source Output : 1025 : Sort 2 : 1026 : Sort Input : 9972
And here's what it is constituted of:
- Rows were provided to a data flow component as input. - That's consistent in each message, so to be honest they could have left it out
1030 - ID of the data-path providing the buffer
OLE DB Source Output - Name of the data-path providing the buffer
1025 - ID of the component receiving the buffer
Sort 2 - Name of the component receiving the buffer
1026 - ID of the input receiving the buffer
Sort Input - Name of the input receiving the buffer
9972 - Number of rows in the buffer
As you can see, this information will enable you to determine exactly how many rows are output from each component. Thus, if you are not receiving the expected number of rows at a destination this event will help you to deduce where you are losing them from.
You may not know this but you actually encounter OnPipelineRowsSent every time you run a data-flow within the SSIS Designer in BIDS. Notice how the designer shows the number of rows passing through the pipeline and through each component:

How does the designer know how many rows are passing through the data-flow? Simple! Its because the SSIS Designer consumes the OnPipelineRowsSent event that the executing package "throws" up and then presents that information visually as the numbers that you see flashing in front of your eyes as the data-flow executes.
I think that just about covers the simple stuff around OnPipelineRowsSent. If you have any questions then post them here as a comment.
-Jamie