SSIS Junkie

Once upon a time this blog was a hive of activity. Now, however, it's pretty lifeless, as you can probably tell, so if you are pining for more of the same you can find me over at http://sqlblog.com/blogs/jamie_thomson. I look forward to seeing you there!

SSIS: MULTICAST bug

This is a heads-up for anyone using the MULTICAST component within the data-flow. There is a bug in it that will rear its head under some very specific circumstances. If you have ever used MULTICAST or are planning to use it (which I'm sure is a lot of you, because it's a commonly used component) then you need to read this.

The bug was discovered by my colleague George Lowe, and a repro was built by Darren Green, who is currently working on the same project as George.

 

For those of you that don't know, the MULTICAST component takes a single input and produces multiple outputs that are identical to the input. In short, it replicates the data. It is typically used when you want a single set of data to go to multiple downstream data paths. Here's what it looks like:

In this simple example the incoming data from the script component is sent to three different destinations.

 

In the example depicted above, all three outputs from the MULTICAST go straight to destinations; however, that will not always be the required behaviour. It is common for the data to be edited in some or all of the output data-paths before it reaches its destination.

This is where the bug lies. It is possible under certain circumstances that an edit in one of the output data-paths will be reflected in other data-paths thereby providing false data to downstream components. This is because, contrary to how it appears in the SSIS Designer UI, the MULTICAST does not create new data buffers when data passes through it. New buffers are created only when required - i.e. when the data in those buffers is actually changed.
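To make the mechanism concrete, here is a toy model in Python. It is not SSIS code and makes no claim about how the engine is implemented internally: a "buffer" is just a list of row dictionaries, and the names `multicast_shared` and `derived_column` are made up for illustration. It only shows what goes wrong when two paths hold a reference to the same rows and one of them edits in place.

```python
# Toy model only: a "buffer" is a list of row dicts; all names are hypothetical.

def multicast_shared(buffer, n_outputs):
    """Hand the very same buffer object to every output (no copy is made)."""
    return [buffer for _ in range(n_outputs)]

def derived_column(buffer):
    """Synchronous-style transform: overwrites a column value in place."""
    for row in buffer:
        row["amount"] = row["amount"] * 100
    return buffer

source = [{"id": 1, "amount": 5}, {"id": 2, "amount": 7}]
edit_path, audit_path = multicast_shared(source, 2)

derived_column(edit_path)

# The audit path never ran a transform, yet it sees the edited values,
# because both paths refer to the same underlying rows.
print(audit_path)  # [{'id': 1, 'amount': 500}, {'id': 2, 'amount': 700}]
```

The audit path ends up with data it should never have seen, and nothing in the flow layout hints that the two paths are connected.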

As I said earlier Darren has built a repro. Here's the screenshot. Darren has conveniently provided some useful annotations that go some way to explaining the behaviour:

 

The component called "Derived Column" edits the value in an incoming column. This edit will get reflected in the data viewer between "Derived Column 1" and "Union All".

Of course, there's nothing like actually witnessing the behaviour for yourself so here is Darren's repro package that you can run for yourself. It doesn't require any external connections so is very easy to run.

This is obviously a really dangerous bug as data can be edited and we may never know about it. We have been in touch with Grant Dickinson from the SSIS dev team in Redmond so they are very much aware of it - I'm hoping for a fix in SP2.

In the meantime you'll be glad to know that there IS a workaround. Placing a UNION ALL component between the "Multicast" and "Audit" in the example above will cause that data-path to take a copy of the data early enough that it isn't affected by downstream components in the other data-path. This workaround works because UNION ALL is asynchronous. Any asynchronous component will solve the problem, but UNION ALL is the best choice as it has the least impact on performance (in fact the impact will be so small as to be virtually negligible - even for large datasets).
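The idea behind the workaround can be sketched with a toy model in Python. Again, this is not SSIS code: buffers are plain lists of row dictionaries, and `multicast_shared`, `union_all` and `derived_column` are hypothetical stand-ins. The only point is that a step which writes its rows into a fresh buffer cuts the link between the two paths. The sketch runs sequentially, so it says nothing about timing between paths in a real data-flow.

```python
import copy

# Toy model only: a "buffer" is a list of row dicts; all names are hypothetical.

def multicast_shared(buffer, n_outputs):
    """Hand the very same buffer object to every output (no copy is made)."""
    return [buffer for _ in range(n_outputs)]

def union_all(*inputs):
    """Asynchronous-style step: copies incoming rows into a brand-new buffer."""
    return [copy.deepcopy(row) for buf in inputs for row in buf]

def derived_column(buffer):
    """Synchronous-style transform: overwrites a column value in place."""
    for row in buffer:
        row["amount"] = row["amount"] * 100
    return buffer

source = [{"id": 1, "amount": 5}, {"id": 2, "amount": 7}]
edit_path, audit_path = multicast_shared(source, 2)

# Take the copy on the audit path before the other path edits anything.
audit_path = union_all(audit_path)
derived_column(edit_path)

print(audit_path)  # [{'id': 1, 'amount': 5}, {'id': 2, 'amount': 7}]
print(edit_path)   # [{'id': 1, 'amount': 500}, {'id': 2, 'amount': 700}]
```

Once the audit path owns its own rows, edits made on the other path can no longer leak into it.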

Let me know if you have any questions about this.

 

-Jamie

 

 

Published 10 July 2006 16:31 by jamie.thomson
Attachment(s): 20060710MulticastColumnWrite.zip

Comments

 

zpeceno said:

Hi,

Do you already know about a quite similar bug with the Union all component?

It actually misses some rows on some inputs. It seems to be related to a race condition (dependent on your hardware and the current load on the machine). If the same package is run twice on the same data there is no guarantee that all incoming rows (|input1|+|input2|+...+|inputn|) will be output.

Again the workaround was to "prefix" each input with a dummy Union All.
Apparently this was fixed in SP1, but I haven't yet tested it.

Let's hope it is, and that these two bugs don't appear in anybody's project simultaneously!

Thanks
July 12, 2006 14:36
 

Lloyd said:

Hi,

I've encountered the SSIS Multicast bug too. I was wondering if anyone knows whether the above bug has been fixed in SP2 or not?

Thanks.

Cheers, Lloyd

July 24, 2007 07:30
 

Jason Uithol said:

The workaround mentioned in the article is still a race condition, albeit with better odds of winning.

I would suggest that placing the Union All between the Multicast and the component that does the editing would be a better workaround. This eliminates the race condition, because Union All's output is on a totally separate buffer, and it's that separate buffer that's then being edited.

December 17, 2007 07:21
 

Phil Brammer said:

This does not repro for me in SP2 using Darren's package.

December 28, 2007 17:24
 

jamie.thomson said:

Phil,

Yeah, they fixed it in SP2.

-Jamie

December 28, 2007 17:28
 

ESTEBAN ALVINO said:

Yeah!!

Thanks Jamie.

I've started to use the Multicast component, so I searched Google and was redirected here. Your last comment makes my work easier, if what you say about SP2 fixing the bug is true.

Thanks a lot

May 14, 2008 22:54
 
