If you've been reading this blog for a while you'll have realised that I think SQL Server Integration Services (SSIS) is a major major improvevement on the product that it is replacing (DTS2000) however I do think that Microsoft have missed a trick somewhere along the line. Allow me to explain.
Imagine a situation where an organisation has a central data repository. Many of my SSIS packages may need to access this data so I am going to be building the same data extraction routine multiple times.
The new SSIS design paradigm means that each package contains 1 control-flow that consists of multiple tasks (data-flows are examples of tasks); so workflow has been seperated from data manipulation whereas in DTS2000 these were both built on the same design surface. The process of building an SSIS package is to instantiate a new control-flow and then build the tasks within that. In other words a task is associated with 1 and only 1 control-flow. It is possible to use that task in other control-flows using copy and paste but of course if you then need to make changes you need to make the changes in 2 places.
This is exhibited in the example from above. I can build a task (probably a data-flow) that extracts data from a table in the data repository and share that around my packages using copy and paste but if the structure of the data repository changes then I have many tasks that I need to change.
It would be better if I could build that task once and instantiate it in each of my control-flows. That way I only have to make changes in one place and the changes would be reflected everywhere that this task is used. This is analogous to a concept that we as software engineers are all familar with ....REUSABILITY!!!! In short I would like to be able to build libraries of reusable tasks that I can then drag and drop into my control-flows.
We can take this idea a little further. Instead of just building reusable tasks let's also build reusable transformations. For example if I have a derived column transformation that extracts the requested URI from a web log record, that is a nice bit of code that I am potentially going to need many times. If I can build that transformation just once and then instantiate it in mutliple data-flows then I save myself a heck of alot of work.
I hope this concept makes sense to people. I know that the SSIS dev team have talked about doing this in the next version. Here's hoping they do it bcause this is numero uno on my wish list for SSIS200?.
-Jamie