Welcome to EMC Consulting Blogs Sign in | Join | Help

SSIS Junkie

SSIS: Is SSIS a professional data integration tool?

Steve Sauer (I'm guessing that is his name) is writing a diploma thesis in which he is comparing various data integration suites including BO Data Integrator, Informatica and various others. And SSIS of course (otherwise why would I be writing this post).

He has been asking alot of interesting questions on the SSIS forum about SSIS's capabilities and I was disappointed to read that he doesn't class SSIS as what he terms a "professional data integration platform". In fairness he does give reasons and he cites SSIS's lack of what he believes are enterprise-class features such as:

  • Impact analysis
  • Data lineage
  • Metadata exchange
  • No out-of-the-box support for enterprise systems

None of which I would dispute (much).

I don't have much experience to speak of in regard to the other tools that Steve is covering so I can't really talk about the areas in which, in Steve's opinion, SSIS is falling short. Nor can I argue to the contrary - much as I would like to. I do have a very strong belief however that SSIS more than fulfills the requirements of a professional data integration tool due to its huge range of features, ease of extensibility and fantastic performance.

I look forward to hopefully being able to read Steve's final thesis submission. In the meantime, I am interested to see what other people think about this subject.

  • Do you think SSIS qualifies as a Professional Data Integration Platform?
  • Do you agree with Steve's assertion that SSIS falls short in some important areas?
  • What, in your opinion, constitues a Professional Data Integration Platform?

Feel free to post comments up here, or write your opinions elsewhere (perhaps on your own blog) and let me know. I'm interested to read people's thoughts around this subject.






Published Tuesday, August 15, 2006 9:10 AM by jamie.thomson



Dan.Perrin said:

I fear Steve may have hit a nerve. I would agree that his view is a little harsh.

Microsoft SQL Server is probably never going to be at the leading edge of “Data Warehouse” technology. SQL Server 2005 RDBMS hasn’t caught Oracle, but it now at least has the necessary features on which an enterprise data warehouse can be implemented. And the same can probably be said about Analysis Services (versus dare I say Oracle Express) and Reporting Services (versus Crystal).

So in the same way, Integration Services isn’t leading the edge, but there are many enterprise data warehouses using it as their data integration tool (and that couldn't be said for DTS). I suspect if total number of implementations were a measure of a “professional data integration” tool then it wouldn't be long until Steve's thesis would have to consider it and I suspect to the cost of several others on his list.
August 17, 2006 8:40 PM

Craig said:

While I agree with some of the PR generated by Microsoft - as they have come a long way between 2000 and 2005 versions, they are still quite a long way from being enterprise class level. We are a DataStage shop and have been looking at a switch to SQL Server (pretty much all year-over-year cost driven) and SSIS requires a tremendous amount of additional work to do the same things we do in DataStage. Of course, DataStage has been around a lot longer - now at version 7.5 - 8 coming out later this year and SSIS is version 1.0.

So - yes - I agree SSIS is not enterprise class - yet. We'll see how long it takes them to get the next major release out and what improvement it makes.
August 21, 2006 2:04 PM

Philip said:

Craig, I too have worked on DataStage (Server edition 7.5.1 as well as Parralel Extender 7.5.1) on various sites for various clients, and even though I am new to SSIS (absolutely greenhorn new - just got my first scripting task going yeah!!!!), I disagree somewhat that it doesnt compare favorably with Datastage.

I have seen the DS tool implemented correctly, and incorrectly, but the one thing that was always an eye opener was the amount of work that was left to the database manager to do. At one particularly severe case, the client's developers sadly chose this tool to simply load "staging" tables, and after procedures to do the actual work. Now, its no fault of datastage as a tool that the implementation is sorely lacking, but after spending huuuuuge amounts of nights without lots of sleep and with copious amounts of caffeine, I have reservations as to where it is placed on the Gartner Magic Quadrant.

Without sounding like I know everything about every ETL tool out there (I for example have no exposure to Informatica), I can tell you that the step up from DTS to SSIS is indeed one that I am relishing, the painful sleepless nights getting ADO enumerators to work within scripting tasks is one I look forward to, and in my humble and modest opinion the boys at Microsoft have done an amazing job with this tool.

Oh, by the way, on routines in datastage, versus the scripting components in SSIS, there simply is no way they can be compared against each other. Datastage use some arbitrary form of basic with the worst compiler I have ever lived through, sort of like code and pray compiler. SSIS on the other hand easily integrates into your choice of CSharp and .NET. Consider for a moment the row based things you can do which is in nooooo way possible in Datastage. On this point alone, SSIS gets my vote.

Now, another point which had me pulling my hair out in Datastage - server edition is completely watered down CRAP (and I do not say this lightly). Before getting exposure to Parralel extender, I hated the tool, purely because a lot of functions you simply cannot do in Server (for example join stages, lookup stages, compare diff stages). Why oh why oh whyyyyyyy would that be the case. Fortunately, after getting the relevant exposure to Parralel extender, there is a lot more that you can do, and parralel processing is sure taken to other levels. But this is fundamentally the reason why so many clients have the RDBMS do all the "funky" work on server editions.

HOWEVER, in saying that, one area that Datastage undoubtedly holds the aces over SSIS is its ability to connect to any datasource, and conversely publish to any datasource. Its something that the limited connections in SQL simply cannot compete with. Parralel processing being the other (although I bet if someone buys a bottle of tequila and sponsors some man hours, there would be a couple of Microsoft boys who could give this statement a great run for money)

My favorite saying - horses for courses, and it really dont matter what the technology is that the client selects, we will work out a way to get the job done...

May 13, 2008 12:21 PM
New Comments to this post are disabled

This Blog


Powered by Community Server (Personal Edition), by Telligent Systems