SQL Server Integration Services (SSIS) contains some really useful logging procedures but as with most things in SSIS, it is extensible. There are 2 methods of extending the logging capability of SSIS:
- Build a custom log provider
- Use event handlers
I am going to demonstrate the second of these methods - using SSIS event handlers to log to a custom logging table.
Step 1 - Create the log table
The script below will create a table into which we are going to log events from our SSIS packages.
CREATE TABLE SSISLog (|
EventID int identity(1,1) NOT NULL
, EventType varchar(20) NOT NULL --e.g. OnPostExecute, OnError
, PackageName varchar(50) NOT NULL
, TaskName varchar(50) NOT NULL --Task that raises the event
, EventCode int NULL --Some events (e.g. OnError) will provide a code
, EventDescription varchar(1000) --Some events (e.g. OnInformation) will provide an event description
, PackageDuration int --How long has the package been executing for
, ContainerDuration int --How long has the task that raised the event been executing for
, InsertCount int --How many rows were inserted somewhere by the task
, UpdateCount int --How many rows were updated by the task
, DeleteCount int --How many rows were deleted by the task
, Host varchar(50) --Which machine was the package running on
, constraint PK_SSISLog PRIMARY KEY CLUSTERED (EventID DESC)
This table is by no means exhaustive. There is a multitude of other things that we could log should we want but for now, this will do for demo purposes.
Some things to notice here. All of the information that we are going to log will come straight out of SSIS. Most of the information is provided by system variables however in the case of InsertCount, UpdateCount & DeleteCount we are going to have to do a bit of work in the package to get these populated as you will see later.
Step 2 - Build the logging functionality
As I said we are going to use event handlers to demonstrate the custom logging ability of SSIS. The event handlers provided with SSIS (N.B. The event handlers themselves are also extensible) are:
For the purposes of logging the most important ones are OnPostExecute, OnError & OnWarning and of these 3 the most important one of all is probably OnPostExecute because that enables us to put a message into our log table after each task in our package has completed. Hence, I will use the OnPostExecute event handler for demo purposes.
The relevance of the container hierarchy
It is important to realise the role that the container hierarchy of a SSIS package plays here. Understanding containers is central to understanding SSIS and they are especially relevant for event handlers. Event handlers can be attached to any container in the package and that event handler will catch all events raised by that container and any child containers of that container. Hence, we can attach an event handler to the package (which is in itself a container) and this one event handler will catch all events raised of that event type by every container in the package. This is fantastically powerful because it means we do not have to build event handlers for each task in the package.
Note that it is possible for a container to "opt out" of having its events captured by an event handler so if, say, you had a sequence container for which you didn't see the relevance of capturing events, you can simply switch them off using the sequence container's DisableEventHandlers property.
Similarly, if you wanted only certain events of that sequence task to be captured by an event handler, you could control this using the System::Propogate variable - but I'll leave further discussion of that for another day!
I have implemented the insertions to the log table using an INSERT statement in an Execute SQL Task. Using the venerable property expressions to generate the SQLStatementSource property of the Execute SQL Task at runtime we can dynamically log the execution duration, host, task that raised the event and a plethora of other useful information at our disposal. Here is a screenshot of our property expression that does this. This where "the magic" happens if you like :)
Some things that you should note at this point:
- We are mostly using system variables to get the information we're interested in. In other words, most of what you want is provided for you, you just have to pick it up and use it.
- We are type casting all all numeric values into strings. At frst glance that may seem an odd thing to do but remember that all we're building here is a value for the SQLStatementSource property of the Execute SQL Task. This property is of type string. We're concatenating various parts of our dynamic SQL statement and we want to be concatenating strings, not numbers.
- PackageDuration and ContainerDuration are generated using the DATEDIFF function which returns the number of seconds between System::StartTime/System::ContainerStartTime and the current datetime returned by GETDATE().
- The GETDATE() function used herein is an SSIS function, NOT the T-SQL function of the same name.
- EventCode & EventDescription are populated with a zero and an empty string respectively. This is because they are not relevant for the OnPostExecute event and are more likely to be used with OnError, OnWarning & OnInformation.
For clarity, here is a screenshot showing the property expression for the OnError event.
As you can see, here we are able to use the System::ErrorCode & System::ErrorDescription variables to pick up information about the error that has been raised.
There's one other important thing we need to do in our OnPostExecute event handler. You'll notice we are using 3 variables to track the number of rows processed by our tasks (more on this later). It is important that we reset these values to zero after every task otherwise the next task may report the same values. We do this using a script task. Here's the code to reset them (its very easy - just don't forget to add the variables to the ReadWriteVariables property of the script task).
Here's what your OnPostExecute event handler should now look like like.
Step 3 - Lets test it!
OK, we've got our logging mechanism constructed, now lets use it! I've constucted a simple table with some data in it to which we are going to apply some inserts, updates & deletes. Here's the script to create the table and populate it with some data.
|CREATE TABLE ExampleCustomerTable (|
CustomerID int identity(1,1)
, [Name] varchar(30)
, Age int
, DateOfBirth datetime
GOINSERT ExampleCustomerTable ([Name], Age, DateOfBirth) VALUES ('Jamie', 26, '19770621')
INSERT ExampleCustomerTable ([Name], Age, DateOfBirth) VALUES ('Helen', 28, '19770520')
INSERT ExampleCustomerTable ([Name], Age, DateOfBirth) VALUES ('Jon', 27, '19770502')
INSERT ExampleCustomerTable ([Name], Age, DateOfBirth) VALUES ('Gerald', 66, '19390215')
INSERT ExampleCustomerTable ([Name], Age, DateOfBirth) VALUES ('Samantha', 33, '19741105')
And here's our data-flow that does some inserts, updates and deletes based on certain criteria (not important what that criteria is for now).
As I intimated to earlier, we have to populate the InsertCount, UpdateCount & DeleteCount variables from within our data-flow. That we do using the Rowcount transformation. You can see in the above graphic that we have 3 Rowcount transformations, one for each of the the 3 variables.
To test our OnError event handler I needed a task that raised an error. The simplest way of doing that is to have a script task that returns failure instead of success.
Public Class ScriptMain
Public Sub Main()
Dts.TaskResult = Dts.Results.Failure
OK then, that's everything. Here is the 2 tasks in our package. Note the captions in the tasks - these are the names of the tasks and you're going to be seeing them very soon.
Executing the package gives us the following in our logging table:
And that's it! Things to note about the log table here:
- We have successfully got our rowcounts in there
- The name of the task (actually the container) raising each event is in there as the machine that I ran it on (CGOJTHOMSON)
- The error message from our script task has gotten logged successfully
- The "Raise an error" task raised an OnPostExecute event even though it also raised an OnError event
- The package itself raised an OnPostExecute event (EventID=4)
[The only problem at the moment is that ContainerDuration is not getting populated but I think that's a seperate issue and doesn't affect us here. I'll take it up with the dev team later.]
Let's round up what we've done:
- Created a logging table with our own customised schema
- Devised a method of logging to this table using the Execute SQL Task, property expressions and event handlers.
- Demo'd the use of the logging event handlers by building a simple package containing a data-flow.
- Demo'd the use of the Rowcount component to generate custom logging data that isn't available through the system variables.
You can download the package that I've built to demo this from here. The only thing you'll have to do to get it working is change the server name of the "LogDB" connection manager. All the scripts that you need to create the tables and data are somewhere to be found in this blog post. If there's any problems - let me know.
I hope this serves as a useful demo of how to incorporate custom logging into your systems.
N.B. This was built on IDW15, a beta version of SSIS that came out in June 2005.
UPDATE, 8th June 2008: In SSIS 2008 we have the ability to capture much richer contextual information than in SSIS 2005. Head for Using @[System::SourceParentGUID] for custom logging to find out more.