Database people don't generally like date values too much. They're too fiddly. There's god-knows-how-many ways of representing them...and don't even get me started on regional date settings!!
Naturally SSIS contains a datatype that can hold dates. In fact it has a few, and again its important to understand the nuances of using them. Variables can be of type datetime and columns in the pipeline can be of type DT_DATE, DT_DBTIMESTAMP or DT_DBDATE. That's 4 different methods of holding dates. Which one should you use? BOL is not much help here unfortunately so I decided to investigate it for myself.
The first test I ran was to insert identical datetime values into three columns in the pipeline using a script source component; one of type DT_TIME, one of DT_DBTIMESTAMP and one of DT_DBDATE. Here is a screenshot showing the configuration of the output and following that the code to populate the pipeline with a row.


I viewed the results using SSIS's handy Data Viewer feature. Here's what you see:

Straightaway its obvious that DT_DBDATE merely strips off the time precision of the datetime value. That could turn out to very handy.
DT_DATE and DT_DBTIMESTAMP are a bit of a mystery however. They look exactly the same to me. Here's what BOL had to say on the matter:
|
DT_DATE |
A date structure that consists of year, month, day, and hour. |
|
DT_DBDATE |
A date structure that consists of year, month, and day. |
|
DT_DBTIMESTAMP |
A timestamp structure that consists of year, month, hour, minute, second, and fraction. |
Well its correct about DT_DBDATE. I'll concede that the DT_DBTIMESTAMP value does indeed have a fraction in there but the data viewer just isn't showing it - so BOL is correct there too. However, does it look to you as if DT_DATE only consists of year, month, day and hour? Not to me it doesn't.
I enquired of Microsoft what was going on and luckily the query went to Marianne Willumsen who provided the following, very detailed, explanation.
|
The DT_DATE data type is implemented using an 8-byte floating-point number. Days are represented by whole number increments, starting with 30 December 1899, and midnight as time zero. Hour values are expressed as the absolute value of the fractional part of the number. This means that some times DT_DATE can represent minutes, seconds, and even milliseconds as well. However, floating point value cannot represent all real values and there are therefore limits on the range of dates that can be presented in DT_DATE.
On the other hand, DT_DBTIMESTAMP is represented by a structure with individual fields for each of year, month, day, hours, minutes, seconds, and milliseconds an has no limits on ranges of the dates it can present . |
So there you have it. It seems that the biggest single difference between DT_DATE and DT_DBTIMESTAMP is that their internal representation is completely different.
But which one should you use in your packages? At the present time I don't know the answer to this question but I'd hazard a guess that the number representation of a DT_DATE means that it can be processed much quicker than a DT_DBTIMESTAMP however if you require minute and/or second precision in your dates then you have no choice but to go for DT_DBTIMESTAMP.
Marianne also told me that the section of BOL that I have copied above had been changed and will be available in the December 2005 release of BOL.
If you have experience of using the datatypes then please let me know - I'm all ears. I'd be keen to know if DT_DATE really can perform significantly quicker than DT_DBTIMESTAMP in large datasets.
-Jamie