Welcome to EMC Consulting Blogs Sign in | Join | Help

SSIS Junkie

SSIS: Package Naming Conventions

In this blog's history I have suggested naming conventions for tasks and components and just lately I have been experimenting with a naming convention for packages as well. I've been pretty pleased with it and today a colleague commented on its usefulness so I thought I would publish it here and see what you, the SSIS community, thought of it.

The thinking behind from it stems from the current development culture of defining namespaces for objects - a practice that is particularly prevelent in .Net development it seems. I realised that there wasn't really a way of defining a namespace for a SSIS package so I figured, why not manufacture a fake one instead.

The premise is simple. Each package name begins with the name of the project in which it resides at design-time. It ends with a description of that package's purpose. These two parts are then joined using the familiar dot notation. All parts of the name use CamelCase. The description of a package name follows a <Verb><Noun> nomenclature.

I can illustrate this better with a picture rather than words:

As you can the name of the projects inherit part of their names from the name of the solution and each package inherits part of its name from its parent project. The last part of the package name contains a verb followed by a noun.

Hopefully you can see the concept that I am driving at here. I like this approach for a few reasons:

  • The package names gives a clear indication of what the package does.
  • The casing adheres to common development practices
  • It helps to organise your packages at design-time into projects that serve a similar purpose
  • It gives a visual clue as to which packages do similar work instead of what a package does in isolation
  • It provides some context in which the package operates
  • It ensures that no two packages have the same name. This can be important when you deploy the packages to your test/QA/live environments - particularly if they are deployed to the same place..

Of course, there is an obvious disadvantage as well. The names are fairly long and unweildy. Anyone used to .Net development shouldn't have a problem with that however because we're not doing anything particularly different to how compiled .Net assemblies are named for placing in the GAC.

Also, another disadvantage is that the 'Name' property of a package (which is different to the name of the .dtsx file) cannot contain periods - they get replaced with a space. Hence, the name of the .dtsx file does not fully marry up with the name of the package container. I consider this to be a minor and inconsequential issue however.

 

So that is it. There's not much to it really, is there?

  • Does anyone think this is a useful technique?
  • Do you think you might employ this naming convention on your own projects?
  • Do you have any alternate suggestions for package naming conventions?

Let me know what you think. I'm really interested in feedback. 

-Jamie

UPDATE 2006-11-17

I received an email in response to this post from Howard Wild. Howard shared with me his own package naming conventions and he has kindly allowed me to reproduce it below:

Here's what a typical package looks like - SLS_A_D_Dim_SalesRep, OK I know it's long winded but I also know it's a sales related package, it runs automatically on a daily basis and it generates the sales representatives dimension. On top of that there are slight deviations for import and export jobs eg TRN_A_W_ST2_Rch_JourneyHist, This ones a Transport related package, is automated weekly and brings in a base table form our Reach system, the table is the Journey History table.

We also use a very similar convention for Jobs.

We decided to choose this option because it best fits the way we work, being a Datawarehouse we are very business orientated and chose to group by business area above everything else. We acknowledge that in some circumstances business areas overlap and that some packages will have deal with multiple business areas and in those circumstances we tend to go with the original service requestor ie, the area of the business the person who requested the job works in.

 

Published 08 November 2006 21:55 by jamie.thomson

Comments

 

SSIS Junkie : SSIS: Suggested Best Practices and naming conventions said:

November 9, 2006 06:05
 

Ravimotha said:

Naming conventions are tricky.

There are some simple yet effective rules which  are applicble in lots of different places, and across lots of different langauages.

Personally I  am a great believer in the Roy Walker school, of say what you see.

sp_GetThingforFunction

sp_updateThingForFunction

tbl_Thing or tblThing

the title of a stored procedure / block of functionality, should always tell you what it is doing

This way you should need to fill up code with Kludge comments, it also means that you can put meaningful comments in the code with out file bloat.

regards

ravi

November 20, 2006 16:31
 

Thomas Kejser said:

Being old school - I like to think of BI projects as both a development and an infrastructure task. Most of my customer have organized themselves so that developers and infrastructure are two different departments. Sometimes the infrastructure guys are even outsourced.

This also mean I favor of splitting the development LAYOUT of a BI projekt from the infrastructure LAYOUT and map between the two.

I believe solutions and projects are a developer specific grouping - not something that is relevant to the infrastructure guys. When communicating with the business users - both the infrastructure and development department communication in terms of dimension, facts and source system - not in terms of projects and solutions. Thus, the most important thing about the naming convention for SSIS packages is to enable error tracking - not to reflect the development layout. On a side note - if you reuse a package in another solution - will you then rename it?

Hence, I think the naming based on the solution layout is somewhat arbitrary. One thing we strive to achieve is to have a naming convention that, in job logs, point of WHAT went wrong.

We use the following naming convention:

For jobs that extract data from source system:

Stage_{SourceSystemName}_[Dim| Fact][_Datasubset]

Dim og fact postfix depending on whether its dimension or fact data we are extracting in the package. If a source system is down, the name of the package that failedd will quickly tell us which one it was. If a package extracts both dimension and fact data we omit the postfix. If the staging package only extracts part of the data from the source, we use the datasubset to name which part

For jobs that transform and load data (we do both in the same package):

{Dim | Fact}_{DimensionName | FactName}_[SourceSystem]

Source system is added to the name if more than one package must be run per dimension (if you have two ERP systems for example)

Load_{Function}

Example. We have an ERP (say: SAP) system that loads dimension and facts to our wearehouse. Futhermore, we have a CRM (Lets call it MyCRM) system containing additional customer dataWe would have:

Stage_SAP_Dim.dtsx -- Extracts data from SAP used dimension

Stage_SAP_Fact.dtsx -- Extracts data from SAP used for facts

Stage_MyCRM_Dim_Customer.dtxs -- extracts only customer data from the CRM system. Note: Customer is an actual dimension seens my end users - we name the package to reflect what business object it loads

Dim_Customer.dtsx --Load customer data by merging both MyCRM and SAP

Fact_Sales.dtsx -- load all staged sales data

Fact_Purchses_SAP.dtsx -- Loads purchase data, but only the data from SAP

Load_SAP -- Updates data SAP into the warehouse (calling SAP loaders, and Dim `fact loaders)

Load_MyCRM -- updates the warehouse with MyCRM data (calling MyCRM loader and Dim_Customer)

Load_Warehouse -- Updates the warehouse with all data (calling Load_SAP and Load_MyCRM)

With such names, both develoopers, business users and the infrastructure department can quickly locate a load problem.

January 11, 2007 11:21
 

stan said:

i'd be wary of building "run method" and or "frequency" into a package name given the modular/reusable nature of this product.  Has anyone given any thought to adding a version number to the name of a pkg in those cases where functionality would be described identically but flavor might be different?

February 6, 2007 22:50
 

jamie.thomson said:

Stan,

Its not a bad idea. The only thing I'd be wary of is that a package inherently has a version (major/minor/build) number anyway and you wouldn't want to confuse the two.

-Jamie

February 6, 2007 23:09
 

stan said:

gotta ask the obvious jamie...can managed code calling pkgs or a master pkg calling subpackages or subpackages plugged into a master call/use two (or more) different versions of the same (sub)package?  If so, how would the plumbing work?  Beside annotation, can the design pane make a visual distinction between 2 or more subpackage versions obvious?  Can I choose major, minor and build version names myself or are they assigned?    

February 7, 2007 14:25
 

jamie.thomson said:

Stan,

Good questions. Short answer is no, you can't call different versions of the same package - which is where your suggestion becomes very useful. Packages are called by name, nothing else.

Yes, you can choose the major/minor/build numbers yourself. Be aware that the build number gets incremented every time you save the package. Major/minor numbers never change unless you change them yourself.

I have previously asked for the ability to make the text in an annotation be set by an expression - this would enable you to display the current major/minor/build number on the design surface. I doubt we'll ever get this though.

-Jamie

February 7, 2007 18:29
 

SSIS Junkie said:

Recently on this blog I mentioned in passing that I have been working for some considerable time now

August 6, 2007 20:57
 

Ryan said:

Hi,

I was curious about naming the packages with Periods in the name. Unfortunately, I cannot find the article I came across last week, but it said to not use periods in the package name as SQL make get "confused" and think it is a database object being referenced by object name (Server.Database.Schema.Object). Has that ever been an issue for you? I'm fairly new to SSIS and did not think this would be an issue, but I thought I'd ask. :)

-Ryan

October 10, 2007 19:46
 

jamie.thomson said:

Ryan,

Interesting thought. I've never come across this as a problem but then, I never store my packages in SQL Server so it could never be a problem for me.

If you have any experiences on this that you'd be willing to share then I'm all ears.

-Jamie

October 10, 2007 20:00
 

Ryan said:

Figures I'd find the article after I posted. :)

It appears I misread the blog and thought it was referring to an issue parsing the package name. But, when I reread the post it seems that they mean it is confusing for someone reading the code.

Here it is the link: http://blogs.msdn.com/sqllive/default.aspx

-Ryan

October 10, 2007 20:26
 

jamie.thomson said:

I fail to see how a package could be confused with a database object but all advice and experience is gratefully received.

-Jamie

October 10, 2007 20:39
 

David R Buckingham said:

I implemented a naming convention shortly after I made the first 5 or 6 IS packages.  Since they strictly load data (normalizing multiple clients data into a standardized database structure), they follow the format of <DatabaseName>.FileImport.<ClientName>.<SourceFileType>.dtsx

SourceFileType is not XML, Excel, etc., but rather the business owner's name for the file type (i.e., MedicalClaims = medical claims, MemberDemographics = member demographics, ProviderList = provider list).

There are many benefits to implementing a standard naming convention, both at the package level and the component level.

November 16, 2007 20:24
 

Rupert Todd said:

Thanks a lot.  

I will be implementing a pilot SSIS project at my client (a large corporation). Because no one here appears to know SSIS, I have become the guru, even though I have never worked with SSIS before!  

Your naming conventions seem perfect for my project and I am going to recommend that we adopt them.

Thanks again!

January 14, 2008 20:12
 

Geir Morten Allum's MS application platform hvor, hva, når, hvorfor, osv... said:

Det viktig at man tidlig i prosjektene tar hansyn til navnestandarder og får dette på plass i prosjekthåndbøker

April 30, 2008 15:44
 

Rusty said:

Jamie,

How do you set the VersionMajor and VersionMinor system variables?

Thanks!

July 4, 2008 02:13
 

jamie.thomson said:

Hi Rusty,

The values in those variables are dependant on the VersionMajor & VersionMinor properties of the package.

-Jamie

July 4, 2008 09:30
 

Strate SQL said:

Jamie Thompson posted some best practices and naming conventions for SQL Server Integration Services a bit back. Interesting ideas and for some reason I hadn't ever really thought about naming conventions within SSIS until I read these articles. ...

November 14, 2008 03:57
 

Ronie said:

Hi,

We had an issue with the naming conventions lately in my office...

I previously practiced and suggested the idea of using it in the following way...

STG_[SourceSystemName]_[Dim| Fact TableName].dtsx

Ex: STG_FDR_Currency.dtsx

but my collegue insists in using the following way:

STG [SourceSystemName] [Dim| Fact TableName].dtsx

Ex: STG FDR Currency.dtsx

My collegue wants to have spaces in between which I strongly opposed, but i don't know how to explain it...as he being my superior i am not able to convince him..can you guys please help me by telling which would be a better practice of the two and why....?  

November 21, 2008 05:17
 

jamie.thomson said:

There is no better practice. Either will work fine.

November 21, 2008 11:31
New Comments to this post are disabled

This Blog

Syndication

Powered by Community Server (Personal Edition), by Telligent Systems