Welcome to EMC Consulting Blogs Sign in | Join | Help

Third Abnormal Form

Resolution to SQL Server problems. From the front line.

So now you can write an Insert Statement, 6 different ways - Why do you need to know this and why is MERGE the best?

In my previous posting I identified 6 different ways to Insert Data into a Table in SQL Server. Now why do I need to know about these ??

Well in all probability you don't.. However, I work an awful lot of time with Business Intelligence systems and we sometimes need to 'Seed' Dimension Data with Initial Values - You could also say the same with reference data on OLTP systems. These reference data changes may be very infrequent (or never) - But you have to provide the seeding script for the data for initial system set up .. and setting up of test as systems as well. Furthermore, this information is frequently very important to an organisation (it is there master data) so they like to keep copies of it in some sort of source control repository such as TFS, and control and manage it appropriately...  In this case it might be useful to know multiple ways to insert the data into SQL Server, especially as it might be available in many different formats.Text files, Excel etc.. and you want to format it as quickly and accurately as possible..

Method 1 Classic Insert and Method 2 Insert with row constructor

I would suggest, that if you have the data in a spreadsheet or a file, the easiest way of Inserting the data would be to use the new (in SQL Server 2008) INSERT Statement row constructor.. Why ??

Well  if you have source data in the CSV form:

4 , 'Value: 000004'
5 , 'Value: 000005'
6 , 'Value: 000006'

It is far easier to take this and top the data with a single INSERT INTO dbo.InsertTest (ID, TextString) Statement and Tail it with a statement semicolon termination character and then add open parentheses to the start of each line and close parentheses and comma to the end of each line, than adding INSERT INTO dbo.InsertTest (ID, TextString) VALUES (to each row and ');' to the end of each row that the Classic Insert statement takes. Even if you try and do it programmatically or via cut and paste.. I know I have done both and the INSERT Statement row constructor makes it so much easier...

I would suggest the 3rd method.. Method 3 Insert with Select Values Clause has no real practical use, (except as just another way of doing something) - It is a bit difficult to code without looking in up on BOL and I have never used it in 'anger' and I suspect not many others have used it either... If you can think of a use, then do get in touch..

I would strongly suggest that you should use Method 4 Merge Statement for inserting this reference data.. Why?? well it has one great advantage. If you craft your Merge Statement correctly, you will always ONLY have to keep 1 master copy of your reference data, and add, update or delete data in it this script accordingly. Because, when you make any change to the reference data, you can pass all the data in and the Merge Statement and it only works on the data that has changed. You do not have to worry about creating and managing separate delta scripts for INSERTs, UPDATES and DELETES and then ensuring a consolidated script is maintained in your Source Control System. Let me show you what I mean..

With our test table

 

CREATE TABLE dbo.InsertTest

(

ID INT NOT NULL,

TextString NVARCHAR (100) NOT NULL,

PRIMARY KEY CLUSTERED

(

ID

)

);

-- Initialisation

MERGE dbo.InsertTest AS TARGET

USING (SELECT T.ID, T.TextString

FROM

(

VALUES

(1 , 'Value: 00001'),

(2 , 'Value: 00002'),

(3 , 'Value: 00003')

) AS T (ID, TextString)) AS SOURCE

ON (TARGET.ID = SOURCE.ID)

WHEN MATCHED THEN

UPDATE SET TARGET.TextString = SOURCE.TextString

WHEN NOT MATCHED BY TARGET THEN

INSERT (ID, TextString)

VALUES (SOURCE.ID, SOURCE.TextString)

WHEN NOT MATCHED BY SOURCE

THEN DELETE;

 

SELECT * FROM dbo.InsertTest;

ID TextString
1 Value: 00001
2 Value: 00002
3 Value: 00003

-- Delta Update: Add a new  Row, delete existing row, change decription

MERGE dbo.InsertTest AS TARGET

USING (SELECT T.ID, T.TextString

FROM

(

VALUES

(1 , 'Value: 00001'),

--(2 , 'Value: 00002'), -- Deleted Row

(3 , 'Value: 0003A'), -- Updated Row

(4 , 'Value: 00004') -- Inserted Row

) AS T (ID, TextString)) AS SOURCE

ON (TARGET.ID = SOURCE.ID)

WHEN MATCHED THEN

UPDATE SET TARGET.TextString = SOURCE.TextString

WHEN NOT MATCHED BY TARGET THEN

INSERT (ID, TextString)

VALUES (SOURCE.ID, SOURCE.TextString)

WHEN NOT MATCHED BY SOURCE

THEN DELETE;

 

SELECT * FROM dbo.InsertTest;

ID TextString
1 Value: 00001
3 Value: 0003A
4 Value: 00004

 

As you can see you ONLY need to every keep 1 'current' copy of the Master Reference Data Script when you use the Merge Statement. This is so much easier to control and manage

 

See also

 

http://consultingblogs.emc.com/paulmcmillan/archive/2011/07/06/how-many-different-ways-can-you-write-a-sql-insert-statement-1-2-3-6.aspx

http://consultingblogs.emc.com/paulmcmillan/archive/2011/07/06/a-gotcha-with-the-insert-statement-row-constructor.aspx

http://consultingblogs.emc.com/paulmcmillan/archive/2011/07/06/so-you-may-have-many-different-ways-of-inserting-data-but-which-is-the-fastest.aspx

 

Published 06 July 2011 13:52 by Paul.McMillan

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Kristian Wedberg said:

Very useful. One gotcha is the corner case where you have enabled foreign keys between two reference data tables - the changes would need to be in the order of:

1: Insert/Update into master table

2: Insert/Update/Delete from detail table

3: Delete from master table

In effect, the master table changes get split into two queries. Or you could temporarily disable foreign keys. Or you could (sometimes) do cascading deletes.

Thanks!

Kristian

August 26, 2011 22:17

Leave a Comment

(required) 
(optional)
(required) 
Submit
Powered by Community Server (Personal Edition), by Telligent Systems