"Every table should have a primary key" is a familiar phrase that I must have used often enough. Of course I am wrong. Every table should have at least one candidate key. A primary key on the other hand is something that isn't formally necessary at all, although it is usually a very good idea. The designation of a primary key is supposed to be essentially a label of convenience - an aide-mémoire applied to one of the candidate keys in a table where such a key has some special significance for the user.
Confusingly, we also have something else called a PRIMARY KEY (note the capitals), which is a type of constraint in SQL. Is a PRIMARY KEY constraint the same thing as a primary key? Not really. The only significance that SQL attaches to a PRIMARY KEY constraint is that it becomes the default for a foreign key reference when no other columns are specified. This has never struck me as an especially useful feature.
If the PRIMARY KEY constraint has any other purpose at all it is to act as a hint that the key may be singled out for special treatment in some way. Microsoft SQL Server for example will use a clustered index for a key that is designated as a PRIMARY KEY - but only if you don't specify otherwise and only if no clustered index already exists on the same table. Exactly why it does this isn't very clear. Keys don't necessarily make a good choice for clustering.
In fact the very existence of the PRIMARY KEY in SQL seems to be an historical accident of some kind. According to author Chris Date the earliest incarnations of SQL didn't have any key constraints and PRIMARY KEY was only later addded to the SQL standards. The designers of the standard obviously took the term from E.F.Codd who invented it, even though Codd's original notion had been abandoned by that time! (Codd originally proposed that foreign keys must only reference one key - the primary key - but that idea was forgotten and ignored because it was widely recognised as a pointless limitation).
Unfortunately the PRIMARY KEY (and perhaps the primary key concept as well to some extent) gets much more attention than it deserves. Many books and articles about SQL are wrong to the extent that they refer again and again to a PRIMARY KEY when they really mean just "a candidate key". Students and even professionals still obsess over the question of "What should be the primary key?" when they really ought to ask "What keys do I need?". Worst of all, keys other than the one chosen to be PRIMARY are sometimes forgotten about and not declared at all.
So what I've often contemplated is whether it would be better in the long run to boycott the PRIMARY KEY constraint altogether and use the logical equivalent - UNIQUE NOT NULL - for all my keys. I have never yet had the courage to do this, but only because I suspect colleagues and clients wouldn't be entirely happy about it. So here's a modest proposal for inclusion in the next ISO SQL standard. Do away with PRIMARY KEY and let's just have keys instead. All keys are equal and we've wasted enough time on the PRIMARY ones already!
Comment Notification
If you would like to receive an email when updates are made to this post, please register here
Subscribe to this post's comments using