There is an ongoing debate among database ‘experts’ regarding the design of a Primary Key, a debate that in my opinion should have been done and dusted a long time ago.
Note: A Primary Key is a piece of data contained in a database Column that uniquely identifies the database Row. This is the same as how a National Insurance number uniquely identifies us to the authorities in the UK, or how a soldier’s Service Number uniquely identifies them within the Military. If you need to View, Update or Delete an existing database record then it is essential that you can uniquely identify it.
Two Main Schools of Thought
The first says that the Primary Key should be a valid piece of information in its own right, not just an identifier. A name, for example. In the West we use a Surname, which identifies us when amongst other people, most of whom will hopefully have a different surname. In situations where that is not true, for example family gatherings, the first name can be used as well to narrow things down. The difficulty is that it can be hard to build a genuinely unique identifier out of meaningful information alone.
The second school of thought acknowledges the problems of the above solution and solves them by allowing a non-meaningful Unique Identifier whose sole purpose is to uniquely identify one item among any number of similar items. This is basically what we have with soldiers’ Service Numbers and National Insurance numbers.
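To make the difference between the two schools concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (people_natural, people_surrogate, person_id and so on) are invented purely for this illustration, and SQLite is just a convenient stand-in for whichever Database Engine you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# School one: the key is built from meaningful data (surname plus first name).
# All table and column names here are hypothetical.
conn.execute(
    "CREATE TABLE people_natural ("
    " surname TEXT NOT NULL,"
    " first_name TEXT NOT NULL,"
    " PRIMARY KEY (surname, first_name))"
)
conn.execute("INSERT INTO people_natural VALUES ('Smith', 'John')")
try:
    # A second John Smith cannot be stored: the 'unique' key is already taken.
    conn.execute("INSERT INTO people_natural VALUES ('Smith', 'John')")
    natural_key_clash = False
except sqlite3.IntegrityError:
    natural_key_clash = True

# School two: a meaningless number identifies the row, so names may repeat.
conn.execute(
    "CREATE TABLE people_surrogate ("
    " person_id INTEGER PRIMARY KEY,"
    " surname TEXT NOT NULL,"
    " first_name TEXT NOT NULL)"
)
for _ in range(2):
    conn.execute(
        "INSERT INTO people_surrogate (surname, first_name) VALUES ('Smith', 'John')"
    )
surrogate_ids = [
    row[0]
    for row in conn.execute(
        "SELECT person_id FROM people_surrogate ORDER BY person_id"
    )
]
```

Under these assumptions the first table rejects the second John Smith, while the second table stores both rows and tells them apart by number alone.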
My preference is with the second school of thought, and in fact you can easily adopt this strategy with most Database Engines by using the Auto Increment option on the Column. This lets the Database Engine itself take care of generating a Unique, Non-Reusable Identifier.
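As a rough sketch of what Auto Increment gives you, the following uses SQLite through Python's sqlite3 module, where the option is spelled AUTOINCREMENT. Other engines spell it differently, and the items table, its columns and the add_item helper are hypothetical names chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the engine, not the application, generates item_id.
conn.execute(
    "CREATE TABLE items ("
    " item_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " label TEXT NOT NULL)"
)


def add_item(label):
    """Insert a row and return the identifier the engine generated for it."""
    cursor = conn.execute("INSERT INTO items (label) VALUES (?)", (label,))
    return cursor.lastrowid


first = add_item("first")
second = add_item("second")
third = add_item("third")

# Delete the newest row, then insert again.
# With AUTOINCREMENT, SQLite does not hand the deleted identifier out again.
conn.execute("DELETE FROM items WHERE item_id = ?", (third,))
fourth = add_item("fourth")
```

In this sketch the fourth row receives a new identifier rather than recycling the one that belonged to the deleted row, which is the Non-Reusable behaviour described above.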
I always use the first Column of my Database Table as my Primary Key and name it:
The XXX depends on the Database Table in question. For example I might have a Table called:
for my Contacts. Each column will be prefixed with con. Therefore my Primary Key for this particular table will be:
Consistency and Structure
All my Database designs use the same structure in order to build consistency, something which is not fully appreciated until you have to work with legacy databases which haven’t been built with consistency, structure or maintainability in mind.
Another example of consistency and structure: the second column of every Database Table I design is always XXX_updguid.
This column contains another identifier, but this one changes with every edit or update of the database record. I use it to find out whether the data I am viewing on my screen has since been updated elsewhere by someone else. A comparison between the value of the updguid I have in memory and the value of the one stored in the database is all that is needed to determine the validity of the information I am viewing. If the information is stale I have several options I can pursue. This is all part of my Record Locking strategy, which will be covered in another Databasics article soon 🙂
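The updguid comparison might look something like the sketch below, again using Python's sqlite3 module. Only the con prefix and the updguid suffix come from this article; the contacts table name, the con_key and con_name columns, the helper functions and the use of a random UUID as the changing value are all assumptions made for the example. It shows the stale check only, not the wider Record Locking strategy.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# Hypothetical layout: key first, updguid second, then the actual data.
conn.execute(
    "CREATE TABLE contacts ("
    " con_key INTEGER PRIMARY KEY AUTOINCREMENT,"
    " con_updguid TEXT NOT NULL,"
    " con_name TEXT NOT NULL)"
)


def insert_contact(name):
    """Create a record with an initial updguid and return its key."""
    cursor = conn.execute(
        "INSERT INTO contacts (con_updguid, con_name) VALUES (?, ?)",
        (str(uuid.uuid4()), name),
    )
    return cursor.lastrowid


def read_contact(key):
    """Return (updguid, name) as a viewer would hold them in memory."""
    return conn.execute(
        "SELECT con_updguid, con_name FROM contacts WHERE con_key = ?", (key,)
    ).fetchone()


def update_contact(key, new_name):
    """Every edit writes a fresh updguid alongside the changed data."""
    conn.execute(
        "UPDATE contacts SET con_name = ?, con_updguid = ? WHERE con_key = ?",
        (new_name, str(uuid.uuid4()), key),
    )


def is_stale(key, updguid_in_memory):
    """True if the stored updguid no longer matches the one held in memory."""
    row = conn.execute(
        "SELECT con_updguid FROM contacts WHERE con_key = ?", (key,)
    ).fetchone()
    return row is None or row[0] != updguid_in_memory


key = insert_contact("Ada")
my_updguid, my_name = read_contact(key)     # what I see on my screen
stale_before = is_stale(key, my_updguid)    # nobody has edited the record yet
update_contact(key, "Ada L.")               # someone else saves a change
stale_after = is_stale(key, my_updguid)     # my copy is now out of date
```

What to do once the check reports stale data (reload, warn the user, refuse the save) is a separate decision and is deliberately left out of this sketch.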
Any questions, comments or thoughts to firstname.lastname@example.org
© Steven Cholerton 2013
Version 1.0.0 – 2nd December 2013