Learning about execution plans. (part 2)

by Allan Svelmøe Hansen January 18, 2008 11:15

Reflecting over the workings of indexes I learned the other day, I started to draw some conclusions. Some might be premature, but well – numbers seems to back it up.

Firstly, SQL Server 2005 automatically builds an index when you assign primary keys. If you do not do anything, it will by default be a clustered index, because primary keys usually are the prime candidate for clustered indexes, as those are the values you assumed would use to get data via. However, you can specify that an index should be non-clustered, but I’ve not found a way to make it non-unique (which is why you can’t have duplicate values in your PK), and I do not know whether that is at all possible.

Anyways, because of this, it starts to matter which order you create your columns in, in the table. That is because I’ve not yet found a way to change the order of columns in a primary key index, whereas in a normal created index, I can shift the order of the columns to my desire.

This means, that unless you analyze the data and thus know how you'll use it, you actually risk ending up with ineffective indexes, which can and will hurt your performance.
The example I showed in my part one article is a prime example of this.
If looking at the table definition, I had TestTwoID, TestTwoValue and TestOneID (the FK) in that order. In that example, I solved my inefficiency by making a custom clustered index (after I made the PK index non-clustered).
However, suppose for a moment that it had been real life data, then an analyzing of the data could have told me that I’d properly use the FK (TestOneID) as a means to look up data from TestTwo table. Then had I created the table with TestOneID as the first column, and made a combined PK of TestOneID and TestTwoID, I would have had an effective clustered index already, and wouldn’t have needed to make non-unique/new clustered as I did.

Now this might seem like a small thing, however remember that indexes takes up space. So by having just one proper clustered index, then if nothing else, you save space. Small “randomish” testing for me showed almost a doubling of space used for indexes in my test database when I had the PK non-clustered index and the custom clustered, compared to a proper clustered index. So if these findings are extrapolated to a working production database, it can be significant space.

As for the unique versus non-unique then as far as I’ve read, SQL Server internally only keeps unique indexes, so if you specify an index shouldn’t be unique, SQL Server will automatically append 4 bytes of data to the index to make sure it is unique. And seeing as 4 bytes matches an integer, then it looks to be that you yourself could enforce this uniqueness by having a unique integer value on your table. Following this logic, it could be thought out that if you know your table will not grow too large, then a smallint (2 bytes) should be able to save you some space. However, small random testing did not really show this to be true for me, so whether that is accurate or not – I can’t say with certaincy.

Also, because you also use extra space for keeping stuff in your table, then perhaps it goes to show that contrary to the school learning I received “back then”, that perhaps a primary key is not always the way to go. Primary keys should as far as I’m concerned only be used when needed, and then you can create the clustered indexes yourself.
But that is an entirely new discussion.

Bookmark and Share DotnetKicks dotnetshoutout

Comments

Add comment




  Country flag

biuquote
  • Comment
  • Preview
Loading



Powered by BlogEngine.NET 1.6.1.0
Theme by Mads Kristensen | Modified by Mooglegiant

About:
Allan Svelmøe Hansen

My real name is Allan Svelmøe Hansen.
I live in Denmark, where I work as a developer for hedal:kruse:brohus using SQL Server and the .NET framework since 2004.
My primary fields of expertise is back-end data integration, database design and optimization.


       View Allan Svelmøe Hansen's profile on LinkedIn     

Disclaimer

The opinions expressed herein are my own personal opinions and thoughts and does not represent my employers view in any way, nor are my results guaranteed for all situations.
Content is presented "as is", with no warranty.