I started to look more closely at the execution plans and how they worked and what I could use them for.
And while I’m still very much a novice within database optimizing and all that, I am learning, and one thing I’ve read so far is that scans are bad, seeks are good.
Scans means that SQL Server is – well scanning – all the datapages which either a table or an index consists off.  An index seek means that SQL Server could follow the index tree right down to the leaf level and pull the data from there…… at least that is my current understanding (remember, I’m  a novice)
Anyways, I made two tables to test a little.

(click to enlarge)
Database diagram for TestOne and TestTwo used in this example

TestOne has 10.001 rows of dummy data and TestTwo has 30.001 rows where generally each PK from TestOne is about 3 times in TestTwo.
The automatically created indexes are used first. This means a clustered index on the PK for TestOne and a clustered index on the PK for TestTwo.
Then I performed the very simple, yet realistic, query:

SELECT * FROM TestOne T1

INNER JOIN TestTwo T2 ON T1. TestOneID = T2.TestOneID

WHERE T1.TestOneID = 4567

(4567 is just a random number I used, could have used anything from 1 to 10.001
This gave the following estimated execution plan:

(click to enlarge)
Estimated Execution Plan 1

From that it is visible that the TestTwo gets an Index Scan which is because I’m looking for a value which isn’t directly visible from the clustered index on TestTwo, so SQL Server has to scan through the entire number of datapages to find the value(s) I’m looking for. On the other hand we can see that TestOne performs an Index Seek, which is because the key we are looking for is inside the Clustered Index, so SQL Server can look up the value down to the leaf node of the index.

Of interesting values can be said that the entire query has estimated cost of 0.56509, with the Clustered Index Seek (TestOne table) has an estimated CPU cost of 0.0001581 and estimated I/O cost of 0.003125.
TestTwo table, the Index Scan, has an estimated CPU cost of 0.0331581 and 0.514236 I/O cost.
(click to enlarge)
Estimated Cost for Index Scan on TestTwo
These numbers show clearly that it is the I/O cost of the Index Scan on TestTwo table which is the culprit and the main focus for optimizing this query.

Then I messed about a bit with the indexes on TestTwo, because that was the table which needed focus.
I changed the PK index to a non-clustered index, and created a new clustered Index for the columns of TestOneID and TestTwoID.
There isn’t much idea in putting TestTwoID into the clustered index, but I did it so I could enforce the unique constraint on the index myself without having SQL Server itself putting extra bytes upon each row. Space size it doesn’t matter much in my example, and it isn’t (wasn’t) my main focus anyway…

(click to enlarge)
Changed Index for TestTwo table

I then ran the exact same query once more and got this encouraging result

(click to enlarge)
Estimated Execution Plan 2
As can be seen the entire query now has estimated cost of 0.0065809, and my Index Scan has now become an Index Seek. (And gone from taking 97% of the total resources to 50%)

The other relevant numbers are I/O on my TestTwo seek, which now only cost an estimated 0.003125 and estimated CPU cost which is 0.0001603. Any way one can view it, those numbers are drastically lower then prior, and thus my index changes has had influences.

Of course there are costs associated with indexes, both in terms of size and the added overhead to updating/maintaining the indexes et al, so you can’t just stuff all you want into an index. So whether or not altered indexes are beneficial in a specific situation, is something which must be analyzed on a case by case basis. However this example does go to show that there is major benefits in the lookup though, and indexes are very important (of course) when it comes to databases… hence why I’m interested in them myself :)

Leave a Reply

(required)

(required)

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>