simpletable and index

If i had a boat, i’d sail it on the ocean.  And if i had a pony, i’d heap it on that boat

If you have a table, and data therein, it is a heap.  If you have a non-clustered index on said table, it is a heap – with an index.  The index is a table, with one column(s) the index on the table, and the second column a pointer to the location of the data in the table.

If you have a clustered index, you no longer have a heap.  You have a table that is maintained in order of the index, which can be unique and which can be the primary key.  The CI (clustered index) is the table and the table, having a clustered index, is the clustered index.

If you have a query and use the Clustered index (or rather, the Query optimizer decides to create a query plan that uses the clustered index), you get the results of the query back as data from the table.  Assuming the query is on the column that the CI was created on, the QA will (certainly) use the CI and the records that satisfy the query will be quickly returned.  This will happen if we have select (*) or select (index column).  If we query with predicates that are NOT in the CI, there might not be any value in using the clustered index, and QA might well decide on a table scan to find the matching data.

Basically the same thing is true for a heap or CI table if the query does not involve the indexed column.  Use the index if it improves performance, but if the index is on Last Name and you are searching for everyone in a town, the CI which will maintain the order of the records, doesn’t help the present query.  Table Scan Time.  Query on last name, and the CI will very efficiently find the records with the last name, and having located them (hopefully in the data cache), return the data requested.

Assume a non-clustered index on last name.  Query on something other than last name and the index is of no value.  Query on last name, and we quickly locate the last name in the index, quickly find those locations in the table where the required records are found, find them and return the data.

What happens with a non-clustered index, a query on last name, and the only data wanted is the last name (or count lastname).  Once the index is consulted and the records identified, we don’t actually need to access the records since we have enough information to satisfy the query.

For this reason, it is useful to create non-clustered indexes on multiple columns – if we have a index that will satisfy common queries, we can get quick results without having to go to the data – the index informaiton is enough.  There is a cost in index maintenance which increases for multiple column indexes.  As well as more space used for storage.

We can also create covering indexes, which have non-indexed fields associated with the non-clustered index.  (Can’t be the case in a Clustered index, since ALL COLUMNS/DATA already are a part of the CI).  So we have a non-clustered index on firstname and lastname (does the order matter?) and maybe middle initial.  We cover the “dob” column in the index.  This means that we can rapidly find records based on names, and if we frequently are interested in the “dob”, this index will be useful for queries that involve details about the names and also names and date of birth.  If we want information on the other columns, the ‘dob’ column in the index will not help, and we need to pull the records after using the index to locate them.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s