locks and indexes

Klaus Aschenbrenner has a post where he creates two tables, and two sessions.  Session one updates table 1 and next selects from table 2.  Session two updates table 2 and selects from table 1.

Without an index on the tables, session one has to do a table scan on table 1; similarly session two and table 2.  The table scan creates an exclusive lock on one row in the tablde the table, and that prevents the select statement in session two select occurring because without an index, it has to read the entire table and ask for a shared lock  – session two can’t get the shared lock because of session one’s exclusive lock.

If there was an index on table 1, session two “should’ be able to find the right value to read – unless it is the  row being updated.  Similarly, session one has to do a table scan to find the correct record to select in Table 1, and can’t since it wants to read a row which is locked by session two.

In another post, he has two sessions running against the same table.  The first session, or first query, gets an update lock while looking for the correct row.  That lock is converted to an exclusive lock, the second query blocked, the data updated, the lock removed and the other query runs.  If the initial lock, for either/both queries was a shared lock, there would not be a block, however a shared lock can’t be converted to exclusive if there is another shared lock on that table\row.  If the first query immediately created an exclusive lock , you would block beguine select queries for the entire update transaction.  The update lock allows select queries, can be converted to exclusive at the right time, but makes the second update query wait until the exclusive lock is released since you can’t have two update locks at the same time.

order by needed

If you run a query against a table, and run it again, the output will probably be displayed the same way. This order is not guarenteed (without an order by clause). but face it, its going to happen. Again and again. Unless something changes – Kenneth Fisher wrote a blog about this in SQLServerCentral.

He creates a table, put data in it, and then a primary key on the locationID and a NCI on state+zip code.

A query on (state = ‘TX’) will return data. It happens to be ordered by state+zip code. On a system that doesn’t change, we can reasonably anticipate that this will be the same order for subsequent queries.

However, lets now insert a lot of “state = TX” data. The first query used a seek to get the 10-100 or so TX entries. The seek was using the NCI of State and ZIP – the output was ordered in the order of the index. If we insert enough TX to the table, the Query Optimizer decides that a clustered index scan will be more efficient, and the order of the output table, enforced by the clustered index, is by locationID.

Same query, same schema. Assuming “SET AUTO_UPDATE_STATISTICS ON” is set, then enough inserts will force a stats update, otherwise the Execution Plan changed.  Running the two queries in Show Plan mode will, in this case, show the change of plan. Inserting enough data MIGHT change the output returned, and also might change the order returned.

In our case, we are building tables that will be either used as reports (display matters) or as source for other queries – applications. In the second case, the order should probably be most efficiently handled by the application – less data, multiple queries, possible changes in order anyway. The exception would be when we are debugging or testing. We want to see items that are sorted or ordered together. We should consider removing the order by after that if not needed. Also note that (“select top 1000 *”) is quick. (“select top 1000 * … order by …”) is effectively (select ALL * … order by and FINALLY display first 1000).  Which is about the same speed as (“select *”) – which is slow.

Finally, even if our output table is in the “right order’, the QA will not know that unless there is an index that will indicate the order to the QA.  Angels, and QA, have no memory of how that intermediate table was generated, so doing the extra order by work without the ‘hinting’ index, is not useful.