rebuild indexes

We wanted tor rebuild the indexes on a database on one of our clustered db servers.  The options were to use a management plan or manually use the SSMS GUI to rebuild or run it from the query window.

To determine how many indexes we had, this query was run

SELECT * FROM sys.dm_db_index_physical_stats (6, NULL, NULL, NULL, NULL);
GO

where 6 was the db_id for the database.  That query  returned 264 items, like thhis:

database_id
object_id
index_id
partition_number
index_type_desc
alloc_unit_type_desc
index_depth
index_level
avg_fragmentation_in_percent
fragment_count
avg_fragment_size_in_pages
page_count
avg_page_space_used_in_percent
record_count
ghost_record_count
version_ghost_record_count
min_record_size_in_bytes
max_record_size_in_bytes 
avg_record_size_in_bytes 
forwarded_record_count 
compressed_page_count 6 5575058 2 1 NONCLUSTERED INDEX IN_ROW_DATA 2 0 90 10 1 10 NULL NULL NULL NULL NULL NULL NULL NULL NULL
6 30623152 1 1 CLUSTERED INDEX IN_ROW_DATA 2 0 3.92156862745098 9 5.66666666666667 51 NULL NULL NULL NULL NULL NULL NULL NULL NULL

A smaller query on a different database is this and returns the following 

SELECT database_id, object_id, index_type_desc, avg_fragmentation_in_percent 
FROM sys.dm_db_index_physical_stats (5, NULL, NULL, NULL, NULL);
GO database_id object_id index_type_desc avg_fragmentation_in_percent
5 21575115 HEAP 0
5 53575229 CLUSTERED INDEX 99.8394863563403
5 53575229 NONCLUSTERED INDEX 99.6
5 85575343 CLUSTERED INDEX 0
5 85575343 NONCLUSTERED INDEX 0
5 117575457 CLUSTERED INDEX 0
5 2073058421 CLUSTERED INDEX 0
5 2105058535 CLUSTERED INDEX 99.1930948480329

Fine.  We got data to compare afterwards

To rebuild the indices, run this t-sql statement

use MY_DB
EXEC sp_MSForEachTable ‘Print ”Rebuild index on: ?”; ALTER INDEX ALL ON ? REBUILD;’

This does a foreach over each table in the database MY_DB and executes ALTER INDEX ALL on TABLENAME REBUILD

On the target database, the query was started.  The server was enterprise so the query could run on-line.  However after 30 minutes, the application owner complained that the application (sharePoint) was effectively unreachable/very sluggish, so i stopped the query.  It took nearly one hour to rollback the query – which had gotten through about 90 or so tables. Note that many tables had multiple indexes.  

The CPU never got much above 5% but the locking of the tables and the rebuilding of the indexes pretty much made everything not work.  

This query

USE myq1_site_pair
GO
SELECT so.name AS TableName, si.name AS IndexName, si.type_desc AS IndexType
FROM sys.indexes si
JOIN sys.objects so ON si.[object_id] = so.[object_id]
WHERE so.type = 'U' --Only get indexes for User Created Tables
AND si.name IS NOT NULL
ORDER BY so.name, si.type

returned 229 entries that looked like this:

TableName IndexName IndexType
AllDocs AllDocs_ParentId CLUSTERED
AllDocs Docs_IdLevelUnique NONCLUSTERED
AllDocs AllDocs_Url NONCLUSTERED
AllDocStreams AllDocStreams_CI CLUSTERED
AllDocStreams AllDocStreams_RbsId NONCLUSTERED

from this, we can do the following in a script

USE MyQ1_Site_Pair;
GO
ALTER INDEX WorkflowAssociation ON dbo.WorkflowAssociation_Parent
REBUILD;
GO

 The typical statement is to rebuild indexes over 30% fragmentation and if more than 5% (less than 30%) reorganize it.  The syntax is identical, the different in the alter statement is “reorganize” and not “rebuild”

no more tables

Forgetaboutit – when we talk about data, don’t think of it being in tables.  Instead, think of Heaps, Clustered Indexes and Non-clustered indexes.  A table without a CI is heap, a table with a clustered index is … a clustered index, and an index on either a heap or a non-clustered index on a clustered index is a non-clustered index.  All clear?

All of these are database objects and are stored in pages of data, probably in series of 8 pages (of 8K each) called extents.

Finding data in a heap is a table scan (why not a heap scan?), finding data in a clustered index, assuming we have a where clause that references the index is an index lookup or an index scan.  If we want something that is not in the index, having the index does not help us find the item, so back to table scan. Finally, if we find the item in a clustered index, there we are.  If we find the item in an non-clustered index, AND the data we want is not in the index or in a covering index, we will get a reference to the location in the table (…sorry…) and we can look it up.  However, if our index contains that data we want, we can just search the index to find our information and NOT have to go to the original NCI or Heap.  Faster (and possibly less I/O) that way.

table variables and indexes

MSSQLTips has an article comparing table variables and temporary tables and their performance.  The article is interesting for the conclusion and for the methodology (use profiler).

The findings are that for large data sets, temp tables are better.  For smaller data, not much difference.  However there is a difference because you can put a non-clustered index on a temp table and create statistics, so when columns involving the index are involved 1) selects on the temp table are faster and 2) updates, deletes and inserts can be SLOWER – due to the second action needed to maintain the index entry.

These tests were run using SQL Profiler and the trace output had the text of the query, and columns for CPU, Reads, Writes and duration.  The columns displayed (in some cases such as select only text and CPU were deemed relevant).

backup and restore with move

There is a database on a SQL server.  Duh.

They need a copy of it to do some testing.  Initially we were going to backup and restore it.

Segue: we don’t backup.  We use DPM.  But all my DBA-Heros say “use the CLI, Luke!”  In fact, even using the GUI is thought declasse.

So i need to backup the file and restore it.  But of course i can’t restore it since it is already there.  OK, so i restore it with a different name.  I have to supply the name on restore anyway.  So

use master
go
backup database housingDirector_live
to disk = 'E:\housingDirector_live.bak'

Put the backup in the root of data (E:\) so i don’t have to go searching for it and it is shorter to type.  It is a file and not a device, so i have to supply the path as well as the name.

restore database housingDirector_test from disk = 'E:\housingDirector_live.bak'
Msg 1834, Level 16, State 1, Line 1
The file 'R:\MSSQL11.MSSQLSERVER\MSSQL\DATA\housingDirector_live.mdf' cannot be overwritten. It is being used by database 'housingDirector_live.'.
Msg 3156, Level 16, State 4, Line 1
File 'AdventureWorks2012_Data' cannot be restored to 'R:\MSSQL11.MSSQLSERVER\MSSQL\DATA\housingDirector..mdf'. Use WITH MOVE to identify a valid location for the file.
Msg 1834, Level 16, State 1, Line 1
The file 'R:\MSSQL11.MSSQLSERVER\MSSQL\DATA\housingDirector_live._log.ldf' cannot be overwritten. It is being used by database 'housingDirector_live..
Msg 3156, Level 16, State 4, Line 1
File 'AdventureWorks2012_Log' cannot be restored to 'R:\MSSQL11.MSSQLSERVER\MSSQL\DATA\housingDirector_live._log.ldf'. Use WITH MOVE to identify a valid location for the file.
Msg 3119, Level 16, State 1, Line 1
Problems were identified while planning for the RESTORE statement. Previous messages provide details.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.

Huh?  I changed the name…..  I hate it when something “terminates abnormally.”  That’s just not … RIGHT.

Using restore filelistonly to the backup file says that the backup knows about the correct location to restore the files even if the database file name is different.

restore filelistonly from disk = 'E:\housingDirector_live.bak'
LogicalName              PhysicalName                               Type FileGroupName Size
HousingDirector_Live     F:\MSSQL\Data\HousingDirector_Live.mdf     D    PRIMARY       2283470848
HousingDirector_Live_log E:\MSSQL\Data\HousingDirector_Live_log.ldf L    NULL          722468864

So use move to either move the files to another location or rename the files after they are extracted.

(we can get the same file paths from the original locations more simply with t-SQL and this query;)

SELECT 
  DB_NAME([database_id]) [database_name]
, [file_id]
, [type_desc] [file_type]
, [name] [logical_name]
, [physical_name]
FROM sys.[master_files]
WHERE [database_id] IN (DB_ID('HousingDirector_Live'),DB_ID('HousingDirector_Live_Log'))ORDER BY [type], DB_NAME([database_id]);

That should give us something like

database_name file_id file_type logical_name physical_name
HousingDirector_live 1 ROWS HousingDirector_Live C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\HousingDirector_Live.mdf
HousingDirector_live_log 2 LOG HousingDirector_Live_Log C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\HousingDirector_Live_Log.ldf

So, next restore again with Move and the new name.

restore database housingDirector_Test 
from disk = 'E:\housingDirector_live.bak'
with move 'HousingDirector_Live' 
to 'F:\MSSQL\Data\HousingDirector_Test.mdf',
move 'HousingDirector_Live_log' 
to 'E:\MSSQL\Data\HousingDirector_Test_log.ldf'
Processed 278704 pages for database 'housingDirector_Test', file 'HousingDirector_Live' on file 1.
Processed 5 pages for database 'housingDirector_Test', file 'HousingDirector_Live_log' on file 1.
RESTORE DATABASE successfully processed 278709 pages in 35.840 seconds (60.753 MB/sec).

More on MOVE.