startup and cache

All activity (CRUD: create, read, update, delete) in MS SQL is done in memory – RAM.

When MS SQL is started/restarted, there is nothing (no data) in memory.  When you submit a query, data is located on the dsik read into memory, and used for the query, perhaps with changes or perhaps with a delete.  At some point the data in memory is pushed back to the disk, but the data in memory is kept there (cached) forever.  Forever is defined as until the server is restarted, or until the engine determines that there is other data those inclusion in cache is more valuable (related to the cost of reading it in again and again) that our first query.  Then it (or some other cached and underused data) will be dumped.

Data as records, but also indexes and query plans are cached.

Data (records, indexes, logs) are stored in 8K pages.  Nothing less than a single page will be cached.  Tables and logs and indexes, as such, are not cached, only pages containing data.  All data on a single page belong to one object (there is no mixing of records/tables and indexes on a page even if the page is not full).

On a page, data that is from a heap is not stored in any particular order.  To find a records, we read or scan all data.  Finding a match does not mean that we can’t have more matches, so we need to read all the data in a heap (possibily multiple pages of data) before we are finished.

If data is stored with a clustered index (CI) it is stored in a particular order (which might or might not be ordered on an individual page, but the order on the page IS constrained by the xxx).  When we search using the Clustered index, we know where the first record that matches the query is found and we know when we have found everything that is in the index.  The clustered index is on some number of columns but includes ALL the data: the index is the table.

If there is a non-clustered index (NCI) on a heap or a clustered index table, there is created a copy of the table, containing a reduced number of columns and it is physically ordered based on the indexed columns.  Other columns might be included as part of the index or as included columns or a covered index.  There is a pointer to the location of each records in each index entry for the heap or the clustered index.  If a query can be answered by data only from the NCI, there is no need to go to the actual data.  This would not happen if we use select (*)…because only the actual data can have all the columns, OR we have an NCI with all the data, which is wasteful.

Again, the heap, the NCI and the CI objects are on separate pages.  If we have a NCI on a heap or on a CI, it has its own data pages

Use DMF to check for the number of pages and fragmentation

SELECT * FROM sys.dm_db_index_physical_stats
 (DB_ID(N'Equella'), OBJECT_ID(N'dbo.bookmark'), NULL, NULL , 'DETAILED');