Simple Extended Events for an SSMS process

SQL Server generates wait stats. If we have an SSMS session open, we can capture the events associated with activity in that session by pushing them into an Extended Events file, whose XML has nodes like this:

<event name="wait_info" package="sqlos" timestamp="2015-01-27T18:59:17.727Z">
  <data name="wait_type"> ... </data>
  <data name="opcode"><value>1</value><text>End</text></data>
  <data name="duration"><value>4</value></data>
  <data name="signal_duration"><value>4</value></data>
</event>

First, get the SPID of the session with:

 SELECT @@SPID;

Get any current event sessions with

SELECT * FROM sys.server_event_sessions

Create a session with a different name (or drop the existing session if the name you want shows up above).
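A sketch of the drop, assuming the session is named EE_WaitStats to match the target file names:

```sql
-- Drop the existing session so the name can be reused (EE_WaitStats is an assumed name)
DROP EVENT SESSION [EE_WaitStats] ON SERVER;
```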

CREATE EVENT SESSION [EE_WaitStats]   /* example session name */
ON SERVER
ADD EVENT sqlos.wait_info
 (WHERE sqlserver.session_id = 53
    /* session_id of the connection, from SELECT @@SPID */)
ADD TARGET package0.asynchronous_file_target
 (SET FILENAME = N'C:\EE_WaitStats.xel',
  METADATAFILE = N'C:\EE_WaitStats.xem')
WITH (MAX_DISPATCH_LATENCY = 1 SECONDS);

Start the session, run the query, then stop the session.
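The start/stop steps can be done with ALTER EVENT SESSION; a sketch, again assuming the session name EE_WaitStats:

```sql
ALTER EVENT SESSION [EE_WaitStats] ON SERVER STATE = START;

-- query here

ALTER EVENT SESSION [EE_WaitStats] ON SERVER STATE = STOP;
```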

Count the events:

SELECT COUNT (*) FROM sys.fn_xe_file_target_read_file
 ('C:\EE_WaitStats*.xel', 'C:\EE_WaitStats*.xem', NULL, NULL);

The output is twice the number of waits, since each wait fires two events: one at wait start and one at wait end.
The data is captured in the XML data files. Jonathan Kehayias has a script to pull out the data (note the wildcard in the name of the import file):

CREATE TABLE #RawEventData (
    event_data XML);

-- Read the file data into an intermediate temp table
INSERT INTO #RawEventData (event_data)
SELECT CAST (event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file (
    'C:\EE_WaitStats*.xel', 'C:\EE_WaitStats*.xem', NULL, NULL);

Use XQuery to pull the data out of the XML in the temp table and display it:

SELECT
    waits.[Wait Type],
    COUNT (*) AS [Wait Count],
    SUM (waits.[Duration]) AS [Total Wait Time (ms)],
    SUM (waits.[Duration]) - SUM (waits.[Signal Duration])
        AS [Total Resource Wait Time (ms)],
    SUM (waits.[Signal Duration]) AS [Total Signal Wait Time (ms)]
FROM
    (SELECT
        event_data.value ('(/event/@timestamp)[1]', 'DATETIME') AS [Time],
        event_data.value ('(/event/data[@name="wait_type"]/text)[1]', 'VARCHAR(100)') AS [Wait Type],
        event_data.value ('(/event/data[@name="opcode"]/text)[1]', 'VARCHAR(100)') AS [Op],
        event_data.value ('(/event/data[@name="duration"]/value)[1]', 'BIGINT') AS [Duration],
        event_data.value ('(/event/data[@name="signal_duration"]/value)[1]', 'BIGINT') AS [Signal Duration]
     FROM #RawEventData
    ) AS waits
WHERE waits.[Op] = 'End'
GROUP BY waits.[Wait Type]
ORDER BY [Total Wait Time (ms)] DESC;

-- Cleanup
DROP TABLE #RawEventData;

The output looks like this:

Wait Type            Wait Count  Total Wait  Total Resource  Total Signal
                                 Time (ms)   Wait Time (ms)  Wait Time (ms)
-------------------- ----------- ----------- --------------- ---------------
LATCH_EX             4919        15783       15693           90
LATCH_SH             18          4399        4398            1
CXPACKET             16          3039        3038            1
PAGEIOLATCH_SH       813         468         457             11
PAGELATCH_SH         246         113         109             4
WRITELOG             99          71          66              5
PAGELATCH_UP         203         65          63              2
LOGBUFFER            123         58          57              1
WRITE_COMPLETION     82          55          55              0
PAGEIOLATCH_UP       35          27          27              0
IO_COMPLETION        80          25          25              0
SQLTRACE_LOCK        1           6           6               0
PAGEIOLATCH_EX       4           3           3               0
CMEMTHREAD           17          1           0               1
SOS_SCHEDULER_YIELD  6           0           0               0

knee-jerk performance tuning

Paul Randal on Knee-jerk performance tuning

Lock (LCK_M_XX) waits can be due to blocking (e.g., a missing nonclustered index causing table scans), transactions taking a long time to complete, or mirroring delaying commits. Generally, look for the thread at the head of the blocking chain.

ASYNC_NETWORK_IO is usually NOT network related. PR suggests it should really be called "WAITING_FOR_APP_ACK": SQL Server has sent data to a client and is waiting for the acknowledgement. For example, SSMS is a client; if it is doing RBAR (row-by-agonizing-row) processing of a result set, it will generate a lot of ASYNC_NETWORK_IO waits.

OLEDB waits could be linked servers OR the post-2005 DMVs; DBCC CHECKDB also uses the OLE DB rowset mechanism internally.

SOS_SCHEDULER_YIELD: you usually see high CPU usage with it, but it is not always CPU pressure or spinlock contention.
Thread scheduling is non-preemptive (done by SQLOS, not the Windows OS).
A thread is running, suspended, or runnable. A thread that needs a resource is suspended and goes on the Waiter List (resource wait time), which is unordered. After being signaled, it moves to the Runnable Queue, which is FIFO, until it is running again (signal wait time).
A running thread that exhausts its 4 ms quantum moves to the bottom of the Runnable Queue, and that time is logged as SOS_SCHEDULER_YIELD. That is not CPU pressure, just sustained activity by a long-running thread.
A thread that uses lots of different resources will soon wait on one of them, either to acquire it or because something else has it locked, and so generates no SOS_SCHEDULER_YIELD. But if you are doing a big table scan and the whole table is in memory, you cannot finish within a 4 ms quantum, so the thread yields over and over. Look for a dropped nonclustered index, out-of-date statistics, or a bad query plan.
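One way to distinguish sustained activity from genuine CPU pressure is the depth of each scheduler's runnable queue; this query is a sketch of mine, not from the original notes:

```sql
-- Sustained nonzero runnable_tasks_count across schedulers suggests real CPU pressure
SELECT scheduler_id, current_tasks_count, runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
```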

PAGEIOLATCH_XX waits: _SH (shared) means a page of data needs to come from disk to be read (the most common case); _EX or _UP (exclusive or update) means a thread is waiting for a page to arrive from disk into the buffer pool so it can be changed.
This is not always an I/O problem. If the count of waits matches the baseline but the duration has increased, the cause could be I/O-side: a) misconfiguration or malfunction in the I/O subsystem, b) a slow network path to storage, c) contention from other I/O activity, or d) replication/mirroring.
PR: typically both the number and duration of waits increase from the baseline. That often means too many I/Os, i.e., a SQL problem, not an I/O-subsystem problem. A logical I/O is a request to the buffer pool for data. If the page is in the buffer pool, the Access Methods get a pointer and read it; otherwise a physical I/O is issued and the thread's wait shows up as PAGEIOLATCH_SH.
Ideally, physical I/O brings everything into memory once and there are no more PAGEIOLATCH_SH waits, just writes to flush dirty pages.
A sudden increase in reads means a change in workload and/or in available buffer pool. Possible causes: the Windows OS making demands on memory, a bloated plan cache, query plans doing table/clustered-index scans (not seeks) because of increased workload, parameter sniffing, a nonclustered index change (dropped or altered), or an implicit conversion.
Look for CXPACKET and PAGEIOLATCH_SH waits together: that indicates large parallel table/clustered-index scans.
Use sys.dm_os_waiting_tasks to see which query and query plan is running when the waits occur.
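A minimal sketch of that sys.dm_os_waiting_tasks check, joined to the current request text (the column choices and the session_id > 50 filter are my assumptions):

```sql
-- What each user task is waiting on right now, and the query it is running
SELECT
    wt.session_id,
    wt.wait_type,
    wt.wait_duration_ms,
    wt.blocking_session_id,
    st.text AS query_text
FROM sys.dm_os_waiting_tasks AS wt
JOIN sys.dm_exec_requests AS er
    ON wt.session_id = er.session_id
CROSS APPLY sys.dm_exec_sql_text (er.sql_handle) AS st
WHERE wt.session_id > 50;   -- rough filter for user sessions
```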

When we need data that is not in memory, we read it from disk. Page Life Expectancy (PLE) is how long a page stays in memory, in seconds; equivalently, it reflects the pressure on the buffer pool to make free space for new data.
A PLE of 300 means that, on average, the whole buffer pool turns over every 5 minutes. That may be acceptable for low memory, but should be much higher with more RAM.
Jonathan Kehayias's suggestion: (buffer pool memory in GB / 4) x 300. For example, a 64 GB buffer pool gives (64 / 4) x 300 = 4,800 seconds.
PR: measure PLE when the system is running correctly, and use that as the baseline.
DBCC CHECKDB and index rebuilds drop PLE; the memory CHECKDB steals from the buffer pool can be excessive.
Monitor PLE with the Buffer Manager:Page life expectancy counter, but that is misleading on NUMA hardware: NUMA gives each node its own buffer pool, and the Buffer Manager value is just the average across nodes, so use the per-node Buffer Node:Page life expectancy counters instead.
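The per-node values can be read from sys.dm_os_performance_counters; the LIKE is needed because the object_name prefix varies with the instance name:

```sql
-- One row per NUMA node (instance_name is the node id)
SELECT [object_name], counter_name, instance_name AS numa_node, cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND [object_name] LIKE '%Buffer Node%';
```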
A large (new) query will cause a drop in PLE.
Changes in PLE are commonly due to the causes above; if the cause is not known:
1) check sys.dm_os_waiting_tasks for tasks waiting on PAGEIOLATCH_SH
2) check sys.dm_exec_query_stats for queries with lots of physical reads
3) check the timing against scheduled tasks / SQL Agent jobs
4) check sys.dm_exec_query_memory_grants for queries with large execution memory grants.
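A common shape for step 2), top statements by physical reads; a sketch under my own column choices, not from the notes:

```sql
-- Statements that have done the most physical reads since their plans were cached
SELECT TOP (10)
    qs.total_physical_reads,
    qs.execution_count,
    SUBSTRING (st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH (st.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text (qs.sql_handle) AS st
ORDER BY qs.total_physical_reads DESC;
```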

Just Add an SSD
SSDs do not have the I/O latency of spinning disks (head movement, rotational delay).
1) "Always put logs and tempdb on SSDs": only if they are actually the bottleneck.
But if your environment is read-heavy, the transaction logs are not the bottleneck, and tempdb may or may not be heavily used;
in those cases, the extra speed does not speed up processing.
Use the sys.dm_io_virtual_file_stats DMV to find I/O bottlenecks. There are numerous causes of slow data-file reads and log-file write latencies.
2) "SSDs are fast, so don't sweat index fragmentation." But a) getting to a fragmented index faster does not mitigate the badness of the fragmentation itself*, b) fragmented indexes have lots of reclaimable free space, which is being wasted on expensive media, and c) Jonathan Kehayias found an observable performance penalty.
3) SSDs should still be mirrored (RAID-1 pairs). Putting two 'volumes' on one physical SSD is not a solution.
4) SSDs do fail: consumer-grade SSDs have an MTBF of about 2 million hours, vs. about 1.5 million hours for consumer spinning disks.
5) Comments note that SSDs are fast for sequential writes but about 1/10 as fast for sequential reads (still ~10x faster than hard disks).
6) Comments note that you can get MORE deadlocks on SSDs because reads complete closer together, especially under load.
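The sys.dm_io_virtual_file_stats check from point 1) can be sketched like this (the latency calculation is the standard stall-per-operation ratio; no thresholds are implied):

```sql
-- Average read/write latency per database file since the last restart
SELECT
    DB_NAME (vfs.database_id) AS database_name,
    mf.physical_name,
    vfs.io_stall_read_ms / NULLIF (vfs.num_of_reads, 0) AS avg_read_latency_ms,
    vfs.io_stall_write_ms / NULLIF (vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats (NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON vfs.database_id = mf.database_id
   AND vfs.file_id = mf.file_id
ORDER BY avg_write_latency_ms DESC;
```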

* Fragmentation is caused by "page splits from pages needing free space for a random insert or row size increase. A page split generates the same amount of transaction log, resource usage, and potential thread waits regardless of where the data/log files are stored." (PR)