Using the built-in System Health session

When Microsoft introduced Extended Events (XE) in 2008, they also gave us a built-in XE session called system_health (though it’s worth noting that in 2008 MS hadn’t yet provided us a GUI for this so it becomes most useful in 2012 and beyond).

This is a great little tool. I mainly use it for troubleshooting deadlocks as it logs all the information for any deadlocks that occur. No more having to mess about making sure specific trace flags are enabled to ensure deadlock information is captured in the error log.

It also captures the SQL text and Session Id (along with other relevant data) in a number of other scenarios you may need to troubleshoot:

  • Where an error over severity 20 is encountered
  • Where a session has waited on a latch for over 15 seconds
  • Where a session has waited on a lock for over 30 seconds
  • Sessions that have encountered other long waits (the threshold varies by wait type)

There are other events captured too, you can see the full list here:

https://docs.microsoft.com/en-us/sql/relational-databases/extended-events/use-the-system-health-session?view=sql-server-2017

You can find the system_health session here in SSMS under your server instance:

system_health1

Just double click on the event file to view historical data.

The view that comes up immediately can seem non-intuitive to work with (do I just have to scroll through thousands of events looking for the one I want?):

system_health2

If you know the type of event you are looking for though, you can right-click on the “name” column and select the option to group by the values in that column. Then you see something more like this:

system_health3

With the example of looking at deadlocks (what I mostly use this for) I can then just expand that group and look for the one I want.

Or you can right-click and use “Find in Column” – or “Choose Columns” to add extra columns you might want to search in. For instance, I might want to see if it’s captured any information about why my backups are being delayed so I can add the “sql_text” column, order by that and then search for “backup”:

system_health4

Once an event is selected it will show me the additional information gathered in the bottom pane:

system_health5

Like I say, it’s pretty useful. My only issue is that by default it captures only 20MB of data, which on a busy system can mean events are only kept for a couple of days. So, I often want to increase the retention. I find the easiest way to do that is to right-click on the session and select “Script as CREATE to New Query Window”. I then edit the script to change the number of roll-over files to 20 (from 4):

system_health6

You can then delete the existing system_health session and re-create it from the script – you do have to remember to right-click on it in the GUI and start it again.

One great thing is that when you do this is that you don’t lose the events already saved to file, as the files are retained and continue to be accessible from your newly created session.

All in all, a handy little tool.

Got a problem or embarking on a SQL Server project and want some help and advice? I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

8 thoughts on “Using the built-in System Health session

    1. Hi, Extended Events was introduced in SQL 2008, so you will have it, but unfortunately there is no GUI for it even if you’re connecting from a current version of SSMS). You still have the system_health session, but it only appears to be using an in memory target so probably won’t hold a lot of data. You can view the data captured in XML form with the following query:
      SELECT
      s.name,
      t.target_name,
      CAST(t.target_data AS XML) AS [XE_Data]
      FROM sys.dm_xe_session_targets t
      INNER JOIN sys.dm_xe_sessions s
      ON s.address = t.event_session_address

      There’s probably a way to save that and be able to load it into the current GUI, but haven’t quite figured that out.

  1. Most of the entries in the xml_deadlock_report group are not deadlocks, i.e. no deadlock victim. I wonder why these entries are created and listed as deadlocks?

      1. These XML ‘deadlock’ entries typically show 4 processes each with the same spid but different ecids.

Leave a Reply