Extended Events Made Easy – Sorting, Grouping and Aggregation

When I’m using Profiler to analyse performance issues I often save the results to a table, or load a trace file into a table, so that I can query the data. Often this involves aggregating the values for particular queries so that I can see which are the most resource-hungry.

This is by no means a difficult process, but with Extended Events (XE) it’s arguably even easier.

Let’s start with a session from XEvent Profiler like we saw in my last post on this subject:

Extended Events Made Easy: Using XEvent Profiler

I’ve set a workload running and captured the Live Data. The workload consists of some arbitrary queries I’ve written, each running in a loop in its own query window.

After a few minutes I stop the session so I can do my analysis:

XESortingetc1

I’ve got a couple of hundred events. Let’s say I want to quickly find the longest running individual query. In the Live Data viewer you can just right-click on the column name and select sort ascending or descending. This is pretty obvious stuff, but I thought it was worth highlighting as you can’t do the same in Profiler – there you have to save to a table first then query the table.

So, I’ve sorted descending on Duration:

XESortingetc2

You can see the biggest values shown are the aggregated values for the existing_connection and logout events. In retrospect it might have been better to create an XE session that didn’t include these events. However, I look down the list and quickly find my longest running query (highlighted in blue).

The duration specified – 33,161,516 – is in microseconds. So that is just over 30 seconds.

Of course, now I’m thinking – that’s my longest running, but what I really want is the one that has the most CPU – and not just from one execution but the total from all the executions that happened in my sample time frame.

This is where I can use grouping and aggregation. I can right-click on any of the columns and select “Group by this column”. I’m going to do that for the TextData column:

XESortingetc3

We’re now seeing all the events from the sample time period, grouped together based on TextData. The number on the right of each one is the count of events in that group.

Now I can add some aggregations. After grouping I can still see the original column headings at the top, so I just right-click on cpu_time, select “Calculate Aggregation” and choose “SUM”:

XESortingetc4

You’ll see you can also pick the other standard aggregations, e.g. MIN, MAX and AVG.

You can have multiple aggregations going on, so I’ll do the same for duration and logical_reads.

Once I have my aggregations in place, I’ll want to sort the results. It’s the total cpu_time I’m most interested in so I right-click on that column and now I have the option to “Sort Aggregation Descending”:

XESortingetc5

After those steps this is what I end up with (I’ll just show the top few rows):

XESortingetc6

Admittedly this isn’t the most beautiful of views, but it was quick and easy to get to and it gives me the information I need. I can see my most CPU hungry queries and I can make a plan of action.
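
If you do want the same numbers in T-SQL – say against session data you’ve exported to a table as per my earlier post – the equivalent aggregation is a simple GROUP BY. This is just a sketch: the table name is whatever you chose for the export, and the column names will match the fields you captured.

SELECT
   TextData,
   COUNT(*) AS EventCount,
   SUM(cpu_time) AS TotalCPU,
   SUM(duration) AS TotalDuration,
   SUM(logical_reads) AS TotalLogicalReads
FROM dbo.MyXESessionData
GROUP BY TextData
ORDER BY TotalCPU DESC;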

Incidentally, the top query was:

UPDATE dbo.Test SET SomeText = 's' + SomeText WHERE SomeText LIKE 's%'

If this wasn’t a totally arbitrary set of queries for demo purposes I might be considering putting an index on the SomeText column.
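
Purely for illustration (this was just a throwaway demo table), the sort of index I mean would be something like:

CREATE INDEX IX_Test_SomeText ON dbo.Test (SomeText);

A LIKE predicate with only a trailing wildcard, such as ‘s%’, can seek on an index like that rather than scanning the whole table.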

That’s all I wanted to cover in this post. Hopefully you can see that XE data is really easy to work with: you don’t have to get involved in XML querying, and you can perform quick analysis without even having to write SQL – much as we all love writing SQL.

In the last couple of posts, I’ve looked at how easy it is to get started using XE with the XEvent Profiler. For the next post in this series I aim to demonstrate that it’s also really easy to create your own customizable Extended Events sessions.

Previous posts in this series:

There’s Still a Place for SQL Server Profiler

Using the built-in System Health session

Exporting Extended Events Session Data to a Table

Extended Events Made Easy: Using XEvent Profiler

T-SQL Tuesday 107 – Viral Demo

tsql2sday150x150

For this month’s T-SQL Tuesday, Jeff Mlakar invites us to talk about “a project you worked on or were impacted by that went horribly wrong”:

https://www.mlakartechtalk.com/t-sql-tuesday-107-invitation-death-march/

I’m not sure the story I’m about to share is exactly what Jeff was looking for, but when you say failed project, this one usually comes to mind.

This is from one of my first roles as a developer, back when I was writing VB code and before I got into SQL Server. This was even before .NET. At the time I was working for an engineering company that dabbled in a few pieces of very specific software for the industry. I’d taken the job so I could develop my coding skills by working with a proper software development team – only to discover when I started that I was the development team. The others had left.

So… I worked on this piece of software. It was one of those productivity tools that had a very specific use in a very specific type of project. There was nothing else that did what it did in Windows. Without it you were required to do a lot of hand-cranked analysis using Fortran, or if you were really unfortunate, paper and a pencil. The licences sold for big bucks per seat – but there were only about 20 customers worldwide for whom it might be useful.

I’d been working on a shiny new version. As part of that, I was tasked with creating a demo that we could send to all those clients. The main thing was that it had to have full functionality but be limited to working with a sample dataset. The company was in litigation at the time with one client who they’d given a copy for evaluation, but who’d then used it to complete a big piece of work within a six-figure contract. They didn’t want a repeat of that.

My boss wanted to create an auto-run menu for the CD – we’d only just moved over to CDs from having to create 10 floppy disks for each install. An auto-run was the latest and greatest pinnacle of slickness. I gave him a few lessons in creating forms and buttons in VB and off he went.

My boss burnt the CDs one Friday, including his fancy menu executable. He packaged them up with the relevant sales brochures and sent them out. He then went off on holiday for a fortnight.

I came in on the Monday morning to an urgent email from one of our senior managers. Apparently one of the recipients of the CD was reporting that they had put it in a desktop and that it had installed a virus on their system. Testing verified that they were correct. The auto-run menu program did indeed have a fairly nasty virus hiding in the executable, and because it was in the auto-run, all they needed to do to get infected was put the CD in their drive.

Basically, my boss used to use his work PC for downloading hacked “warez” from happyhippo.com. His work desktop had become badly infected and that had spread to the executable he worked on and compiled. As he wasn’t around to own the issue, a colleague and I had the fun of calling all the people we’d sent the demo to, asking them if they’d put the CD in any of their computers. If they hadn’t, we told them to destroy it. If they had, then we had to talk them through the lengthy process of getting rid of the virus.

As a marketing exercise this wasn’t a stunning success.

I often tell friends in the software business this story as a consolation when they feel responsible for some mess up: “Well, at least you didn’t send a virus-infected demo to your entire prospective customer base like my old boss did.”

I left that company not long after.

Checking for Membership of a Specific Active Directory Group

As part of my job I manage a bunch of SQL instances for Development and Test.

Access is managed through Active Directory groups, so I rarely have to do anything in terms of managing permissions. Nonetheless I often get requests from people to give them access. This is usually for a new starter or someone who has moved from one team to another.

Of course, the answer is usually that they just need adding to the right AD group. Rather than assume though, I always get them to check before I pass the request on to the AD team. You never know, there could be something else wrong.

T-SQL has a lovely little function for this, IS_MEMBER. For instance, if I want to know if I’m a member of MyDomain\SQLAdmins I just run:

SELECT IS_MEMBER('MyDomain\SQLAdmins');

If it returns 1 then I am a member. Zero then I am not. Null means it can’t find the group, probably because I’ve spelt it wrong.

So, I can get the requester to check for themselves if they are a member of the group, and then we can raise the request to get them added.
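
If you want to give the requester something self-explanatory to run themselves, a little wrapper like this does the job (the group name is just an example):

SELECT CASE IS_MEMBER('MyDomain\SQLAdmins')
          WHEN 1 THEN 'You are a member'
          WHEN 0 THEN 'You are NOT a member'
          ELSE 'Group not found - check the spelling'
       END AS MembershipCheck;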

IS_MEMBER is also useful if you want to check if you are a member of a specific database role – either one of the built-in ones or a user-defined one e.g.

SELECT IS_MEMBER('db_owner');

I’m not sure how well known this is and I just had one of those requests, so I thought I’d create a quick post on the topic.

Extended Events Made Easy: Using XEvent Profiler

In this post we’ll look at how quick and easy it is to use the XEvent Profiler to create an Extended Events (XE) session to replace the most common usage of SQL Server Profiler.

Before you start, update your SSMS to a recent version (17.3 or later):

https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-2017

Now, let me ask you the question – how often do you just go into Profiler and create a trace taking all the defaults? If you’re like me then that’s pretty often. It’s nice and easy: select the menu item from SSMS and you’re up and running in a couple of clicks.

All XEvent Profiler does is implement a couple of standard XE sessions, very similar to those default Profiler traces, that you can launch with a click or two.

You can find XE Profiler in the object explorer in SSMS under each of your SQL instances – just below SQL Server Agent:

XE_1

Because it’s there for each instance, you don’t have to then connect to the instance separately (as you do in Profiler).

You’ve got two options: Standard, which pretty much equates to the default trace in Profiler, or TSQL, which will just capture the SQL statements being executed. Let’s right-click on Standard and select “Launch Session”. Here’s the Live Data view that comes up:

XE_2

You can see that looks pretty similar to what you see for your traditional traces.

Now, one thing I almost always do for this sort of trace in Profiler is capture the database name. This is a bit of a pain, because I have to select “View all columns”, scroll along to find the database name column, then select it for each relevant event I’m capturing.

In XE it’s much easier. I just right click on one of the column headers in the above view and select “Choose Columns”:

XE_3

Which brings up the menu of columns available to me:

XE_4

I can then just select the database_name column on the left and use the arrow button to push it into the list of selected columns. Want row_count? That’s there too, along with a whole bunch of other stuff. I’ll add those, remove a few I don’t want and exit the dialog.

The other thing I’m commonly going to want to do is add a filter to one or more of the columns. When you’re viewing live data you get the Extended Events toolbar shown in SSMS below the menu bar; just click on Filters…

XE_5

And you get the filters dialog:

XE_6

Click below “Field” to select a column and then add a value to filter on. In this case I’m going to filter to look at a specific database.

It’s worth noting that there are a bunch of operators you can use (not just equals): greater than, less than, contains, does not contain, is null and so on.

Now we can see the effect of those changes in the Live Data view:

XE_7

The above is what I’d usually do in Profiler, except that here I’ve also added an extra column and removed a few – because it was so easy to do. The whole thing felt like a simpler and cleaner experience than the old way.
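
Creating your own sessions is a topic for another post, but for reference, the same kind of database filter in a hand-written session is just a predicate on the event. Here’s a minimal sketch – the session name, database name and choice of events are mine, not what XEvent Profiler generates:

CREATE EVENT SESSION QuickTrace ON SERVER
ADD EVENT sqlserver.sql_batch_completed
(
    ACTION (sqlserver.database_name, sqlserver.client_app_name)
    WHERE (sqlserver.database_name = N'MyDatabase')
)
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION QuickTrace ON SERVER STATE = START;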

But I get it, a marginally simpler experience probably isn’t enough to make you change habits. What might, is what you can do with your trace results once you’ve got them. In the next post we’ll look at sorting, grouping and aggregation.

Previous posts in this series:

There’s Still a Place for SQL Server Profiler

Using the built-in System Health session

Exporting Extended Events Session Data to a Table

 

Table Variable Performance “Fixed” in SQL 2019

Reading the new features for SQL 2019 I spotted this:

TableVariables2019_1

One of the most popular posts on my blog last year was where I pretty much suggested that people not use table variables:

Think twice before using table variables

This wasn’t new information when I wrote it, but bad performance due to the use of table variables remained such a common anti-pattern that I thought it was worth stressing again.

So, when I saw the above 2019 feature I thought I’d better investigate and update what I’m telling people.

TL;DR It looks like table variables are no longer a problem.

I’m going to re-use my examples from the previous post (as well as some of the images captured). One thing to note though is that I’m running these tests on a different, much less performant box than I did last time, so execution times will be longer.

My first test last time was just to show how cardinality estimation was better for a temp table vs. a table variable. This time I’m just going to compare a table variable running under compatibility mode 140 (SQL 2017) with one running under compatibility mode 150 (SQL 2019).

I set the compatibility mode as follows:

ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL = 140;

Then I run the first query:

DECLARE @a TABLE(i INT);
 
--Create a million rows
WITH Nums(i) AS
(
       SELECT 1
       FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n(i)
)
INSERT INTO @a(i)
SELECT n1.i
FROM Nums n1 
CROSS JOIN Nums n2 
CROSS JOIN Nums n3 
CROSS JOIN Nums n4 
CROSS JOIN Nums n5 
CROSS JOIN Nums n6;
 
SELECT i FROM @a;

The execution plan for that final select looks like this:
Variables1

And when I hover over the Table Scan operator I can see the properties:

Variables2

You can see that the estimated number of rows is calculated as 1 – but the actual number of rows was 1 million. This is (was) the problem with table variables.

Now I’ll change the compatibility level:

ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL = 150;

And run the query again.

I get the same execution plan, so let’s just look at the properties of the Table Scan operator this time:
TableVariables2019_2
Here you can see that the estimation is correct, 1 million. This is looking good so far for the future of table variables.

In the last post I then looked at a case where the bad estimation caused a poor selection of execution plan so let’s repeat that test. First, I set my compatibility level back to 140, then I run the following (if you want to repeat, you need the AdventureWorks2012 database):

DECLARE @BusinessEntityId TABLE (BusinessEntityID INT NOT NULL PRIMARY KEY CLUSTERED);
 
INSERT INTO @BusinessEntityId
SELECT BusinessEntityID FROM Person.Person;
 
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
 
SELECT COUNT(*)
FROM @BusinessEntityId b
INNER JOIN Person.Person p
       ON b.BusinessEntityID = p.BusinessEntityID;
 
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

Let’s look at the execution plan for the final query:
Variables5
Now let’s look at the properties for the clustered index scan:
Variables6
We can see that estimated number of rows is just 1, but that the actual number of rows is 19,972.

If we then look at the properties for the Index seek operator:
Variables7
We can see that the estimated number of executions was 1, but the actual number of executions was 19,972. That’s 19,972 seeks into a table with 19,972 rows. As previously discussed – there’s got to be a better way. A Nested Loops join is generally best when there’s a small number of rows from the top table and a larger number of rows from the bottom. In this case they both have 19,972 rows, so it’s not optimal.

Here’s the output of the statistics commands for that query:

Table ‘Person’. Scan count 0, logical reads 59916, physical reads 3810, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘#B5D4D4C8’. Scan count 1, logical reads 35, physical reads 29, read-ahead reads 28, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 30951 ms.

Now we’ll change the compatibility level to 150 and try again. Here’s the execution plan we get this time:
TableVariables2019_3
You can see the join operator is now a Hash Match which is generally a much better choice for evenly sized tables. Let’s look at the properties of the Clustered Index Scan in this new plan:
TableVariables2019_4
This time the estimated and actual values match, which has enabled SQL to make the better choice of plan. We can see that if we look at the statistics output for this one:

Table ‘Workfile’. Scan count 4, logical reads 64, physical reads 0, read-ahead reads 64, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Person’. Scan count 1, logical reads 67, physical reads 1, read-ahead reads 65, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘#B9A565AC’. Scan count 1, logical reads 35, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 25220 ms.

You can see the CPU and reads are massively reduced.

So, it seems table variables aren’t so bad anymore. At least they won’t be once you are running SQL Server 2019.

I should probably add some caveats at this point. This has been a limited test, so you should make sure you test your own workloads if you start using table variables instead of temp tables. There is also going to be deferred compilation when you do this in stored procedures. I haven’t dug down into exactly how that will work with table variables, but I guess it’s going to be similar to the way temp tables work in existing SQL versions.

Don’t Just Rely on Query Execution Stats for T-SQL Execution

I recently had an incident where I was looking into the cause of a long running process for a client. It was a batch process that ran overnight, and execution time had been growing until it was now taking over 4 hours.

The database involved is a kind of staging area for MI, so I wanted to look at any heavy queries being executed against the database – I knew that these would all be part of the batch process. I asked them to run the following query for me:

SELECT TOP 50
   DB_NAME(qt.dbid) AS DatabaseName, 
   qt.TEXT, 
   qs.execution_count,
   qs.total_worker_time,
   qs.total_physical_reads,
   qs.total_logical_writes,
   qs.total_logical_reads,
   qs.total_elapsed_time
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
WHERE ISNULL(qt.dbid,0) = DB_ID();

I got fewer than 50 rows back so figured I had everything covered, but the total elapsed time across all the queries was less than ten minutes. I knew the server hadn’t been rebooted for about a month, so potentially that could be as little as 20 seconds of query execution a night. Even if the full 10 minutes was from the last 24 hours, that still didn’t account for the long run times.

So, I convinced myself it wasn’t the queries and started looking at other things they could check. It was a long list. Lots of theories revolved around the idea that something else was running on the host at the same time.

As part of that I decided to get them to run an extended events trace so I could look at exact timings and waits during the batch process (and anything else that might be going on in SQL). I got a lot of data back, which took a fair while to index and load. Figuring I’d already ruled out query execution times for the batch process itself, I expected it was going to be tricky to drill down on the problem. However, as soon as I looked for heavy queries running during the execution window I found one specific stored procedure with the following metrics:

QueryExStats1

Ouch.

The thing is, these figures were for a stored procedure that was part of the batch run. I had data for it in the original output, but the figures for query execution duration and CPU were tiny in comparison to these. So, were the query execution stats wrong?

No, they weren’t. dm_exec_query_stats only captures figures for cached query plans, so you might have a query that’s not caching a plan for whatever reason – or you might have something that’s doing work that isn’t actually a query. Imagine a WHILE loop that’s trying to find all the prime numbers under a trillion without using sets or queries. That would consume a lot of CPU and duration but wouldn’t show up in the query execution stats – because it’s not a query.
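
To labour the point, here’s a contrived sketch of the kind of non-query work I mean – this will happily burn CPU and elapsed time, but there’s no table access for the query execution stats to pin the cost to:

--Pure control-of-flow work: lots of CPU, no queries
DECLARE @i BIGINT = 0;

WHILE @i < 100000000
BEGIN
   SET @i += 1;
END;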

In this case it turned out the stored procedure was:

  • Trying to process over 2 million rows using a cursor
  • For each row it would attempt to execute a stored procedure that didn’t exist using dynamic SQL
  • It would catch the failure and update an error message against the row – using the cursor
  • Every day it would attempt to process the failed rows again
  • Every day more rows would be inserted to the table that would fail

In particular it was the use of the cursor that was most expensive, and that work wasn’t being captured in the execution stats.
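
For the curious, the shape of the procedure was roughly along these lines – a heavily simplified sketch, with the table, column and procedure names invented for illustration:

DECLARE @Id INT, @sql NVARCHAR(200);

DECLARE FailedRows CURSOR FOR
   SELECT Id FROM dbo.StagingRows WHERE Processed = 0;

OPEN FailedRows;
FETCH NEXT FROM FailedRows INTO @Id;

WHILE @@FETCH_STATUS = 0
BEGIN
   BEGIN TRY
      --The procedure named here didn't exist, so this always failed
      SET @sql = N'EXEC dbo.ProcessRow @Id = ' + CAST(@Id AS NVARCHAR(10));
      EXEC sys.sp_executesql @sql;
   END TRY
   BEGIN CATCH
      UPDATE dbo.StagingRows SET ErrorMessage = ERROR_MESSAGE() WHERE Id = @Id;
   END CATCH;

   FETCH NEXT FROM FailedRows INTO @Id;
END;

CLOSE FailedRows;
DEALLOCATE FailedRows;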

To be kind to the original developer, it was a clever generic solution to a specific problem. It wasn’t optimal from a performance point of view, but they only expected it to have to deal with a few rows daily.

Then the volume of data increased.

Then a subsequent developer made a change but didn’t know they needed to add a new supporting stored procedure. The resulting issue didn’t get picked up in testing.

The number of failed rows had been growing every night for a couple of years, and the execution time creeping up until it went exponential.

This post isn’t about bad code though; it’s about remembering that not all resource consumption on our SQL boxes comes from cached queries. When troubleshooting, query execution stats are one place to look, but they don’t cover everything that might be being executed in T-SQL.

T-SQL Tuesday #106 – The Trouble with Triggers

tsql2sday150x150

For this month’s T-SQL Tuesday Steve Jones asks us to talk about our experiences with Triggers:

https://voiceofthedba.com/2018/09/03/t-sql-tuesday-106-trigger-headaches-or-happiness/

In my post I’m going to focus on DML (Data Manipulation Language) triggers. Their counterparts, DDL (Data Definition Language) triggers, have any number of interesting applications. It’s usually DML triggers, however, that attract the most controversy.

DML triggers seem like great functionality when you first discover them in your SQL career, especially if you come from an application coding background. Event-driven programming for data – sounds brilliant. If someone does something to table X, have a trigger that in turn does something to tables Y and Z; you can even have triggers on Y and Z and cascade actions to other tables and beyond.

Once you’ve been around for a while though you realise what a headache this can end up being. You think you’re just inserting a record into table X but there’s this whole bunch of stuff going on behind the scenes that you may not know about – or want.

This is the biggest objection to triggers. They hide logic. If I have a table called FeeTransaction and there is a bunch of logic that MUST happen when I insert a record here, then that’s clearer if it’s all encapsulated in a stored procedure that inserts the record. That makes it easier to maintain and troubleshoot than hunting around to find independent events that might be kicking off all over the place.

The only place I’m really a fan of DML triggers is in capturing audit information, most commonly for maintaining basic things like the last updated date and time for a record. I like having a mechanism that means I can see when data was last changed – whatever method was used to change it, i.e. whether it was through the application or someone with write access just running an update query. Then I don’t need to rely on that person remembering to alter the LastUpdated column when they made their change.
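
As a minimal sketch of what I mean (the table and column names here are hypothetical), a LastUpdated trigger can be as simple as:

CREATE TRIGGER dbo.trg_FeeTransaction_LastUpdated
ON dbo.FeeTransaction
AFTER UPDATE
AS
BEGIN
   SET NOCOUNT ON;

   --Stamp the modified rows with the current time, whatever updated them
   UPDATE t
   SET LastUpdated = SYSDATETIME()
   FROM dbo.FeeTransaction t
   INNER JOIN inserted i ON i.FeeTransactionId = t.FeeTransactionId;
END;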

The need to maintain this sort of basic audit info is so ubiquitous that it seems silly it is not a core feature of SQL Server – i.e. the ability to add a column to a table that will always automatically capture the time a record was modified, presumably in a tamper-proof manner. In the same way we could have the ability to capture the login context used to make the change, or potentially anything else we need to capture.

Adam Machanic has raised a request with Microsoft to include this functionality. If you agree it would be handy then please upvote here:

https://feedback.azure.com/forums/908035-sql-server/suggestions/32901964-provide-a-mechanism-for-columns-to-automatically-u

Exporting Extended Events Session Data to a Table

If you’re a long time Profiler user like me then you probably often take the option of saving (or loading) your trace results to a table for easy analysis. Well, with Extended Events (XE) it’s easy to do that too.

Once you’ve opened the session to view the data you get an extra drop-down menu “Extended Events” on the menu bar in SSMS. If you open that and browse down, you can quickly find the option to export the results to a table:

xe_table1

Or to a CSV or XEL file if you wish.

You just need to select a destination database connection and table name and the export starts. Be warned that it doesn’t default to the current database connection. I’ve fallen for that and overwritten the data in a table with the same name on a different SQL instance – whoops!

If the option is greyed out when you open the menu it may be that your event data is still loading. If you look closely in the above screenshot you can see I have over 8 million events captured by this session, so it took a while to load before I was able to export.

Once the export has finished you end up with a table that looks a bit like this (the exact set of columns will depend on which fields you have collected):

xe_table2

One gripe I have with this functionality, and that you can see above, is that all the text columns come out defined as nvarchar(max). Apart from anything else this means you can’t index the columns without changing the data type.

That’s one thing that Profiler did better; for instance, database_name captured from Profiler would have been an nvarchar(128).

I thought I’d knock this post together while waiting for the statement that alters database_name to the preferred nvarchar(128) to complete. With 8 million records that’s taken 21 minutes so far. And then I’ll have to index it.
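
For reference, the fix is simple enough – it just takes a while on this many rows. The table and index names here are mine; yours will be whatever you chose for the export:

--Shrink the column so it can be indexed, then add the index
ALTER TABLE dbo.MyXEImport ALTER COLUMN database_name NVARCHAR(128);

CREATE INDEX IX_MyXEImport_database_name ON dbo.MyXEImport (database_name);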

Yes, maybe it’s my fault for capturing so many events, but in my defence, I’m trying to do some analysis in a dev environment to work out what I can filter out safely, before passing the same session to a client to use in production to diagnose the problem with a long running SSIS package.

Still, as you can see it’s an easy process to export your Extended Events data to a table so you can do your analysis there – rather than having to resort to querying XML.

 

Using the built-in System Health session

When Microsoft introduced Extended Events (XE) in 2008, they also gave us a built-in XE session called system_health (though it’s worth noting that in 2008 MS hadn’t yet provided us a GUI for this so it becomes most useful in 2012 and beyond).

This is a great little tool. I mainly use it for troubleshooting deadlocks as it logs all the information for any deadlocks that occur. No more having to mess about making sure specific trace flags are enabled to ensure deadlock information is captured in the error log.

It also captures the SQL text and Session Id (along with other relevant data) in a number of other scenarios you may need to troubleshoot:

  • Where an error over severity 20 is encountered
  • Where a session has waited on a latch for over 15 seconds
  • Where a session has waited on a lock for over 30 seconds
  • Sessions that have encountered other long waits (the threshold varies by wait type)

There are other events captured too, you can see the full list here:

https://docs.microsoft.com/en-us/sql/relational-databases/extended-events/use-the-system-health-session?view=sql-server-2017

You can find the system_health session here in SSMS under your server instance:

system_health1

Just double click on the event file to view historical data.

The view that comes up immediately can seem non-intuitive to work with (do I just have to scroll through thousands of events looking for the one I want?):

system_health2

If you know the type of event you are looking for though, you can right-click on the “name” column and select the option to group by the values in that column. Then you see something more like this:

system_health3

With the example of looking at deadlocks (what I mostly use this for) I can then just expand that group and look for the one I want.

Or you can right-click and use “Find in Column” – or “Choose Columns” to add extra columns you might want to search in. For instance, I might want to see if it’s captured any information about why my backups are being delayed so I can add the “sql_text” column, order by that and then search for “backup”:

system_health4

Once an event is selected it will show me the additional information gathered in the bottom pane:

system_health5

Like I say, it’s pretty useful. My only issue is that by default it captures only 20MB of data, which on a busy system can mean events are only kept for a couple of days. So, I often want to increase the retention. I find the easiest way to do that is to right-click on the session and select “Script as CREATE to New Query Window”. I then edit the script to change the number of roll-over files to 20 (from 4):

system_health6
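
For reference, the bit of the generated script I change is the event_file target – everything else stays exactly as scripted. The file size shown in your script may vary by version; the only value I touch is max_rollover_files:

ADD TARGET package0.event_file
(
   SET filename = N'system_health.xel',
       max_file_size = (5),
       max_rollover_files = (20) --changed from the default of 4
)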

You can then delete the existing system_health session and re-create it from the script – you do have to remember to right-click on it in the GUI and start it again.

One great thing is that when you do this you don’t lose the events already saved to file, as the files are retained and continue to be accessible from your newly created session.

All in all, a handy little tool.

There’s Still a Place for SQL Server Profiler

Follow a few of the SQL Family on Twitter and you’ll mostly see one view of SQL Profiler, and it’s not generally friendly.

So much so that I’ve been tempted to buy this t-shirt for some trolling at the next SQL conference I attend:

I Heart SQL Profiler

 

At SQLBits earlier this year, Erin Stellato ran a great session “Kicking and Screaming: Replacing Profiler with Extended Events”. Erin’s run it a few times at various conferences and you can check it out on YouTube:

https://www.youtube.com/watch?v=HnqLGJuRgvw

As a stubborn continuing user of Profiler I attended because I liked the kicking and screaming bit, being willing to accept that might fairly accurately describe my stance.

Erin ended the session asking for a commitment from the audience that next time they used profiler – just put aside 15 minutes to try seeing what it’s like to use extended events instead.

I’d used Extended Events (XE) on occasion, but generally when it was something that couldn’t be done in Profiler. I also regularly make use of the built-in system_health Extended Events session, particularly for retrieving details of deadlocks.

But if I needed a quick trace for something – to profiler I go. In my defence, I’m not a production DBA, so I’m checking things on dev and test instances where performance isn’t critical. But even where performance is a concern you can define the trace in profiler and then export that to implement as a server-side trace where the performance impact is minimal.
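
If you’ve not done that before: the export from Profiler gives you a script built from the sp_trace_* system procedures, which you then run on the server. A heavily trimmed sketch of the kind of thing it contains – the file path and the choice of events and columns here are just examples:

DECLARE @TraceID INT, @maxfilesize BIGINT = 50, @on BIT = 1;

--Create the trace, writing to a file (max size 50MB)
EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\Traces\MyTrace', @maxfilesize, NULL;

--Capture TextData (column 1), SPID (12) and Duration (13) for SQL:BatchCompleted (event 12)
EXEC sp_trace_setevent @TraceID, 12, 1, @on;
EXEC sp_trace_setevent @TraceID, 12, 12, @on;
EXEC sp_trace_setevent @TraceID, 12, 13, @on;

--Start the trace
EXEC sp_trace_setstatus @TraceID, 1;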

As a side story here, profiler can slow things down as it needs to write events back to the app before each event can complete. A Microsoft consultant shared a story with me where they’d been called out to troubleshoot a client who was experiencing high CPU on the database server. The MS guy started a quick trace to see what was going on, and as that slowed queries down it actually reduced the CPU. The manager ran out yelling “What have you done – it’s fixed!” There had to be a fair amount of convincing to persuade the manager that leaving profiler running permanently wasn’t a solution to the CPU problems.

Back to the subject of the post. There are a couple of reasons why you may still want to use profiler (apart from where you’re running seriously old versions of SQL):

  • When you work with clients, it’s a technology they’re all familiar with. I can ask a client to run a trace and send me the results. I don’t have to explain in detail what to do so it’s easy to get what I need. If I asked for the results of an XE session I’d generally get the verbal equivalent of head scratching – they’ve heard of it (probably) but wouldn’t know how to go about it. They may also wish to go through a change process to implement one as they don’t understand well what the impact might be.
  • I want something I can replay. You can’t do this with extended events, but you can if you take the right sort of trace. Moving forward you can use “Distributed Replay” for this, but I don’t know anyone who’s done that.
  • You’re using Amazon Relational Database Service (RDS) – with RDS you have restricted rights over your database instance, and the ability to create Extended Events sessions is something you don’t have access to.

There are also a couple of reasons people say you shouldn’t use traces that I disagree with:

  • Performance. I’ve already mentioned this. Yes, XE is more lightweight, but if you implement your trace as a server side one then the impact is small.
  • Profiler is deprecated. Well, yes it is, but only from SQL 2016. The ability to not have to terminate your SQL statements with a semicolon has been deprecated since 2005 and we can still do that. And as Hugo Kornelis informs me, there are features that have been deprecated even longer and are still with us. Trust me, trace will be with us for a good while. I’ll happily take a wager on that if you don’t believe me. I might even wager that I’ll be retired before Profiler is.

Profiler_twitter

However, since making the pledge at Erin’s session I forced myself to try Extended Events for some of the more basic tasks I would have turned to Profiler for and I feel there are some reasons people (myself included) haven’t been using XE that are no longer valid:

  • It’s quicker/easier to set up a profiler trace. That was true in the early days of XE, but arguably now (depending on what you want to do) you can have your XE session going with remarkably few clicks.
  • XE produces XML output. I hate XML, I don’t want to start using XQuery to find what I want. Trust me, we all hate XML. With the GUI we now have for XE you don’t have to mess with that stuff.
  • With XE I can’t save my results to a table. Um, yes you can. A table, a csv file (with no XML I promise) or just copy and paste to Excel. And unlike profiler you can search the data without having to export it anywhere.

 

So yes, I believe that there is still a place for profiler, but no longer because it is quicker, simpler and more usable.

In subsequent posts I plan to show how easy and convenient XE is to work with these days.

And yeah, okay it is more powerful, more lightweight, allows me to look at things profiler never could. People have been telling me that for years and it wasn’t enough to make me change – but “easier”? Give me that thing!

Still probably going to buy that t-shirt though…