Exporting Extended Events Session Data to a Table

If you’re a long time Profiler user like me then you probably often take the option of saving (or loading) your trace results to a table for easy analysis. Well, with Extended Events (XE) it’s easy to do that too.

Once you’ve opened the session to view the data you get an extra drop-down menu “Extended Events” on the menu bar in SSMS. If you open that and browse down, you can quickly find the option to export the results to a table:

xe_table1

Or to a CSV or XEL file if you wish.

You just need to select a destination database connection and table name and the export starts. Be warned that it doesn’t default to the current database connection. I’ve fallen for that and overwritten the data in a table with the same filename on a different SQL instance – whoops!

If the option is greyed out when you open the menu it may be that your event data is still loading. If you look closely in the above screenshot you can see I have over 8 million events captured by this session, so it took a while to load before I was able to export.

Once the export has finished you end up with a table that looks a bit like this (the exact set of columns will depend on which fields you have collected):

xe_table2

One gripe I have with this functionality, and that you can see above, is that all the text columns come out defined as nvarchar(max). Apart from anything else this means you can’t index the columns without changing the data type.

That’s one thing that Profiler did better, for instance database_name captured from Profiler would have been an nvarchar (128).

I thought I’d knock this post together while waiting for my statement to complete which alters database_name to the preferred nvarchar(128). With 8 million records that’s taken 21 minutes so far. And then I’ll have to index it.

Yes, maybe it’s my fault for capturing so many events, but in my defence, I’m trying to do some analysis in a dev environment to work out what I can filter out safely, before passing the same session to a client to use in production to diagnose the problem with a long running SSIS package.

Still, as you can see it’s an easy to process to export your Extended Events data to a table so you can do your analysis there – rather than having to resort to querying XML.

 

Using the built-in System Health session

When Microsoft introduced Extended Events (XE) in 2008, they also gave us a built-in XE session called system_health (though it’s worth noting that in 2008 MS hadn’t yet provided us a GUI for this so it becomes most useful in 2012 and beyond).

This is a great little tool. I mainly use it for troubleshooting deadlocks as it logs all the information for any deadlocks that occur. No more having to mess about making sure specific trace flags are enabled to ensure deadlock information is captured in the error log.

It also captures the SQL text and Session Id (along with other relevant data) in a number of other scenarios you may need to troubleshoot:

  • Where an error over severity 20 is encountered
  • Where a session has waited on a latch for over 15 seconds
  • Where a session has waited on a lock for over 30 seconds
  • Sessions that have encountered other long waits (the threshold varies by wait type)

There are other events captured too, you can see the full list here:

https://docs.microsoft.com/en-us/sql/relational-databases/extended-events/use-the-system-health-session?view=sql-server-2017

You can find the system_health session here in SSMS under your server instance:

system_health1

Just double click on the event file to view historical data.

The view that comes up immediately can seem non-intuitive to work with (do I just have to scroll through thousands of events looking for the one I want?):

system_health2

If you know the type of event you are looking for though, you can right-click on the “name” column and select the option to group by the values in that column. Then you see something more like this:

system_health3

With the example of looking at deadlocks (what I mostly use this for) I can then just expand that group and look for the one I want.

Or you can right-click and use “Find in Column” – or “Choose Columns” to add extra columns you might want to search in. For instance, I might want to see if it’s captured any information about why my backups are being delayed so I can add the “sql_text” column, order by that and then search for “backup”:

system_health4

Once an event is selected it will show me the additional information gathered in the bottom pane:

system_health5

Like I say, it’s pretty useful. My only issue is that by default it captures only 20MB of data, which on a busy system can mean events are only kept for a couple of days. So, I often want to increase the retention. I find the easiest way to do that is to right-click on the session and select “Script as CREATE to New Query Window”. I then edit the script to change the number of roll-over files to 20 (from 4):

system_health6

You can then delete the existing system_health session and re-create it from the script – you do have to remember to right-click on it in the GUI and start it again.

One great thing is that when you do this is that you don’t lose the events already saved to file, as the files are retained and continue to be accessible from your newly created session.

All in all, a handy little tool.

There’s Still a Place for SQL Server Profiler

Follow a few of the SQL Family on Twitter and you’ll mostly see one view regards SQL Profiler, and it’s not generally friendly.

So much so that I’ve been tempted to buy this t-shirt for some trolling at the next SQL conference I attend:

I Heart SQL Profiler

 

At SQLBits earlier this year, Erin Stellato ran a great session “Kicking and Screaming: Replacing Profiler with Extended Events”. Erin’s run it a few times at various conferences and you can check it out on YouTube:

https://www.youtube.com/watch?v=HnqLGJuRgvw

As a stubborn continuing user of Profiler I attended because I liked the kicking and screaming bit, being willing to accept that might fairly accurately describe my stance.

Erin ended the session asking for a commitment from the audience that next time they used profiler – just put aside 15 minutes to try seeing what it’s like to use extended events instead.

I’d used Extended Events (XE) on occasion, but generally when it was something that couldn’t be done in Profiler. I also regularly make use of the built-in system_health Extended Events session, particularly for retrieving details of deadlocks.

But if I needed a quick trace for something – to profiler I go. In my defence, I’m not a production DBA, so I’m checking things on dev and test instances where performance isn’t critical. But even where performance is a concern you can define the trace in profiler and then export that to implement as a server-side trace where the performance impact is minimal.

As a side story here, profiler can slow things down as it needs to write events back to the app before each event can complete. A Microsoft consultant shared a story with me where they’d been called out to troubleshoot a client who was experiencing high CPU on the database server. The MS guy started a quick trace to see what was going on, and as that slowed queries down it actually reduced the CPU. The manager ran out yelling “What have you done – it’s fixed!” There had to be a fair amount of convincing to persuade the manager that leaving profiler running permanently wasn’t a solution to the CPU problems.

Back to the subject of the post. There are a couple of reasons why you may still want to use profiler (apart from where you’re running seriously old versions of SQL):

  • When you work with clients, it’s a technology they’re all familiar with. I can ask a client to run a trace and send me the results. I don’t have to explain in detail what to do so it’s easy to get what I need. If I asked for the results of an XE session I’d generally get the verbal equivalent of head scratching – they’ve heard of it (probably) but wouldn’t know how to go about it. They may also wish to go through a change process to implement one as they don’t understand well what the impact might be.
  • I want something I can replay. You can’t do this with extended events, but you can if you take the right sort of trace. Moving forward you can use “Distributed Replay” for this, but I don’t know anyone who’s done that.
  • You’re using Amazon Relational Database Services – with RDS you have restricted rights over your database instance, and the ability to create extended events sessions is something you don’t have access to.

There’s also a couple of reasons people say you shouldn’t use traces that I disagree with:

  • Performance. I’ve already mentioned this. Yes, XE is more lightweight, but if you implement your trace as a server side one then the impact is small.
  • Profiler is deprecated. Well, yes it is, but only from SQL 2016. The ability to not have to terminate your SQL statements with a semicolon has been deprecated since 2005 and we can still do that. And as Hugo Korenelius informs me, there are features that have been deprecated even longer and are still with us. Trust me, trace will be with us for a good while. I’ll happily take a wager on that if you don’t believe me. I might even wager that I’ll be retired before Profiler is.

Profiler_twitter

However, since making the pledge at Erin’s session I forced myself to try Extended Events for some of the more basic tasks I would have turned to Profiler for and I feel there are some reasons people (myself included) haven’t been using XE that are no longer valid:

  • It’s quicker/easier to set up a profiler trace. That was true in the early days of XE, but arguably now (depending on what you want to do) you can have your XE session going with remarkably few clicks.
  • XE produces XML output. I hate XML, I don’t want to start using XQuery to find what I want. Trust me, we all hate XML. With the GUI we now have for XE you don’t have to mess with that stuff.
  • With XE I can’t save my results to a table. Um, yes you can. A table, a csv file (with no XML I promise) or just copy and paste to Excel. And unlike profiler you can search the data without having to export it anywhere.

 

So yes, I believe that there is still a place for profiler, but no longer because it is quicker, simpler and more usable.

In subsequent posts I plan to show easy and convenient XE is to work with these days.

And yeah, okay it is more powerful, more lightweight, allows me to look at things profiler never could. People have been telling me that for years and it wasn’t enough to make me change – but “easier”? Give me that thing!

Still probably going to buy that t-shirt though…

Alter Multiple Databases at Once

This is a quick and dirty method I often use when I want to make a change to multiple databases on a SQL Server instance, usually based on a criteria.

It’s a fairly basic level thing to do, and while it is probably trivial to a SQL expert, I find most beginners wouldn’t consider it. So, I thought it was worth sharing.

Let’s say that I have the following set of databases:

MultipleDBs1

I’ve got a set of databases suffixed _dev, and another set suffixed _uat, as well as a couple of others.

Let’s say I need to take the _uat databases offline. I could do that one at a time through the GUI in SSMS. Or I could write SQL statements to do that.

Or, I could write a query to generate the SQL for me. In this case I only have three databases so it’s not going to save me a lot of time, but when you have tens or hundreds…

First, I write a query to make sure I can select out just the databases I want:

SELECT name
FROM sys.databases
WHERE name LIKE '%_uat';

And I view the results:

MultipleDBs2

That’s what I wanted. So, the next stage is to alter that query to generate the SQL statements to take those databases offline. The SQL to set a single database offline is:

ALTER DATABASE [{Database Name}] SET OFFLINE;

So, I alter my query as follows:

SELECT 'ALTER DATABASE [' + name + '] SET OFFLINE;'
FROM sys.databases
WHERE name LIKE '%_uat';

Which gives me:

MultipleDBs3

I can then copy that results into my query window and execute it as a block:

ALTER DATABASE [Users_uat] SET OFFLINE;
ALTER DATABASE [App_uat] SET OFFLINE;
ALTER DATABASE [Config_uat] SET OFFLINE;

Usually I’ll execute the first line initially to make sure I haven’t messed up the syntax and then I’ll run the rest in one go.

If we look at the list of databases again in the Object Explorer you can see it’s done the job:

MultipleDBs4

You can get clever and create a cursor to execute the result-set one at a time using dynamic SQL, but if you’re just performing a one-off task and you want the quickest solution then I find this the best method.

You can use this technique for all sorts of ad-hoc admin tasks, so it’s a very useful little tool to have in your belt.

Also, you’re not just restricted to performing administrative actions on databases, it could be any set of SQL objects. Personally I find doing things using methods like this a lot easier than messing about with PowerShell.

Cycle Through Your Clipboard History In SSMS

This is my favourite SSMS trick I’ve discovered recently, probably some time towards the end of last year.

Basically, when you paste in Management Studio, you have not just the option to paste the last thing you selected and copied, but can cycle back through previous things that were in the clipboard. In a quick test I was able to choose from the last 20 clipboard entries.

You can find a menu item for this in the “Edit menu”:

ClipboardRing

Or as you can see in the menu (and much more easily) you can use Ctrl+Shift+V

Here’s an animation of me using it to cycle through the SQL statements from my last blog post:

ClipboardRing

This is a great little productivity tool. I was forever copying and pasting one thing, then a second, then having to go back to copy and paste the first thing again. Never again!

 

The Importance of ORDER BY

Everyone, at the beginning of their SQL career, get’s told that it is important to include an ORDER BY if they want the results ordered. Otherwise the order in which they are returned is not guaranteed.

But then you run queries a lot of times that don’t need a specific order – and you see that they (at least seem to) come out in the same order every time. You could (almost) be forgiven for thinking you can rely on that.

There was even a question on a Microsoft SQL certification exam a few years ago that asked what the default order was for records returned by a simple SELECT – the answer it was looking for was that it would be according to the order of the clustered index. So you can rely on that – right?

Wrong. The question was a bad question, and the answer was incorrect. Let’s look at this in action. First I’ll create a table and add some rows:

CREATE TABLE dbo.NoSuchThingAsDefaultSortOrder (
   Id INT IDENTITY(1,1) 
      CONSTRAINT PK_NoSuchThingAsDefaultSortOrder PRIMARY KEY CLUSTERED,
   FirstName VARCHAR(30)
);

INSERT INTO dbo. NoSuchThingAsDefaultSortOrder (FirstName)
VALUES
('John'),
('Fred'),
('Bob'),
('Sue'),
('Jenny'),
('Matt'),
('Alexis'),
('Zebedee');

Now let’s SELECT from the table, without specifying an order:

SELECT Id, FirstName
FROM dbo.NoSuchThingAsDefaultSortOrder;

Here’s our results:

OrderBy1

Okay, so they’ve come out ordered by Id according to our clustered index.

But what if someone comes along and decides they want to be able to look up records in the table by name, so they create a non-clustered index for that:

CREATE NONCLUSTERED INDEX IX_NoSuchThingAsDefaultSortOrder_FirstName
   ON dbo.NoSuchThingAsDefaultSortOrder(FirstName);

We run our original query again and this time these are the results we get:

OrderBy2

They’ve come out in a different order – this time ordered by FirstName – like the index we just added.

The reason is that this time SQL decided that it would be better to use our new index to execute the query. In general, it will try and use the smallest index that will satisfy the query – and in this case it decided that was the non-clustered index. I’d show you the execution plans to prove that, but I think it’s pretty obvious from the order the results are returned in.

So, don’t go relying on there being a “default” sort order. It’s just not true, changes to indexing – or even conceivably to statistics about the data distribution, could change the way SQL chooses to execute your query, and that could change your order.

Unless you make sure to specify ORDER BY.

 

 

 

Thoughts on Query Performance with TDE enabled

Microsoft state that enabling TDE (Transparent Data Encryption) usually has a performance  overhead of 2-4%. That doesn’t sound like very much, and personally I wouldn’t let it bother me if I want to make sure my data is encrypted at rest.

However, you may have heard other sources saying that it’s actually a lot more than that – and the performance impact is a high price to pay for the level of protection offered.

So, what’s the truth?

The critical thing to remember is that with TDE your data is encrypted on disk, but data held in memory (i.e. the buffer pool) is unencrypted. Therefore, you would only expect an overhead when reading from and writing to disk.

SQL Server tries to keep data that is referenced repeatedly in the buffer pool. So, if your SQL instance is provisioned with enough memory, a lot of your read queries can access the buffer pool and don’t have to go out to disk. Such queries should not be affected performance-wise by TDE.

There may be other read queries however that access older data that hasn’t been read for a while, these queries would need to retrieve that data from disk and so there would be an overhead from TDE.

Any queries that modify data will need the outcome to be written to disk so in these cases we will see an overhead. This overhead is likely to come in two parts, first when the transaction is written to the logfile before committing, and then later as the updated data gets written to the data file as part of a checkpoint operation.

We also have other operations that write or update encrypted data on disk, so we would also expect these to have some overhead. This would include operations such backups, or index rebuild operations.

You can see from this that the overhead will very much depend on how your application interacts with your data. At one extreme, if you have a set of static data that is small enough to be held in memory and is queried regularly then there should be no overhead. At the other end of the spectrum, if you have an application that writes to the database a lot, and reads less often, then the overhead will be higher.

Let’s look at some examples and try and quantify what amount of overhead we might be talking about. In this post we’re just going to focus on the effect TDE has when you are reading data.

First, I’ll create two databases, one with TDE enabled and one without. Then I’ll load the same set of data into each (Total size about 1GB).

You can find the script I used for this in my previous blog post:

Encrypting an existing database with TDE

In the first test we’ll perform a like query of the worst kind, one that tries to match for a value within a column. We have no indexes on the table, but none would be that helpful with this query anyway.

SELECT *
FROM dbo.SomeData 
WHERE SomeText LIKE '%Wibble%';

I’ll run across 4 test cases (capturing the total CPU consumed in each case). The test cases are:

  • TDE Protected database where the buffer cache is empty (i.e. all data has to be read from disk)
  • TDE protected database where all the data for the table is in the buffer cache (i.e. no data has to be read from disk)
  • Database without TDE where the buffer cache is empty (i.e. all data has to be read from disk)
  • Database without TDE where all the data for the table is in the buffer cache (i.e. no data has to be read from disk)

In each test I’ll run the query 5 times and total the CPU to even out variance between executions. For the tests involving disk reads I’ll run the command DBCC DROPCLEANBUFFERS in between executions of the query to empty the buffer cache.

The results looked like this, with time shown in seconds. Note that MAXDOP was set to 4 and each query went parallel over 4 threads:

TDE_ReadPerf1

There’s quite a variance between each run so I’m not going to take anything significant from small differences we see. However, we can see that the timings are pretty much the same when the data is in memory, but there seems to be about a 10% overhead with TDE when reading from disk.

In case you’re wondering why reading from disk didn’t add much elapsed time for the No-TDE database – the reads were “read-ahead” so were able to complete while the CPU was burning through the data.

Let’s try a different query, this one will still have to scan the whole table as we are dealing with a heap, but it uses an equality predicate so there is less work to do in matching the data:

SELECT *
FROM dbo.SomeData 
WHERE Id = 100000000;

I’ll run the same set of tests as above and we can look at the results:

TDE_ReadPerf2

The first thing we notice is that this query runs a lot quicker in general. And again, there is little difference with and without TDE when the data is in memory.

But look at the comparison when the data has to be read from disk. With TDE the CPU consumption is more than 10 times as large, or in percentages, over 1000% worse.

At this point you could be forgiven for panicking – are you willing to risk that TDE makes your queries that much worse.

In both the above two tests, the same amount of data is being read from disk. And if you re-examine the numbers, you’ll see that (very roughly) the same amount of CPU has been added in each case where we have TDE enabled – about 50 seconds. That 50 seconds was split over 4 cores so it would have been about 12.5 seconds per core.

In terms of elapsed time, we had approximately that increase with the first query because CPU was the resource under most contention – i.e. the reads were able to occur while waiting for the CPU to complete. In the second query we can see the reading from disk occupied most of the elapsed time for those queries, so the extra CPU consumption didn’t make the query run particularly longer.

By the time it had been executed 5 times (with the memory flushed between each execution) each query read about 600,000 pages sized at 8kb each – just under 5GB. If it took 50 seconds on the decryption of those pages, then each page took about 1 twelfth of a milli-second to decrypt – or alternatively, TDE decrypted about 12 pages per millisecond. Or in terms of disk size, 100MB per second. These were tests on a server with magnetic spinning disks (not SSDs) and you can see from the above figures, the straight disk access took about 40 seconds on its own.

When TDE doesn’t read from disk it doesn’t add any overhead, but how do we quantify what the overhead to queries is when it does have to access the disk?

From the above tests we could suggest it adds from 10% to over 1000% CPU.

Or alternatively between 10% to 0% elapsed time. Note that those figures are the same way round, i.e. when it added 10% CPU it added 10% elapsed time, but when it added 1000% CPU time – the elapsed time was about the same.

I could go on with this type of confusing analysis, but instead I’ll suggest this is the wrong way to think about performance in terms of TDE.

Don’t think about query performance.

Think about read performance.

TDE overhead depends on the level of your physical disk access. In the case of read query performance, it depends wholly on the level of physical reads, and seems to be a reasonable fixed overhead for each physical read.

That makes perfect sense, the overhead for querying is in decrypting data, and surely it will take pretty much the same amount of CPU to decrypt each 8KB page.

In theory this makes it simple for us to calculate what this overhead would look like on our production SQL Servers. There’s all sorts of ways of capturing physical reads (and writes).

Let’s say I take a quick look at Resource Monitor to get a ballpark figure for one of my databases on this server that I know to be quite heavy on physical reads. And let’s say I see that it averages 25MB/s during the peak hour.

From that, and from the figures above I can estimate what impact enabling TDE for this database would have on the CPU.

Well I know that 25MB equates to about 0.25 seconds of CPU to decrypt the data, and I know I have 4 cores, so I can expect that in the average second this adds 0.0625 seconds of CPU per core. I multiple that be 100 and I find that I’ve added 6.25% CPU.

The calculation I’ve just done is:

(Reads/Second) * 100

Divided by

(MBs TDE decrypts every second/CPU) * (Number of CPU cores)

This doesn’t include writes, and it doesn’t include backups – I hope to look at that in a later post.

Now, let’s say that produces a scary number, and I’m worried about the strain that’s going to put on my CPU…

My first question would be why am I experiencing so many reads and can I alleviate that? Does data have a short shelf-life in memory? Do I have enough memory in my server – and is enough allocated to SQL?

This isn’t just in terms of TDE. SQL Server is going to perform much better if your current dataset – i.e. the data you are currently accessing most, can be held in memory. So, if TDE is causing a problem, then it’s possible your queries are slow anyway.

Again, I’m not talking about writes just yet.

Or maybe your database is heavy on physical reads because it’s a data warehouse, regularly querying historical data. In that case, is it a suitable target for encryption? Hopefully the data is well anonymised if you’re using it for reporting and therefore doesn’t contain anything personal or sensitive.

I hope.

In summary…

Just to repeat myself, if you’re wondering about TDE and its impact of query performance, which we all have done, try to reframe the question and think about its impact on read performance.

It makes more sense and it may help you to more easily quantify the impact on your servers. And if it does look like the performance may be an issue – perhaps there is tuning you can perform on your database instance to reduce the physical disk access.

Database Files Down The Wrong Path

I manage a few servers used to host SQL Instances for development and test purposes. Each of those instances hosts databases covering multiple environments. So I’ve got multiple servers, with multiple instances, with multiple environments.

It’s important that issues in those environments don’t block development tasks, or invalidate or block testing cycles, so I like to keep them a little bit locked down. Code changes into those environments are generally made through builds and releases.

Equally though, sometimes developers need to be able to make changes that require sysadmin rights, without waiting for me to have time to help – or I might be on holiday.

Consequently, I’ve had to give elevated permissions to a few “trusted” devs. Result – bad things happen occasionally.

One common issue that bugs me is where databases have been moved from one instance to another, usually through backup and restore, and the files haven’t been moved as part of the restore so they get recreated in the data\log folders for the old instance.

This has caused me various problems, e.g. working out which instance is hammering the disk or using up all the space.

You can avoid this using the tick box in the Files page in the Restore dialog (if you’re restoring through the GUI):

FilesDownWrongPath

I’ve written a quick script to identify any database files that are suffering this and identifying the user responsible for the restore (where this is available). I thought I’d share in case anyone else suffers from the same problem:

--Check for database files down the wrong path
DECLARE @DefaultDataPath VARCHAR(512);
DECLARE @DefaultLogPath VARCHAR(512);

SET @DefaultDataPath = CAST(SERVERPROPERTY('InstanceDefaultDataPath') AS VARCHAR(512)) + '%';
SET @DefaultLogPath = CAST(SERVERPROPERTY('InstanceDefaultLogPath') AS VARCHAR(512)) + '%';

SELECT @DefaultDataPath AS CorrectDataPath, @DefaultLogPath AS CorrectLogPath;

SELECT 
   SUSER_SNAME(d.owner_sid) AS Culprit, 
   d.name, 
   f.type_desc, 
   f.physical_name
FROM sys.master_files f
INNER JOIN sys.databases d
   ON f.database_id = d.database_id
WHERE d.database_id > 4
AND d.name != 'SSISDB'
AND (
      (f.type_desc = 'ROWS' AND f.physical_name NOT LIKE @DefaultDataPath)
   OR (f.type_desc = 'LOG' AND f.physical_name NOT LIKE @DefaultLogPath)
);

System databases and the SSIS catalog are excluded.

For once I won’t show the results of the script in action – I am however off to have a quiet word with a few culprits!

T-SQL Tuesday #101 – Some Great SQL Server Tools

tsql2sday150x150

This month for T-SQL Tuesday #101 Jens Vestergaard asks us to blog about the essential tools in our SQL Toolbelt.

http://t-sql.dk/?p=1947

I’d just completed by post on CMS when I realised I’ve blogged about a few of my favourite tools in the past and that this would be a good opportunity to share them again in a quick post. So, here’s my second of two T-SQL Tuesday posts.

First we have these two great tools:

Statistics  Parser – If you do performance tuning and you’re not using this then you’re missing out. Formats and summarises the results of your STATISTICS IO and TIME commands. Big shout to Richie Rump for creating this – I use it so much I just need to type “st” in my browser search and it comes straight up!

Live Query Stats – Watch an animated execution plan while your query executes. I’ll admit I’ve not used this a lot but I still think it’s really cool tool.

And then a few different ones around the same tool:

Query Store – track query performance and plan changes over time as well as forcing SQL to use the plan you want. I’m lucky to have quite a bit of stuff on SQL 2016 and Query Store is great! I was just using it minutes ago to look at how a performance fix put in last week has improved things.

Clone Database with Query Store – If you want to see how your queries would behave in your production environment then Clone Database copies out the stats used to generate the plan. With SQL 2016 you can also a copy of your query store data so you can query away at it happily in your dev environment.

Capture the most expensive queries across your instance with Query Store – blowing my own trumpet a bit here, but I’ve used this script a lot since I wrote it.

That’s it for now – at least for tools I’ve written about – I look forward to seeing what the rest of the T-SQLT community comes up with.

T-SQL Tuesday #101. CMS – Effortlessly run queries against multiple SQL Servers at once

tsql2sday150x150

This month for T-SQL Tuesday #101 Jens Vestergaard asks us to blog about the essential tools in our SQL Toolbelt.

http://t-sql.dk/?p=1947

The concept of a Central Management Server (CMS) is one I’ve been meaning to blog about for a while – just because I get the impression not a lot of people know about it or use it even though it’s been around since at least SQL 2008.

A central management server is simply one of your SQL instances on which you can register a group or groups of other instances. You can then run queries against those groups as a set, it’s a very easy way of being able to run diagnostic queries – or deploy common objects across a group.

There are other ways of performing these sorts of tasks – such as using Powershell, but with a CMS it’s really quick and easy, and all you have to know is SQL.

I’ll run through how you set one up, then give a couple of examples of using it.

Setting up your CMS

Go to the View menu in SSMS and select Registered Servers.

Then in the Registered Servers window, expand Database Engine, right-click Central Management Servers and click Register Central Management Server.

CMS1

In the New Server Registration dialog select the instance you want to use as a Central Management Server.

CMS2

You just need to put in the connection details – though you can also give it a friendly name in the Registered server name field.

It’s worth noting that this server isn’t going to be doing any particularly heavy lifting, it’s just going to store registrations for other servers, so there’s no need to use a dedicated instance for this.

You’ll now see this instance listed under Central Management Servers in the Registered Servers window.

You can right-click on it and either register individual servers underneath it or create one or more groups beneath which you can register servers. You can even nest groups if you want.

Here’s what it looks like after I’ve registered a few servers / groups:

CMS3

And that’s it setup – told you it was easy.

Running Queries against multiple instances

This is the clever bit. I can run a query against an individual SQL instance or against a group and any sub-groups.

I want to run something against all my instances in both groups, so I right-click on My CMS Server and select New Query.

CMS4

Let’s have a look at the query window as there’s some differences:

CMS5

You can see the status bar at the bottom is pink, that tells me I’m connected to multiple instances via a CMS. On the bottom left you can see I’m connected to 6 out of 6, i.e. I tried to connect to 6 instances and all are successfully connected.

Now let’s run a query. Suppose I want to check MAXDOP across my instances:

SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';

Here’s the results:

CMS6

You can see the queries against each instance have been combined into one result set and a column added for the ServerName (the friendly name I specified).

In the messages tab, we also have the output from each query:

CMS7

There’s no limit to what you can do with this, as long as you can express what you want to do in T-SQL, you can execute it easily across as many instances as you want.

I use this for deploying updated versions of Ola Hallengren’s procedures across multiple instances, or the First Responder Toolkit from Brent Ozar Inc.

I use this for running checks across multiple instances like the MAXDOP one above. Earlier, I used it to check for any databases that didn’t have PAGE_VERIFY set to CHECKSUM, and then I used it to enable that across the databases where it wasn’t set.

I’ll finish with another query to change my server configurations. Having run the first one to look at MAXDOP I’ve decided I want to set it to 4 for all the servers where it’s not set:

IF (SELECT value_in_use FROM sys.configurations 
    WHERE name = 'max degree of parallelism') = 0
BEGIN
   EXEC sp_configure 'show advanced options', 1;  
   RECONFIGURE;  
   EXEC sp_configure 'max degree of parallelism', 4;  
   RECONFIGURE;  
END

Here’s the output from the messages tab:

CMS8

(The really sharp eyed amongst you will notice SQL2012_Local is missing, that’s because I tested the query on that one first).

And if I run the original query again to check the settings:

SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';

CMS9

Sorted!

Summary

Content Management Servers are an exceedingly simple way of performing admin tasks or checks against a whole bunch of SQL Instances at once.

Once you’ve set it up, anyone else with access can connect to it run queries – or add other servers and groups.

There’s still a place for Powershell, particularly in automation of tasks. But when you’re doing tasks that are a bit more ad-hoc then using a CMS is perfect.