Why you still need DBA resource in the cloud

I was chatting with a cloud consultant who was advising on a large-scale migration to AWS. He told me that one of the advantages of going for a PaaS (Platform as a Service) offering was that DBAs were no longer required, as backups and restores were handled for you. PaaS offerings for SQL Server include AWS RDS, Azure SQL Database and Azure SQL Managed Instance.

I found it quite a funny conversation, partly as I don’t think he realised being a DBA was part of my job role, but also because I don’t know a single DBA who spends a significant amount of their time doing backups and restores.

It’s true that the services mentioned handle a lot of the routine tasks for you, and that is great. It’s also reasonable to hope that they manage databases with greater reliability. Rather than replacing the DBA though, they free them up to achieve more.

One common pitfall I’ve seen with cloud implementations is to assume that everything is done for you, which can lead to poorly configured services or ones with missing maintenance. It remains important to have access to someone with a deep understanding of SQL Server when provisioning a cloud service to host your databases. It’s also important that they understand the cloud options available and can make sure you have the right settings, instance type and so on.

And once things are running you still need someone to turn to when there are problems. These problems are less likely to be hardware related, but you are still going to see occasional performance (or other) issues, just as you would when you were running on-premises hardware.

When those things happen you need access to someone with the right skills to diagnose the problem and perform analysis to define the right solution. Enter the cloud DBA.

As mentioned in my previous post The Importance of SQL Server Performance Tuning in the Cloud, ongoing performance tuning becomes even more important in the cloud: database services are expensive, a performance-optimized database is also a cost-optimized one, and you can realise savings immediately.

Brent Ozar talks about the importance of DBAs in the cloud in this post https://www.brentozar.com/archive/2019/06/what-happens-to-dbas-when-we-move-to-the-cloud/. He tells a story that I’m going to quote here:

One of my favorite recent examples was a company who came to me saying, “We’re spending about $2M per year in the cloud just on our databases alone. Can you help us reduce those costs?” Absolutely: with just a couple of days spent query & index tuning, we chopped their biggest database expenses in half while increasing performance.

At the end of that engagement, the CTO told me, “I thought I’d save money in the cloud by not having a DBA, but what I’m learning is that in the cloud, I actually get a return on my DBA investments.”

And it’s not all about performance. Someone recently reached out to me to say that they had mysteriously “lost” about 400GB of free space overnight on an AWS RDS instance running SQL Server, and wanted help identifying the cause. That led to identifying an issue relating to a recent code change that needed fixing.

Of course it’s possible you might not need a full-time DBA in the cloud – or otherwise. That’s where it’s useful to have access to someone who can jump in to perform that role as required. Ideally as they work with you over time they will understand your systems and be able to talk in the context of your applications when discussing problems and solutions.

I can help you with all that. You can find more details of my services here or simply get in touch to discuss what I can do for you.

Measuring SQL Query Performance

When running workshops on the subject of SQL query performance, the first question I always ask is: how do we measure the performance of a query? Or to look at it another way – how do we measure the resources it consumes?

Attendees have come up with a range of answers from “With a stopwatch” (which I like a lot) to the slightly more technical “Using Profiler”. I like the first answer just because that’s where we all start. We run something from an application and we literally time how long it takes, or we run something in SSMS and we use the counter near the bottom right of our query window to tell us how long it took. We may even run something 100 times in a loop and capture the overall time so we can take an average.

SQL Server Profiler is also good when trying to capture what is actually happening in a given environment.

But there is a better way…

Using STATISTICS commands

When we’re working on a query in SSMS there is an easier, more accurate, and more detailed approach than any of the others available, which is to use the following two commands:

SET STATISTICS TIME ON;

This command tells SQL Server to report the CPU time consumed and the elapsed time for any queries you subsequently run in that query window. The output is sent to the Messages tab in SSMS.

SET STATISTICS IO ON;

This command tells SQL Server to report the amount of data read from disk or memory by any subsequent queries in that window. Again, the output is sent to the Messages tab.

Let’s look at some quick examples. For these I’ll use the StackOverflow database (here’s a post from Brent Ozar on how you can download this database for your own use https://www.brentozar.com/archive/2015/10/how-to-download-the-stack-overflow-database-via-bittorrent/):

SET STATISTICS TIME ON;
SELECT COUNT(*) FROM dbo.Users;

Output:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
(1 row(s) affected)

SQL Server Execution Times:
CPU time = 422 ms,  elapsed time = 2296 ms.

You can see we now have a fairly accurate measure of how long that query took to run (and how long it took to parse and compile). The CPU time is how much CPU was used; the elapsed time is how long the query took overall. You’ll see the elapsed time is much longer than the CPU time, which generally means SQL Server was waiting for something for the rest of the time – perhaps for data to be read from disk, or perhaps something else (like locking and blocking).

Now let’s look at STATISTICS IO for the same query.

SET STATISTICS IO ON;
SELECT COUNT(*) FROM dbo.Users;

Output:
Table 'Users'. Scan count 5, logical reads 24914, physical reads 1, read-ahead reads 24770, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

You’ll see that there’s a bit more info here. Let’s run through each piece in order:

  • Table – First we have the name of the table the reads are being generated from – when we have a bunch of tables referenced this can help us to work out where to focus our tuning efforts.
  • Scan Count – In simple terms this is the number of times the table was accessed. This isn’t always a massively useful metric. It can also be zero in some specific scenarios, such as seeking for one specific value in a clustered index.
  • Logical Reads – This is the number of data “pages” read from memory. A data page is 8KB in size, so this query has read about 200MB from the Users table (which is basically the whole table).
  • Physical Reads – This is the number of data pages read from disk.
  • Read-Ahead Reads – These are also physical reads. In this case pages read from disk, but in theory read ahead of time e.g. while the processor is doing something else and doesn’t need them yet. In practice SQL has started to read these before it needs them, but is still reading them as the query is consuming them so we do end up waiting for the disk to supply most of the data. If we look back at the CPU time and elapsed time above, it is a pretty reasonable assumption that the time taken up waiting for these physical reads accounts for much of the difference between the CPU time and elapsed time.
  • LOB – The columns are repeated for LOB data, which is “Large Object” data such as varbinary(max) columns and other such “blobs”. These get stored outside of the main data page. We’re not going to worry about these too much for the moment.

Physical vs Logical Reads

It is worth knowing that SQL Server (in general) will try to retain data it has read in memory for as long as possible. This is because logical reads (from memory) are much, much faster than physical reads (from disk). You may have heard people saying that SQL Server is very aggressive about using memory, or heard complaints like “SQL Server is consuming all the memory on the server”. This is why. SQL Server wants to do as much querying as possible from memory and save on those expensive physical reads. We allocate a maximum amount of memory to SQL Server and over time it will use as much of that allowance as it can. This is also one of the reasons why querying slows down after you restart SQL Server: all your queries go back to reading from disk until the server has that data in memory once more.
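As an aside, if you’re curious how much of the buffer cache each of your databases is currently using, you can get a rough idea from the sys.dm_os_buffer_descriptors DMV, which has one row per 8KB page held in memory. A quick sketch:

-- Approximate buffer cache usage per database (one row in the DMV = one 8KB page)
SELECT
   DB_NAME(database_id) AS DatabaseName,
   COUNT(*) * 8 / 1024 AS BufferCacheMB
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY BufferCacheMB DESC;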

It’s also good to understand that in the numbers above SQL Server is not reading one set of data from disk and another set from memory. Rather it is a two stage process, required data is read from disk into memory (the physical/read-ahead reads) and then the whole set is read from memory (the logical reads). So the number of logical reads includes the number of pages read first from disk.

We can see that effect if we run the query again. This time I’ll include both commands:

SET STATISTICS TIME ON;
SET STATISTICS IO ON;
SELECT COUNT(*) FROM dbo.Users;

Output:
Table 'Users'. Scan count 5, logical reads 24982, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 546 ms,  elapsed time = 164 ms.

You’ll see our physical reads (including read-ahead reads) have now dropped to zero. This is because we’ve already read this data from disk the first time we ran the query and SQL has retained the data in memory. Logical reads have stayed about the same. You’ll hopefully also see that our CPU time is reasonably similar at around 500ms, but our elapsed time has dropped dramatically as we no longer had to wait for slow reads from disk.

You might also notice that the elapsed time is now actually shorter than the CPU time, which can be a bit confusing. The query has consumed 546ms of CPU time but only took 164ms from beginning to end. How can this be? The answer is that our query has gone parallel over several CPU cores. Counting all the records in a large table is actually a pretty resource-hungry operation as you can see, so SQL has decided to save time by breaking the task into pieces and running those sub-tasks concurrently.

I should probably add a caveat here in that I have actually seen STATISTICS TIME return results where the elapsed time is marginally shorter than the CPU time even when the query is single-threaded. I queried this with a performance tuning specialist from Microsoft who I was working with at one of our clients and he admitted that these figures weren’t always 100% accurate – they are however the most accurate you’ll get.

Flushing Data From Memory

You might sometimes want to see how your query would perform if it had to read all its data from disk, which may be more representative of how it will behave in the wild. In that case you can run the following command, which will flush all the data from memory (so don’t run it on a production environment!).

DBCC DROPCLEANBUFFERS;

If you execute this and then re-run the above query and capture the IO, you’ll see that the physical/read-ahead reads are back.

In SQL Server the area of memory where data is stored is referred to as the Buffer Cache. Within that Cache we have clean pages where the data in memory is the same as the data on disk, and dirty pages which are pages that contain modifications that have yet to be written to disk. The above command only flushes (or drops) the clean pages (or buffers). Just in case you were wondering why it was called DROPCLEANBUFFERS…
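If you want to be thorough in a test environment, it’s common to issue a CHECKPOINT first so that any dirty pages are written to disk (becoming clean pages) and can then be dropped – for example:

-- Write dirty pages to disk so they become clean pages...
CHECKPOINT;
-- ...then drop all clean pages from the buffer cache (test environments only!)
DBCC DROPCLEANBUFFERS;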

A Slightly More Complex Example

The above example with just the one table should make it clear what these commands do, but let’s look at a query that goes across a couple of tables and does a bit more work so you can see the sort of output you’ll get:

SET STATISTICS TIME ON;
SET STATISTICS IO ON;
SELECT TOP 10
   u.Id,
   u.DisplayName,
   COUNT(p.Id) AS NumberOfPosts
FROM dbo.Users u
INNER JOIN dbo.Posts p
   ON u.Id = p.OwnerUserId
GROUP BY u.Id, u.DisplayName
ORDER BY NumberOfPosts DESC;

Output:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 54 ms.
(10 row(s) affected)

Table 'Posts'. Scan count 5, logical reads 51352, physical reads 1, read-ahead reads 51192, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Users'. Scan count 0, logical reads 94, physical reads 45, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 336, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 8002 ms,  elapsed time = 5597 ms.

You can see here that the IO for both tables referenced in the query has been captured. This is very useful in telling us where the bulk of the work is being done.

Notable here also are the references to “Worktable” and “Workfile”. These often appear in the output to STATISTICS IO for more complex queries. In general it is worth thinking of these as internal objects SQL uses to store temporary resultsets while processing your query. The difference between the two is the type of resultsets they store and for which particular operations. This can make it a bit more challenging to work out which part of your query is generating the most work when the highest number of reads appear against these. You end up needing to look at the execution plan and trying to work out what operations these could be related to. We may get to more of that in a later post.

Using this information to Tune your Queries

To close, I just want to mention how you should use this information to focus your efforts when tuning queries. Intuitively we all think in terms of time when we think in terms of query performance, but in practice it is much better to focus on the output of STATISTICS IO when you are trying to make improvements.

The IO gives you much more information, and while figures for both CPU and IO will vary depending on what is going on on the server, the IO figures are more static and therefore you can have more confidence that changes you see are the results of your tuning efforts. It also allows you to focus on tables one at a time as you tune multi-table queries.

Both IO and TIME are measures of work being done on a server. If you can reduce the number of reads generated by your query then that is generally the best measure of improvement and will likely have similar benefits to the overall time taken by the query.

So in your tuning efforts, focus on the logical and physical reads and try to get the totals for these down as low as possible. Generally you’ll start with operations on the table that generates the highest number of reads and go on from there.

Need Help?

Got a problem and could do with some assistance? Get in touch today to see how I can help.

Top Tools for SQL Server Performance Troubleshooting

When you’ve got the symptoms of a database issue you can run a series of diagnostic queries to try and drill down on the problem and then start figuring out solutions. There are a number of free tools out there though that can speed up that process immensely. In this post we’re going to look at my favourites.

In no particular order…

sp_WhoIsActive

sp_WhoIsActive by Adam Machanic is great for the situation where you have a problem right now. It’s versatile and supports a number of parameters for execution but most of the time I run it with just the defaults and it tells me what I need to know. The resultset returned shows you what is executing on your SQL Server without any noise from idle connections etc.

From the output you can quickly identify long running, or CPU consuming queries. You can also identify queries whose execution is blocked and analyse the blocking chain to find the root blocker.

sp_WhoIsActive is a stored procedure that you install on your database instance, either in the master database or another database of your choosing.

You can download it and find the documentation here.
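As a quick illustration, once it’s installed you might call it like this – the defaults are usually enough, and the optional parameters shown are ones documented by sp_WhoIsActive itself (check the documentation for the full list):

-- Show what's executing right now, with the defaults
EXEC dbo.sp_WhoIsActive;

-- Include query plans and count how many sessions each session is blocking
EXEC dbo.sp_WhoIsActive @get_plans = 1, @find_block_leaders = 1;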

Query Store

Query Store is built-in functionality that has existed in SQL Server since 2016. Once enabled, it captures query execution information over time so you can see what was running in a particular time window, identify your most resource-intensive queries, and spot regressed queries (queries that were good but have gone bad). You can also use it to “fix” some issues by forcing a known good execution plan – and in SQL Server 2022 you can use it to force query hints too.

I’ve blogged a fair bit about Query Store and my Introduction to SQL Server Query Store post is one of my most popular posts. Give it a read if Query Store is unfamiliar to you.

Query Store isn’t going to help you if you haven’t turned it on, and it’s a setting you should consider enabling by default for each database. From SQL Server 2022 it is enabled by default for new databases.
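Turning it on for an existing database is a one-liner – a minimal example (the database name is just a placeholder):

-- Enable Query Store and allow it to capture data
ALTER DATABASE [YourDatabase] SET QUERY_STORE = ON (OPERATION_MODE = READ_WRITE);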

The only downside with Query Store is that it is a per-database view and if you have a lot of databases it won’t easily help you determine which one a problem is coming from. That’s why I wrote the script Identify the (Top 20) most expensive queries across your SQL Server using Query Store.

Paul Randal’s Wait Stats script

This is a great script for identifying what exactly is troubling your SQL Server. Is it CPU, is it parallelism, is it blocking, is it memory? It analyses the aggregated waits (what SQL Server has been waiting on) since the last server restart. This sort of analysis is often the first thing you should be looking at when you have a problem. Paul used to work for Microsoft in the SQL Server team and is one of the foremost experts on all things SQL Server. You can get the script here, along with comprehensive documentation on most of the wait types you are likely to find.

Brent Ozar’s First Responder Kit

Originally written by Brent Ozar (he has since open-sourced it) the First Responder Kit covers a lot of ground. It’s a set of stored procedures you can use in different scenarios. Like sp_WhoIsActive, you can install them in the master database or another database of your choosing. The most useful for problems that you are in the middle of is sp_BlitzFirst which encapsulates some of the other items above. It includes:

  • Identifying currently long running, high CPU or blocked queries.
  • Analysing wait stats.
  • Checking for currently executing SQL Server agent jobs.
  • Looking at perfmon counters.

As it covers a number of areas it is (as the name suggests) what you should consider running first when you have a problem.

Another very useful stored procedure in the toolkit is sp_Blitz, which gives an overall healthcheck of your server and is a useful starting point for identifying all sorts of things that are going wrong or aren’t correctly configured.
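Both procedures are straightforward to run with their defaults once installed – for example (assuming you’ve installed them in master):

-- What's hurting the server right now?
EXEC dbo.sp_BlitzFirst;

-- Overall server health check
EXEC dbo.sp_Blitz;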

You can get the First Responder kit here.

Need Help?

Got a SQL Server performance problem and need some assistance? Get in touch today to discuss how I can help you.

TDE: Database Backups and High Availability

Database backups continue to work without change when you have TDE enabled. The only difference is that the backups contain encrypted data that cannot be read without the certificate and private key. There are a couple of points that are worth discussing though.

Backup Performance

Following on from general questions of TDE performance, it’s sensible that you might also be concerned whether TDE has an impact on backup times. You may also have read of people complaining about long backup times with TDE.

It’s not necessarily true that TDE has much of an impact on backup performance. The reason is that when a backup is performed, SQL Server does not have to encrypt the data. The data already sits encrypted on disk in the data and log files, and those copies of the data are what are used for performing the backup. In practice there may be some data in memory that has yet to be encrypted and written to disk, but in general, that is not going to be large enough to cause significant overhead.

When people talk about issues with backup performance and TDE, they are likely to be talking about the case involving backup compression.

Backup Compression

Many people use backup compression with database backups in SQL Server. It is simple functionality to use as shown in this code example:

BACKUP DATABASE [TestTDE] TO DISK = 'C:\Test\TestTDE_Compressed.bak' WITH COMPRESSION;

The benefit of backup compression isn’t just about having smaller backup files but also the time taken to perform a backup. The biggest bottleneck involved in taking a backup is usually the time it takes to write it to disk. By taking compressed backups you can significantly reduce backup times. This comes at the cost of some extra CPU overhead to perform the compression, but unless your CPU is under pressure, it’s often worthwhile.

Up until the 2016 version, SQL Server did not support backup compression on TDE enabled databases. One reason for this may be that most compression algorithms work best where there is some repetition in the data to be compressed, but encrypted data looks pretty much random. What this meant in practice was that you might specify the WITH COMPRESSION option when backing up your TDE-protected databases but you wouldn’t see much difference in the file size or backup times. This changed from SQL 2016 and was a welcome improvement.

To use backup compression with TDE, however, you needed to specify an additional parameter MAXTRANSFERSIZE. This parameter specifies the largest unit of transfer in bytes used between SQL Server and the backup media. If you’re interested in fine-tuning your backup performance, this is one value you can play with. Backup compression with TDE doesn’t kick in unless your MAXTRANSFERSIZE is greater than 64kb (65536). As long as the value you specify is at least one greater than 64k, then an optimized algorithm for compression of TDE encrypted databases is enabled. Commonly people use the value of 128kb. The command looks like this:

BACKUP DATABASE TestTDE TO DISK = 'C:\Test\TestTDE_Compressed.bak'
WITH COMPRESSION, MAXTRANSFERSIZE = 131072;

This extra parameter becomes unnecessary if you are on SQL Server 2019 Cumulative Update 5 or higher. With that release, if you specify WITH COMPRESSION for a backup taken for a TDE-protected database and you don’t specify MAXTRANSFERSIZE, then MAXTRANSFERSIZE will automatically be increased to 128kb, and your backup will be compressed.

Backup Compression Issues

The introduction of backup compression for TDE-protected databases has however not been without problems, and this is something you really need to be aware of. There have been a number of bugs discovered where a compressed backup of a TDE database was found to be unrecoverable. Some people have also reported that restore times were massively increased in some cases.

If you’re on a version of SQL Server higher than 2016 CU 7 or 2016 SP1 CU4, then you should be fine, but I would stress the importance of regularly testing your backups by restoring them. A few days before writing this, I came across the term Schrodinger’s Backup – the condition of any backup is unknown until a restore is attempted. When it comes to TDE and backup compression, you should consider that as a very true statement.

TDE and High Availability

In general, TDE plays nicely with any of the built-in features that SQL Server has for high availability (HA). That includes:

  • Availability Groups
  • Log Shipping
  • Database Mirroring

In theory, in all cases, the actions you need to take to support TDE are the same. You just need to ensure that the secondary server has a Database Master Key (DMK). Then you need to ensure that copies of your certificate and private key have been restored to the secondary before you turn encryption on. This is the same step you would take if you were attempting to restore a TDE-protected database to a different server. We covered that in a previous post, Migrating or Recovering a TDE Protected Database.
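As a rough sketch, the preparation on the secondary looks something like the following – the certificate name, file paths and passwords are placeholders and should match whatever you used when backing up the certificate on the primary:

USE master;

-- Create a Database Master Key on the secondary (if one doesn't already exist)
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHere!';

-- Restore the certificate and private key backed up from the primary
CREATE CERTIFICATE MyTDECert
FROM FILE = 'C:\KeyBackups\MyTDECert.cer'
WITH PRIVATE KEY (
   FILE = 'C:\KeyBackups\MyTDECert_PrivateKey.pvk',
   DECRYPTION BY PASSWORD = 'ThePasswordUsedWhenBackingUpTheKey!'
);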

As long as that is done, then whichever HA tool you use should take care of the rest.

In practice, we DBAs are cautious folk, and you don’t want to risk anything going wrong when you are dealing with a live system. As such you may want to take the following steps:

  1. Remove the database from HA.
  2. Set up TDE for the database and turn it on.
  3. Set up the keys and certificate on the secondary.
  4. Add the database back into HA.

This post is part of a comprehensive series on SQL Server Encryption. Subscribe to my blog for updates as new posts are published or you can buy my book through the link in the sidebar to get it all in one go.

And if you’re embarking on an encryption project (or anything else to do with SQL Server) and want some help, I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

The Importance of SQL Server Performance Tuning in the Cloud

SQL Server performance tuning is always important, and I can list off reasons why:

  1. Improved application response.
  2. Identifying issues before they become outages.
  3. Potential cost savings.

In terms of the cost savings, however, that is always a little speculative when running on-premises. I can take your 8-core box running at 70% CPU at peak and get that down to 35%. Then you could potentially migrate to a smaller box with 4 cores and save up to $7,000 a year for Standard edition or about $26,000 for Enterprise. If you’re running on bigger kit then the potential savings are much larger. Often though SQL Server runs on physical machines, so downsizing isn’t that straightforward and there isn’t the appetite for migrating to different kit.

In the cloud it’s different: you can change the size of the instance running your SQL Server quickly and easily with no real risk, so cost savings can be realised immediately. That applies whether you’re running SQL Server on cloud virtual machines or using a Platform as a Service (PaaS) offering such as Azure SQL Database or Amazon RDS. If you’re running the serverless option on Azure SQL Database then you can start seeing savings without even having to change the configuration of the instance.

Let’s look at some figures showing the potential cost savings. The table below shows the cost of running 4, 8 or 16 vCPU instances in Azure and AWS. For Azure I’ve chosen provisioned compute on the General Purpose Service Tier running on Standard hardware. For AWS I’ve chosen M5 as the instance type with SQL Server Standard edition and Multi-AZ for high availability. The figures shown are on-demand costs per year:

Number of vCPUs | Azure SQL DB | AWS RDS
16              | $35,346      | $88,423
8               | $17,673      | $44,641
4               | $8,836       | $21,444

Cost per year of running SQL Server in Azure and AWS

You can see the pricing calculators I’ve based these figures on here:

https://azure.microsoft.com/en-gb/pricing/details/azure-sql-database/single/

https://aws.amazon.com/rds/sqlserver/pricing/?nc=sn&loc=4

As a side note, we can see that AWS is a lot more expensive. There are a number of reasons why I see Azure as a better fit for running SQL Server workloads and price is one. That said, the database isn’t the only factor in choosing your cloud provider so it’s perfectly legitimate that you might be running on AWS.

When I’m asked to look at the performance of a SQL instance, it’s rare that I can’t cut its resource consumption in half, and often we can achieve much more. You can see the potential cost savings are significant. One large project I worked on recently called me in when the application was struggling to perform running on 32 vCPUs in AWS RDS. I identified changes that meant it performed perfectly running on 16 vCPUs, and soon after we were able to move it down to 8. A few more rounds of optimisation and now it can run on 4. That represented a saving of over $150,000 annually.

Get in touch today to discuss how I can help you with your cloud SQL Server performance and save you a lot of money year on year.

Impact of TDE on Performance

Microsoft states that enabling TDE usually has a performance overhead of 2–4%. That doesn’t sound like very much, and personally I wouldn’t let it bother me if I want to make sure my data is encrypted at rest. However, you may have heard other sources saying that it’s actually a lot more than that – and the performance impact is a high price to pay for the level of protection offered. So, what’s the truth?

Where Do We See an Overhead?

When we talk about performance, we are likely to be concerned about two things. One is the impact on query performance. Are my queries going to execute slower with TDE enabled? The other is what overall pressure is going to be added to the server.

The important point to start with is in understanding where and how TDE adds overhead. Encryption occurs as data is written to disk, and decryption occurs as data is read from disk. Each of those activities uses CPU. So, the CPU overhead added by TDE is going to be in proportion to your disk activity. If you have a system that is heavy on IO, then there is going to be more CPU overhead.

SQL Server tries to keep data that is referenced repeatedly in memory (the buffer pool). So, if your SQL instance is provisioned with enough memory, a lot of your read queries can access the buffer pool and don’t have to go out to disk. Such queries should not be affected performance-wise by TDE. There may be other read queries however that access older data that hasn’t been read for a while, and these queries would need to retrieve that data from disk and so there would be an overhead from TDE.

Any queries that modify data will need the outcome to be written to disk, so in these cases we will see an overhead. This overhead is likely to come in two parts: first when the transaction is written to the log file before committing and then later as the updated data gets written to the data file as part of a checkpoint operation.

We also have other operations that write or update encrypted data on disk, so we would also expect these to have some overhead. This includes operations such as index rebuilds.

You can see from this that the overhead will very much depend on how your application interacts with your data. At one extreme, if you have a set of static data that is small enough to be held in memory and is queried regularly, then there should be no overhead. At the other end of the spectrum, if you have an application that writes to the database a lot and reads less often, then the overhead will be higher.

How to Estimate the Performance Impact for Your Server?

Performance impact for your queries is going to be very much on a case-by-case basis, but in reality, it’s generally likely to be quite small. The reason for that is that, as discussed, we’re only going to see extra CPU requirements when our query needs to access the disk. Reading from and writing to disk is itself an activity that takes time, and even with the fastest disks, encryption/decryption is likely to take no longer than the disk access time. The encryption activities can usually be carried out in parallel to the disk activity, so you don’t see much increased time to read or write data. We’ll see an example of that shortly when we look at how you can get an idea of likely overhead on your server.

In terms of estimating overall overhead on your server, you need to understand the level of IO on the server as well as how well encryption will perform on the box.

Let’s work through an exercise to get an idea of the sort of numbers we might be talking about. For this, we’re going to need the database we created in How Long Will it Take to Encrypt Your Existing Data with TDE? that has about 10GB of data in a single table. We’ll also need a database that has the same set of data but without encryption turned on so we can get comparison figures. You can create that using the same scripts – just don’t run the final step of turning encryption on. We’ll call that database TestTDEOff.
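If you want to double-check which of the two databases actually has encryption enabled before running the tests, you can query sys.databases:

-- is_encrypted = 1 means TDE is enabled for the database
SELECT name, is_encrypted
FROM sys.databases
WHERE name IN ('TestTDE', 'TestTDEOff');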

We’re first going to run a query that will force SQL Server to read all the data in a table. We’ll repeat that across four scenarios:

  • TDE-protected database where the buffer cache is empty, so all data has to be read from disk
  • TDE-protected database where all the data for the table is in the buffer cache, so no data has to be read from disk
  • Database without TDE where the buffer cache is empty, so all data has to be read from disk
  • Database without TDE where all the data for the table is in the buffer cache, so no data has to be read from disk

Here is our query:

DBCC DROPCLEANBUFFERS;
SET STATISTICS IO, TIME ON;
SELECT *
FROM dbo.SomeData
WHERE Id = 100000000;
SELECT *
FROM dbo.SomeData
WHERE Id = 100000000;

The DBCC DROPCLEANBUFFERS command flushes all data held in the buffer cache. You won’t want to do this on a live system as it will affect performance, but if you have a server with similar hardware, you can run this to get an idea of how encryption performs.

The test runs the same select statement twice, once with no data loaded into memory and a second time once data has been loaded by the first run. We use the SET STATISTICS command to output information about performance to the messages tab in SSMS. The table we are querying from is a heap and has no indexes defined, so SQL Server has no option but to scan the whole table in order to generate the result set.

Let’s look at an extract of the information outputted by STATISTICS IO, TIME to see what we’re getting. This is for the database with TDE enabled:

Table 'SomeData'. Scan count 13, logical reads 1204820, physical reads 0, page server reads 0, read-ahead reads 1203777, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

SQL Server Execution Times:

CPU time = 10046 ms, elapsed time = 5580 ms.

Table 'SomeData'. Scan count 13, logical reads 1204820, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

SQL Server Execution Times:

CPU time = 12407 ms, elapsed time = 1050 ms.

We have two sets of output here: one for the first run where there was no data loaded into memory and one for the second once the data was loaded. The key difference is that in the first we have a large number of “read-ahead” reads, which are where data is read from disk. Read-ahead refers to the fact that they are read in parallel with the processing, rather than all needing to be read before the CPU can get to work. In the second output we only have “logical” reads where data is read from memory.

You get a reasonable amount of variance in the CPU and elapsed times when running such tests, so I executed the query five times against each database, averaged the results, and rounded off to the nearest 100ms. The figures are shown below.

                                      | CPU time (ms) | Elapsed time (ms)
TDE with all data read from disk      | 10,600        | 5,600
No TDE with all data read from disk   | 7,600         | 5,700
TDE with all data read from memory    | 12,400        | 1,100
No TDE with all data read from memory | 12,200        | 1,100

Due to the variance between test runs, we’ll ignore small differences. There are a few key takeaways:

  • The elapsed time was about the same with and without TDE.
  • The CPU consumption was about the same where data was read from memory.
  • When reading from disk, there was a higher CPU consumption when TDE was enabled.

That is about what we would expect; TDE only adds overhead when reading or writing to disk. When we were reading from disk, my disk was being accessed at full speed and the disk access time was the bottleneck, so the decryption required by TDE was easily able to complete while that was occurring.

In terms of what this shows regarding the performance impact on a production server, there are a few ways you can think about the data.

The scariest way of looking at it – and not necessarily the correct one – is to focus on the fact that when reading from disk TDE added about 3 seconds of CPU. That was about a 40% increase. The reason that’s not going to be the impact you see in live though is that (hopefully) most of the data accessed by SQL Server is already going to be sitting in memory and so will not be affected. Still, I might envision that I’m going to see between a zero and 40% impact.

Another way to look at it is to realize that my disk was maxed out during this test, loading about 2GB of data per second (actually slightly less, but we’ll stick with round numbers). I can calculate that during every second of execution about an extra half a second of CPU power was consumed by the TDE decryption. That equates to half a CPU core being busy. My machine has 12 cores, so that’s about 4% of physical CPU overhead added. Running the Windows Performance Monitor (perfmon) during the tests, I can see that is about right. If I only had four cores in this box, I’d be using the same half a core, so that would be about 12.5%. It couldn’t go any higher though, because my disks are already maxed out. I’d have to be able to physically read data from disk quicker in order to create more TDE overhead. On this box, I can see that decrypting data with TDE costs me about 0.3 seconds of CPU per GB of data.

Where this leaves us is that the best way to get an idea on TDE impact on a particular server is to look at the level of disk access and the number of CPU cores. You can look at how much disk access you have (to the relevant database data and log files only) at peak times and get an idea from there. Hopefully you can see that to add a significant percentage of CPU overhead with TDE, you’re going to need to be reading a huge amount of data from disk, have superfast disks, and not have a lot of CPU power.

If you are able to run through something similar to this exercise to benchmark a production system – and you come up with numbers that worry you – I’m going to suggest that you might have an excessive amount of disk access, and that might be indicative of a problem. In particular, you may want to consider whether the server would benefit from having more memory so that data can be held in the buffer cache for longer – and not read from disk so often. Also, are queries being forced to scan whole tables due to a lack of adequate indexes – or because they are poorly written?

This post is part of a comprehensive series on SQL Server Encryption. Subscribe to my blog for updates as new posts are published or you can buy my book through the link in the sidebar to get it all in one go.

And if you’re embarking on an encryption project (or anything else to do with SQL Server) and want some help, I’m available for consulting. Please get in touch to discuss what I can do for you, or check out my services page to see what kind of things I offer.

What is calling my T-SQL scalar function (millions of times)?

In this post we look at a method using Extended Events (XE) to identify what parent objects are calling a given SQL function and how often.

The background is that I was working with a team where we identified that a certain scalar function was being executed billions of times a day and – although lightweight for a single execution – overall it was consuming significant CPU on the server. We discussed a way of improving things, but it required changing the code that called it. The problem was that the function was used in about 700 different places across the database code – both in stored procedures and views – though the views themselves would then be referenced by other stored procedures. Rather than update all the code, they’d like to target first the objects that execute the function the most times.

The method I show here will identify the root calling object in each case and calculate the number of executions of the function that are instigated. It comes with a few caveats. I have to capture an event for all module calls made on the database – as I don’t know which ones will result in a call to the function – so there’s likely to be CPU overhead, and if you have a large number of calls occurring there could be a fair amount of disk consumption. In my test the Extended Events (XE) session consumed about 130MB of disk space for a million calls captured. In my case I’m talking billions of calls a day, so that would be hundreds of GB. As such I plan to only run it for a small interval, starting with a minute to see what I get, and I’d prefer to run it in a test environment with a representative load.

The method is going to use the module_start extended event and the causality tracking feature that allows you to track how events are related to each other. Let’s go ahead and create the session. First I click on Extended Events Sessions in SSMS then right-click and select “New Session”.

In the first page that comes up I just give the session a name and select to turn Causality Tracking on:

Then I click on events and select the event I want. In this case it is just the module_start event and I’m not applying any filters:

Finally I configure my data storage. For this example I’ve just added a ring_buffer (memory) and an event_file allowing up to 10GB of storage. When I run this for real I may need to allow quite a bit more storage.

Now we’re done and I can just click OK.
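If you’d rather script the session than click through the GUI, the equivalent T-SQL looks something like this – the session name is just what I called mine, and the file target settings are an approximation of the 10GB allowance described above rather than the exact script SSMS would generate:

CREATE EVENT SESSION [FunctionExecution] ON SERVER
ADD EVENT sqlserver.module_start
ADD TARGET package0.ring_buffer,
ADD TARGET package0.event_file (SET filename = N'FunctionExecution', max_file_size = 10240)
WITH (TRACK_CAUSALITY = ON);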

Then I can go to the XE session in SSMS, right-click and select “Start Session”.

In my test database I’m now going to run a few stored procedures that execute my function a number of times. The idea is that I’ll then be able to use the XE data to work out which stored procedure called it the most and that therefore (in a real scenario) I might want to focus on fixing first.

Having run my tests I can then stop the XE session and export the results to a table. I show the method for that in a previous post:

https://matthewmcgiffen.com/2018/08/08/exporting-extended-events-session-data-to-a-table/

I’ve exported the results into a new table called FunctionExecution in the same database. I show the structure of the table you get here:

The columns I’m interested in for this analysis are object_name (the name of the code object), attach_activity_id.guid (this comes from the causality tracking and will be the same for all objects in the same call stack) and attach_activity_id.seq (which shows the order in which objects were called).

In order to make the analysis quicker I also want to add a couple of indexes. The object_name column is an nvarchar(max), which won’t support indexing, so first I create a new column with a better datatype and copy the values over:

ALTER TABLE dbo.FunctionExecution ADD ObjectName sysname;
UPDATE dbo.FunctionExecution SET ObjectName = CAST([object_name] AS sysname);

Then I create my indexes:

CREATE INDEX IX_ObjectName ON dbo.FunctionExecution(ObjectName) INCLUDE ([attach_activity_id.guid]);
CREATE INDEX IX_GUID ON dbo.FunctionExecution([attach_activity_id.guid]) INCLUDE (ObjectName,[attach_activity_id.seq]);

Finally I am ready to run a query to get the results. I search for the name of the function I am interested in, use attach_activity_id.guid to find other calls in the same call stack, then use attach_activity_id.seq to find the top parent (with a value of 1):

SELECT parent.ObjectName, COUNT(*) AS NumberOfExecutionsOfFunction
FROM dbo.FunctionExecution func
INNER JOIN dbo.FunctionExecution parent
ON func.[attach_activity_id.guid] = parent.[attach_activity_id.guid]
WHERE func.ObjectName = 'NonInlinable'
AND parent.[attach_activity_id.seq] = 1
GROUP BY parent.ObjectName;

Here are the results against the test procedures I executed while the XE session was running:

For this example we can see that I have a stored procedure called ProcedureCallingFunctionALotOfTimes that is responsible for most of the calls to my function – so in theory I’d focus my development efforts there first.

Use this method with caution though. As mentioned at the start, there may be a lot of overhead in capturing all these events on your server – on your own head be it if that causes problems!

Got a problem or embarking on a SQL Server project and want some help and advice? I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

Identify Unused Indexes across all Databases

I had a request regarding how to identify unused indexes across all databases on a SQL Server instance (rather than just the current one). I’ve written this script before so thought I’d post it up here on my blog for the future use of myself and others.

The script uses the undocumented sp_MSforeachdb internal stored procedure to iterate through all the databases (excluding the system ones) and store the results of a standard unused indexes script into a temp table. The table can then be queried to analyse the results.

This script comes with a caveat: the sp_MSforeachdb procedure can be unreliable and on occasion skip databases, so you should check that the results include all the databases you are interested in. A simple count of distinct database names, compared against the number of databases on your instance, should be sufficient.

If you are trying to analyse the case where you have multiple copies of the same application database, e.g. one for each client, you can query the temp table to identify indexes that are unused across all clients (there’s an example query after the script below).

Anyway here’s the script:

--Find unused indexes across all user databases
IF OBJECT_ID('tempdb..#UnusedIndexes') IS NOT NULL
DROP TABLE #UnusedIndexes;
SELECT TOP 0
DB_NAME() AS DatabaseName,
s.name AS SchemaName,
o.name AS TableName,
i.name AS IndexName,
iu.user_updates as IndexUpdates,
iu.user_lookups as UserLookups,
iu.user_seeks AS UserSeeks,
iu.user_scans as UserScans
INTO #UnusedIndexes
FROM sys.dm_db_index_usage_stats iu
INNER JOIN sys.objects o
ON iu.object_id = o.object_id
INNER JOIN sys.schemas s
ON o.schema_id = s.schema_id
INNER JOIN sys.indexes i
ON o.object_id = i.object_id
AND i.index_id = iu.index_id
WHERE i.is_primary_key = 0
AND i.is_unique = 0
AND iu.user_lookups = 0
AND iu.user_scans = 0
AND iu.user_seeks = 0;
EXEC sp_MSforeachdb '
USE [?]
IF DB_ID() > 4
BEGIN
INSERT INTO #UnusedIndexes
SELECT
DB_NAME() AS DatabaseName,
s.name AS SchemaName,
o.name AS TableName,
i.name AS IndexName,
iu.user_updates as IndexUpdates,
iu.user_lookups as UserLookups,
iu.user_seeks AS UserSeeks,
iu.user_scans as UserScans
FROM sys.dm_db_index_usage_stats iu
INNER JOIN sys.objects o
ON iu.object_id = o.object_id
INNER JOIN sys.schemas s
ON o.schema_id = s.schema_id
INNER JOIN sys.indexes i
ON o.object_id = i.object_id
AND i.index_id = iu.index_id
WHERE iu.database_id = DB_ID()

AND i.is_primary_key = 0
AND i.is_unique = 0
AND iu.user_lookups = 0
AND iu.user_scans = 0
AND iu.user_seeks = 0;
END'
SELECT * FROM #UnusedIndexes;
--DROP TABLE #UnusedIndexes;
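And as an example of the follow-up analysis mentioned above, something along these lines (a sketch that assumes each client database shares the same schema) lists indexes that show as unused in every database captured:

-- Indexes that appear unused in all of the captured databases
SELECT SchemaName, TableName, IndexName
FROM #UnusedIndexes
GROUP BY SchemaName, TableName, IndexName
HAVING COUNT(DISTINCT DatabaseName) = (SELECT COUNT(DISTINCT DatabaseName) FROM #UnusedIndexes);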

Note: Partially in response to Jeff’s comment below. Proceed with caution before dropping any indexes the script identifies. Usage stats are only kept since the last restart, so if that was recent then you may have required indexes that get listed. Also think about whether you might have any occasional processes, such as month, quarter or year-end processing, that will not have executed since that last restart and may require indexes.

Got a problem or embarking on a SQL Server project and want some help and advice? I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

Query Store Hints in SQL Server 2022

Another neat little feature in SQL Server 2022 is Query Store Hints. This is the ability to apply a query hint through Query Store rather than having to modify existing code or fiddle around with plan guides.


Query hints are a way to influence optimizer behaviour towards generating desired execution plans for a given query. The word “hint” is a bit of a misnomer as usually they mandate what you wish to happen.


I’m not a big fan of leaping straight away to use query hints as they force the behaviour rather than letting the optimizer work out the best plan for your query. This means you may implement hints that you see as desirable for the current issue you are trying to fix, but they create problems for different executions of the query, or if the data profile changes. In general it’s better to see if you can rewrite the SQL code to achieve what you want, but sometimes that’s not possible and a hint ends up being the best way to go – or at least a good short term solution.


The great thing about applying hints through Query Store is that it can be done without altering application code. That means the DBA can apply a temporary hint while waiting for code to be fixed – or in extremis use them against third-party applications where there is no ability to get the code changed.


I first read about them in Bob Ward’s book SQL Server 2022 Revealed. You can also read about them in Grant Fritchey’s excellent (and long!) book SQL Server 2022 Query Performance Tuning.


Applying a hint is simple as long as the query has a plan stored in query store. First you need to obtain the query_id for the query. One method for doing that is to use the following query I got from this stackexchange post:


SELECT
qsq.query_id,
qsq.last_execution_time,
qsqt.query_sql_text
FROM sys.query_store_query qsq
INNER JOIN sys.query_store_query_text qsqt
ON qsq.query_text_id = qsqt.query_text_id
WHERE
qsqt.query_sql_text LIKE '%your query text%';

Armed with the query_id, you can then apply the hint using the sys.sp_query_store_set_hints stored procedure. For example, if my query_id was 100 and I want to apply the RECOMPILE hint then I could do that with this SQL:


EXEC sys.sp_query_store_set_hints 100, N'OPTION(RECOMPILE)';

You can also view which queries have hints applied using the sys.query_store_hints view.


Finally if you want to clear a hint you have previously applied (such as the one above), you can do that as follows:


EXEC sp_query_store_clear_hints @query_id = 100;

Query Store does not support all query hints. You can see the full list of those supported (and not) at:

https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sys-sp-query-store-set-hints-transact-sql?view=sql-server-ver16

And that’s it – it’s that simple. Thanks for reading!

Got a problem or embarking on a SQL Server project and want some help and advice? I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

Optimized Plan Forcing in SQL Server 2022

I’ve been reading Bob Ward’s book “SQL Server 2022 Revealed” and one of the new features in Intelligent Query Processing (IQP) jumped out at me. Not because it’s going to rock my world but rather it reminded me of an idea I had when I first started playing with Query Store when it came out.

I thought it would be cool if the execution plans cached in Query Store could be used to populate the plan cache after an instance restart, removing the need for all queries to be recompiled. I even raised a suggestion on Microsoft Connect for it to be considered as a new feature.

I won’t claim that anyone at Microsoft saw my post and thought “What a great idea!”, but by coincidence they have implemented in SQL 2022 something similar that works in specific scenarios – Optimized Plan Forcing.

The underlying idea behind it is to store – in Query Store – something called a compilation script that can be used to significantly shorten the compile time for a given query. Any query that runs when Query Store is enabled will get a compilation script stored if it has a significant compile time (the actual threshold is not disclosed). We can see which queries have compilation scripts by looking at the has_compile_replay_script column in the sys.query_store_plan view.
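For example, a simple query along these lines will show which plans in the current database have a compilation script available (and whether they are already forced):

SELECT p.plan_id, p.query_id, p.is_forced_plan, p.has_compile_replay_script
FROM sys.query_store_plan AS p
WHERE p.has_compile_replay_script = 1;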

If the query plan is forced in Query Store then the next time it needs to be compiled the compilation script will be used to build the plan instead of going through the usual query optimization process, greatly reducing the compile time.

Of course this only works for forced plans, but that makes sense, as for other queries we often want a recompile to come up with a new and hopefully better plan. Still, it’s an interesting feature. Bob talks about how it was designed in response to a pattern of performance problems they were calling a “compile storm”. This would be where the plan cache was emptied and then a lot of queries would have to be compiled concurrently, causing CPU spikes. By shortening the compile time for some queries – and of course we DBAs can use Query Store to force as many plans as we like (with caution) – it’s hoped to reduce the occurrence of such events.

I’m not going to go into trying to demonstrate this in action, but Bob has a full demo in his book which you can buy here:

The feature is turned on by default for databases on SQL Server 2022 as long as they have query store enabled – even if they’re not on the latest compatibility level. You can however turn it off for a given database using:

ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZED_PLAN_FORCING = OFF;

You can also turn it off at the level of an individual query, either using the query hint DISABLE_OPTIMIZED_PLAN_FORCING, or if you force the plan using the sp_query_store_force_plan system stored procedure you can disable it using the @disable_optimized_plan_forcing parameter.
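For example, to force a plan while opting that query out of optimized plan forcing – the query and plan ids here are placeholders for the values you’d look up in Query Store:

EXEC sys.sp_query_store_force_plan
   @query_id = 100,
   @plan_id = 1,
   @disable_optimized_plan_forcing = 1;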

Got a problem or embarking on a SQL Server project and want some help and advice? I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.