Microsoft states that enabling TDE usually has a performance overhead of 2–4%. That doesn’t sound like very much, and personally I wouldn’t let it bother me if I want to make sure my data is encrypted at rest. However, you may have heard other sources saying that it’s actually a lot more than that – and the performance impact is a high price to pay for the level of protection offered. So, what’s the truth?
Where Do We See an Overhead?
When we talk about performance, we are likely to be concerned about two things. One is the impact on query performance. Are my queries going to execute slower with TDE enabled? The other is what overall pressure is going to be added to the server.
The important point to start with is in understanding where and how TDE adds overhead. Encryption occurs as data is written to disk, and decryption occurs as data is read from disk. Each of those activities uses CPU. So, the CPU overhead added by TDE is going to be in proportion to your disk activity. If you have a system that is heavy on IO, then there is going to be more CPU overhead.
SQL Server tries to keep data that is referenced repeatedly in memory (the buffer pool). So, if your SQL instance is provisioned with enough memory, a lot of your read queries can access the buffer pool and don’t have to go out to disk. Such queries should not be affected performance-wise by TDE. There may be other read queries however that access older data that hasn’t been read for a while, and these queries would need to retrieve that data from disk and so there would be an overhead from TDE.
Any queries that modify data will need the outcome to be written to disk, so in these cases we will see an overhead. This overhead is likely to come in two parts: first when the transaction is written to the log file before committing and then later as the updated data gets written to the data file as part of a checkpoint operation.
We also have other operations that write or update encrypted data on disk, so we would also expect these to have some overhead. This would include operations such as index rebuild operations.
You can see from this that the overhead will very much depend on how your application interacts with your data. At one extreme, if you have a set of static data that is small enough to be held in memory and is queried regularly, then there should be no overhead. At the other end of the spectrum, if you have an application that writes to the database a lot and reads less often, then the overhead will be higher.
How to Estimate the Performance Impact for Your Server?
Performance impact for your queries is going to be very much on a case-by-case basis, but in reality, it’s generally likely to be quite small. The reason for that is that, as discussed, we’re only going to see extra CPU requirements when our query needs to access the disk. Reading from and writing to disk is itself an activity that takes time, and even with the fastest disks, encryption/decryption is likely to take no longer than the disk access time. The encryption activities can usually be carried out in parallel to the disk activity, so you don’t see much increased time to read or write data. We’ll see an example of that shortly when we look at how you can get an idea of likely overhead on your server.
In terms of estimating overall overhead on your server, you need to understand the level of IO on the server as well as how well encryption will perform on the box.
Let’s work through an exercise to get an idea of the sort of numbers we might be talking about. For this, we’re going to need the database we created in How Long Will it Take to Encrypt Your Existing Data with TDE? that has about 10GB of data in a single table. We’ll also need a database that has the same set of data but without encryption turned on so we can get comparison figures. You can create that using the same scripts – just don’t run the final step of turning encryption on. We’ll call that database TestTDEOff.
We’re first going to run a query that will force SQL Server to read all the data in a table. We’ll repeat that across four scenarios:
- TDE-protected database where the buffer cache is empty, so all data has to be read from disk
- TDE-protected database where all the data for the table is in the buffer cache, so no data has to be read from disk
- Database without TDE where the buffer cache is empty, so all data has to be read from disk
- Database without TDE where all the data for the table is in the buffer cache, so no data has to be read from disk
Here is our query:
SET STATISTICS IO, TIME ON;
WHERE Id = 100000000;
WHERE Id = 100000000;
The DBCC DROPCLEANBUFFERS command flushes all data held in the buffer cache. You won’t want to do this on a live system as it will affect performance, but if you have a server with similar hardware, you can run this to get an idea of how encryption performs.
The test runs the same select statement twice, once with no data loaded into memory and a second time once data has been loaded by the first run. We use the SET STATISTICS command to output information about performance to the messages tab in SSMS. The table we are querying from is a heap and has no indexes defined, so SQL Server has no option but to scan the whole table in order to generate the result set.
Let’s look at an extract of the information outputted by STATISICS IO, TIME to see what we’re getting. This is for the database with TDE enabled:
Table 'SomeData'. Scan count 13, logical reads 1204820, physical reads 0, page server reads 0, read-ahead reads 1203777, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 10046 ms, elapsed time = 5580 ms.
Table 'SomeData'. Scan count 13, logical reads 1204820, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 12407 ms, elapsed time = 1050 ms.
We have two sets of output here: one for the first run where there was no data loaded into memory and one for the second once the data was loaded. The key difference is that in the first we have a large number of “read-ahead” reads, which are where data is read from disk. Read-ahead refers to the fact that they are read in parallel with the processing, rather than all needing to be read before the CPU can get to work. In the second output we only have “logical” reads where data is read from memory.
You get a reasonable amount of variance in the CPU and elapsed times when running such tests, so I executed the query five times against each database, averaged the results, and rounded off to the nearest 100ms. The figures are shown below.
|CPU time (ms)||Elapsed time (ms)|
|TDE with all data read from disk||10,600||5,600|
|No TDE with all data read from disk||7,600||5,700|
|TDE with all data read from memory||12,400||1,100|
|No TDE with all data read from memory||12,200||1,100|
Due to the variance between test runs, we’ll ignore small differences. There are a few key takeaways:
- The elapsed time was about the same with and without TDE.
- The CPU consumption was about the same where data was read from memory.
- When reading from disk, there was a higher CPU consumption when TDE was enabled.
That is about what we would expect; TDE only adds overhead when reading or writing to disk. When we were reading from disk, my disk was being accessed at full speed and the disk access time was the bottleneck, so the decryption required by TDE was easily able to complete while that was occurring.
In terms of what this shows regarding the performance impact on a production server, there are a few ways you can think about the data.
The scariest way of looking at it – and not necessarily the correct one – is to focus on the fact that when reading from disk TDE added about 3 seconds of CPU. That was about a 40% increase. The reason that’s not going to be the impact you see in live though is that (hopefully) most of the data accessed by SQL Server is already going to be sitting in memory and so will not be affected. Still, I might envision that I’m going to see between a zero and 40% impact.
Another way to look at it is to realize that my disk was maxed out during this test, loading about 2GB of data per second (actually slightly less but we’ll stick with round numbers). I can calculate that during every second of execution about an extra half a second of CPU, power was consumed by the TDE decryption. That equates to half a CPU core being busy. My machine has 12 cores so that’s about 4% of physical CPU overhead added. Running the Windows Performance Monitor (perfmon) during the tests, I can see that is about right. If I only had four cores in this box, I’d be using the same half a core, so that would be about 12.5%. It couldn’t go any higher though because my disks are already maxed out. I’d have to be able to physically read data from disk quicker in order to create more TDE overhead. On this box, I can see that decrypting data with TDE costs me about 0.3 seconds of CPU per GB of data.
Where this leaves us is that the best way to get an idea on TDE impact on a particular server is to look at the level of disk access and the number of CPU cores. You can look at how much disk access you have (to the relevant database data and log files only) at peak times and get an idea from there. Hopefully you can see that to add a significant percentage of CPU overhead with TDE, you’re going to need to be reading a huge amount of data from disk, have superfast disks, and not have a lot of CPU power.
If you are able to run through something similar to this exercise to benchmark a production system – and you come up with numbers that worry you – I’m going to suggest that you might have an excessive amount of disk access, and that might be indicative of a problem. In particular, you may want to consider if the server would benefit from having more memory so that data can be held in the buffer cache for longer – and not read from disk so often. Also, are queries being forced to scan whole tables due to a lack of adequate indexes – or because they are poorly written.
This post is part of a comprehensive series on SQL Server Encryption. Subscribe to my blog for updates as new posts are published or you can buy my book through the link in the sidebar to get it all in one go.
And if you’re embarking on an encryption project (or anything else to do with SQL Server) and want some help, I’m available for consulting. Please get in touch to discuss what I can do for you, or check out my services page to see what kind of things I offer.