Recently, Microsoft quietly let us know that TDE (Transparent Data Encryption) will be available in the Standard Edition of SQL Server 2019. If you don’t follow SQL topics on Twitter, it would have been easy to miss that.
Transparent Data Encryption is the ability to have all your data stored encrypted on disk – otherwise known as encryption at rest. That covers data files, log files and backups. TDE achieves this without you having to change anything in your applications or code (hence the transparent part).
This is big news. Given the upsurge in awareness of data protection – largely driven by GDPR and other data regulation – encryption is a hot topic.
Management in most organisations tend to be keen to have encryption at rest – especially if it doesn’t cost much, and is easy to implement. It doesn’t always achieve as much as people might think – but it’s still very much a desirable feature.
For us as SQL Server DBAs I can see this change meaning a couple of things. The first is that there’s likely to be a push to upgrade to SQL Server 2019 much sooner than we might have seen with previous versions. It’s often having a “killer feature” that pushes upgrades. For SQL 2019 – if you run on Standard Edition – I believe this is that killer feature.
The other thing it means is that if you haven’t worked with TDE before, you’re going to want to get knowledgeable about it. It’s very easy to set up, but you’ll want to understand the potential pitfalls, the issues around managing it, how it’s going to affect performance, what exactly it’s protecting you from, and the additional steps (that may not be obvious) you should take to make it more secure.
If you understand these things before you have to think about implementing TDE then you’ll be able to plan a smooth implementation – and be able to advise management from a strong knowledge base.
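To give a flavour of how little code the basic setup involves, here’s a minimal sketch of enabling TDE on a test database. All of the names, passwords and file paths below are purely illustrative – a real implementation needs proper thought about where the certificate lives and how it’s backed up:

USE master;

-- A database master key and certificate in master protect the database encryption key
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHere!1';
CREATE CERTIFICATE TDE_Cert WITH SUBJECT = 'TDE Certificate';

-- Back the certificate up somewhere safe – without it you can't restore encrypted backups
BACKUP CERTIFICATE TDE_Cert TO FILE = 'C:\Backup\TDE_Cert.cer'
WITH PRIVATE KEY (FILE = 'C:\Backup\TDE_Cert.pvk',
ENCRYPTION BY PASSWORD = 'AnotherStrongPassword!1');

-- Create the database encryption key and switch encryption on
USE TestTDE;
CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = AES_256
ENCRYPTION BY SERVER CERTIFICATE TDE_Cert;

ALTER DATABASE TestTDE SET ENCRYPTION ON;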
I tried to cover all these topics on this blog previously, so here are some links. I also have some new stuff coming up about TDE.
If you’ve read about the Accelerated Database Recovery feature in SQL Server 2019 you could be forgiven for thinking it’s just about speeding up database recovery time in case of a server failure.
In fact, enabling it also means that where you have a long-running transaction that fails or is cancelled, the rollback is almost instantaneous. This is great news for DBAs who sometimes have to kill a long-running blocking transaction but worry that it may take a long time to roll back – continuing to block all that time.
This is achieved by the fact that Accelerated Database Recovery maintains a version store in the database, and where a row is updated, the old version of the row is kept until after the transaction is complete. That makes it quick and easy to revert to the old version in case of failure.
Let’s look at a quick example.
I have a table with about 10 million rows – all containing the same text value:
CREATE DATABASE TestADR;
USE TestADR;
CREATE TABLE dbo.TestADR(Id int IDENTITY, SomeText varchar(50));
INSERT INTO dbo.TestADR (SomeText)
SELECT TOP 10000000 'FrangipanDeluxe'
FROM sys.objects a, sys.objects b, sys.objects c, sys.objects d;
I update all the rows in the table to a new value:
UPDATE dbo.TestADR SET SomeText = 'FrangipanDeluxA';
This took about a minute.
I then execute a query to change them back and cancel the query in SSMS after about 30 seconds.
UPDATE dbo.TestADR SET SomeText = 'FrangipanDeluxe';
It took about 30 seconds more to cancel – which is SQL rolling back the changes.
Then I enabled Accelerated Database Recovery – you do this at the database level:
ALTER DATABASE TestADR
SET ACCELERATED_DATABASE_RECOVERY = ON;
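If you want to check which of your databases have it switched on, the setting is exposed in sys.databases on SQL 2019 and later:

SELECT name, is_accelerated_database_recovery_on
FROM sys.databases;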
Now I re-run that last update, again cancelling after 30 seconds.
This time the cancel was instantaneous – it took SQL no noticeable amount of time to roll back the changes.
This is great, but we’ll probably want to be careful before we enable it on all our databases – when we get them onto SQL 2019 anyway. There will be additional overhead in maintaining the version store, which could have an impact both on the time taken to complete write queries and on storage requirements.
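If you do enable it, it’s worth keeping an eye on how big the persistent version store gets. SQL 2019 exposes this through a DMV – something like the following should show the size per database, though I’m going from memory on the column names so check them against your instance:

-- Size of the persistent version store used by ADR, per database
SELECT database_id, persistent_version_store_size_kb
FROM sys.dm_tran_persistent_version_store_stats;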
Still, it seems like a good feature – something to look forward to playing with more.
This is something I touched on back in 2017 a little after the Live Query Statistics feature was introduced with SQL 2016, but I was using the functionality this morning and felt like it was worth a reminder.
You can use Live Query Stats to check on the progress of an executing query – and you can do it through the GUI in SSMS.
I created a long-running query for a demo, and after 15 minutes I was still waiting for it to finish. That was a bit longer than I’d intended. Should I kill it, or just wait a few more minutes for it to complete?
You can check this quickly via the Activity Monitor:
Find the query you’re interested in within the processes list:
Right-click and select “Show Live Execution Plan”. That will show you something like this:
I can see from this that my query is about 83% complete, so maybe I’ll just wait a little longer. Note that this is a live view, so the numbers keep updating. If I want I can watch the progress.
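If you’d rather check on progress with T-SQL than through the GUI, the same counts are exposed through the sys.dm_exec_query_profiles DMV – a quick sketch, where the session_id is just whatever you spotted in Activity Monitor:

-- Rows processed vs. estimated for each operator in the running query
SELECT session_id, physical_operator_name, row_count, estimate_row_count
FROM sys.dm_exec_query_profiles
WHERE session_id = 52; -- replace with the session_id of the query you're watching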
All of this is against a SQL 2019 instance, where it’s the out-of-the-box behaviour. Before SQL 2019 you had to enable trace flag 7412 if you wanted this to work.
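On those earlier versions it’s a one-liner to switch on – this enables the flag globally, so scope it as appropriate for your environment:

-- Enable the lightweight query profiling infrastructure (pre-SQL 2019)
DBCC TRACEON (7412, -1);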
I think it’s appropriate to give a shout-out to Microsoft at this point, because over the last few releases they’ve given us some of the items that are top of my list.
Recommending and setting MAXDOP during setup (coming with SQL 2019) will hopefully mean I no longer have to have arguments about why the out-of-the-box setting isn’t appropriate.
The same with setting max memory in the setup (also with SQL 2019).
A more verbose error where string or binary data might be truncated – we got that in SQL 2017.
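If you haven’t seen the new message yet, a trivial repro is below. On SQL 2017 you need a reasonably recent CU and (if memory serves) trace flag 460 to switch the friendlier message on; from SQL 2019 it’s the default:

-- On SQL 2017 the improved message sits behind trace flag 460; on SQL 2019 it's on by default
DBCC TRACEON (460);

CREATE TABLE dbo.TruncationTest (SomeText varchar(5));
INSERT INTO dbo.TruncationTest (SomeText) VALUES ('This value is too long');
-- The error now names the table, the column and the offending value,
-- rather than just telling you string or binary data would be truncated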
It’s the little things like these that make me happy – and make my job easier.
A few other little things I’d like – though I accept that something small to describe isn’t always small in its execution…
A change to the default cost threshold for parallelism – be it 20, be it 30, be it 50. I’d prefer any of those to 5.
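In the meantime it’s easy enough to change yourself – a quick sketch via sp_configure, with 50 used purely as an example value:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;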
I think it would also be great to have a cardinality optimizer hint, e.g. OPTIMIZE FOR 5 ROWS. Oracle has this, and it’s not good having to feel jealous of those working on the dark side 😉 You can do the equivalent, but it’s convoluted and it’s not clear what’s going on when the uninitiated see the code.
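The kind of workaround I mean is the TOP-with-a-variable trick. Here’s a sketch against AdventureWorks – illustrative rather than something to copy blindly:

-- Make the optimizer plan as though only 5 rows are wanted, while returning everything at runtime
DECLARE @RowGoal bigint = 9223372036854775807; -- max bigint, so TOP doesn't actually limit anything

SELECT TOP (@RowGoal) p.BusinessEntityID, p.FirstName, p.LastName
FROM Person.Person p
OPTION (OPTIMIZE FOR (@RowGoal = 5));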
There is one big thing I’d like – but it’s never going to happen. Get rid of Enterprise Edition – or rather, make Enterprise Edition the standard. Enterprise is comparatively so expensive, it’s rare I’m going to recommend it. It’s interesting to see that in Azure SQLDB we only have one edition to work with – I’d love to see that in the box product. I understand that change would be a massive revenue loss so can’t see it happening.
If not that though, if we could just have at-rest encryption (i.e. TDE) in the Standard Edition that would make me very happy. In these days of security and privacy consciousness it seems that should be a core functionality.
UPDATE: TDE is going to be available on Standard Edition from SQL Server 2019. I get my wish!
Finally, I’d just like to upvote Brent’s idea that it would be great to be able to restore just a single table.
That all said, I’d like to go back to my first point. I love what MS is doing with SQL Server, and the continual improvements that are being made. I particularly love that we sometimes get them for free in service packs and cumulative updates – without having to upgrade to a new version.
This wasn’t new information when I wrote it, but bad performance due to the use of table variables remained such a common anti-pattern that I thought it was worth stressing again.
So, when I saw the Table Variable Deferred Compilation feature announced for SQL 2019 I thought I’d better investigate and update what I’m telling people.
TL;DR It looks like table variables are no longer a problem.
I’m going to re-use my examples from the previous post (as well as some of the images captured). One thing to note though is that I’m running these tests on a different, much less performant box than I did last time, so execution times will be longer.
My first test last time was just to show how cardinality estimation was better for a temp table vs. a table variable. This time I’m just going to compare a table variable running under compatibility mode 140 (SQL 2017) with one running under compatibility mode 150 (SQL 2019).
I set the compatibility mode as follows:
ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL = 140;
Then I run the first query:
DECLARE @a TABLE(i INT);
--Create a million rows
WITH Nums(i) AS
(
SELECT 1
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n(i)
)
INSERT INTO @a(i)
SELECT n1.i
FROM Nums n1
CROSS JOIN Nums n2
CROSS JOIN Nums n3
CROSS JOIN Nums n4
CROSS JOIN Nums n5
CROSS JOIN Nums n6;
SELECT i FROM @a;
The execution plan for that final select looks like this:
And when I hover over the Table Scan operator I can see the properties:
You can see that the estimated number of rows is calculated as 1 – but the actual number of rows was 1 million. This is (was) the problem with table variables.
Now I’ll change the compatibility level:
ALTER DATABASE AdventureWorks2012 SET COMPATIBILITY_LEVEL = 150;
And run the query again.
I get the same execution plan, so let’s just look at the properties of the Table Scan operator this time:
Here you can see that the estimation is correct, 1 million. This is looking good so far for the future of table variables.
In the last post I then looked at a case where the bad estimation caused a poor selection of execution plan so let’s repeat that test. First, I set my compatibility level back to 140, then I run the following (if you want to repeat, you need the AdventureWorks2012 database):
DECLARE @BusinessEntityId TABLE (BusinessEntityID INT NOT NULL PRIMARY KEY CLUSTERED);
INSERT INTO @BusinessEntityId
SELECT BusinessEntityID FROM Person.Person;
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT COUNT(*)
FROM @BusinessEntityId b
INNER JOIN Person.Person p
ON b.BusinessEntityID = p.BusinessEntityID;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
Let’s look at the execution plan for the final query:
Now let’s look at the properties for the clustered index scan:
We can see that the estimated number of rows is just 1, but that the actual number of rows is 19,972.
If we then look at the properties for the Index seek operator:
We can see that the estimated number of executions was 1, but the actual number of executions was 19,972. That’s 19,972 seeks into a table with 19,972 rows. As previously discussed – there’s got to be a better way. A Nested Loops join is generally best when there’s a small number of rows from the top table and a larger number of rows from the bottom. In this case both sides have 19,972 rows, so it’s not optimal.
Here’s the output of the statistics commands for that query:
Now we’ll change the compatibility level to 150 and try again. Here’s the execution plan we get this time:
You can see the join operator is now a Hash Match, which is generally a much better choice when both inputs are of a similar, larger size. Let’s look at the properties of the Clustered Index Scan in this new plan:
This time the estimated and actual values match, which has enabled SQL to make the better choice of plan. We can see that if we look at the statistics output for this one:
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 25220 ms.
You can see the CPU and reads are massively reduced.
So, it seems table variables aren’t so bad anymore. At least they won’t be once you are running SQL Server 2019.
I should probably add some caveats at this point. This has been a limited test, so you should make sure you test performance against your own workloads if you start using table variables instead of temp tables. There will also be deferred compilations when you use table variables in stored procedures. I haven’t dug into exactly how that will work, but I’d guess it will be similar to the way temp tables behave in existing versions of SQL Server.
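One last practical note – the new behaviour is tied to database compatibility level 150, so if you don’t see it kicking in it’s worth checking where your databases currently sit:

SELECT name, compatibility_level
FROM sys.databases;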