How Secure is Transparent Data Encryption (TDE) – and How to Prevent Hacking

TDE is commonly described as “at-rest” encryption, i.e. it protects your data wherever it is stored on disk. This includes the database files, any backups taken (including Log and Differential), and any data that may get temporarily persisted to TempDB (when you use TDE to encrypt any database on an instance TempDB will get automatically encrypted also).

TDE does not however give any additional protection against those accessing data by querying the database. If you have access to the database then TDE transparently allows that to continue without any change in the functionality.
Understanding that, if we are implementing TDE we can only expect it to fully protect us from people who have access to the file system but do not have access to the data stored through any other avenue – for instance by querying SQL Server.

In most scenarios we probably don’t plan on there being such a person, i.e. we don’t grant the world read access to our database servers and say “go on, feel free, have a play”. In most production systems access is tightly tied down, and if you work in an environment where this is not the case then you probably have things you should address before you start looking at implementing encryption.

In reality though, mistakes or oversights happen, or sometimes malicious parties are able to gain access they should not have. What TDE offers us is, like many other forms of encryption, an additional layer or protection. The first line of defence gets breached, but there is still the next one to get past.

There is also the scenario of database backups, where these might be stored off the database server and even offsite. TDE gives us some assurance that in these scenarios the data is protected irrespective of the access restrictions in force in those other places. We’ve all read of breaches which were enabled by a database backup being stored on a location that was easier to access than it should have been.

Of course this protection of the backups could simply be achieved by using backup encryption – which is a separate feature of SQL Server and does not have the performance overhead associated with TDE.

So, if we are specifically implementing TDE rather than just backups it might be that we are specifically trying to protect ourselves from someone gaining direct access to the database files. That could be seen as quite an edge case – but that doesn’t mean it isn’t worth protecting against.

One thing to remember with TDE is that it is not (on its own) going to protect you against someone who has Admin right either over the box where SQL Server is installed, or over SQL itself. A local Admin can add themselves as an administrator on SQL – though they will need to restart the instance to do so. And an administrator on SQL can do pretty much whatever they like, including exporting a copy of the certificate and key used by TDE so that they can use that to read stolen data.

Still, there is some level of protection gained from TDE and if you combine it with tight access controls, and auditing, then it can help you reduce the scenarios where your data is at risk.

Hacking TDE

There is a great post and accompanying video out there by Simon McAuliffe that shows a method that can be used to break TDE:

Simon’s not a big fan of TDE and he makes a good argument, the reality is that a lot of the time people are using it to tick a box “yes, all our data in encrypted at rest” and no-one questions the particular scenarios where it is – and is not – offering protection.

While I agree with Simon’s points, I don’t feel as strongly as he does. You just need to be making an informed decision to implement TDE based on a full understanding of what it can and cannot protect you from. It’s not a magic bullet, but it’s not totally useless either.

Nonetheless the method Simon outlines for breaking TDE is concerning as it only requires basic level read access over the server to be able to grab all the objects you need to crack a TDE protected database. Sure, you need to have a certain knowledge base to be able to do that, but protection that is based on an attacker not having the right skillset, is not of the greatest value.

I won’t go into Simon’s method in full detail – you can check out his post for that. But I’ll give you enough of a summary so you can hopefully understand the principles involved.

What he details is an inevitable vulnerability that he specifies would not be unique to TDE – but common to any similar technology.

It’s back to the “T” of TDE again, “Transparent”. The concept of TDE is that it is fully self-contained and managed in the background for you by SQL Server. It doesn’t require you to supply additional passwords or keys when querying data, and it doesn’t need to access any objects stored outside of the SQL instance.
Let’s look again at the diagram with the encryption key hierarchy:


Starting at the bottom, each key is protected by the one above it. But what happens when we get to the top? There is nothing left to protect the uppermost DPAPI keys, so they must be stored unencrypted somewhere on the system.

You simply cannot get around that, SQL needs to be able to – all on its own – encrypt and decrypt data. Even if we say we’ll encrypt the DPAPI keys, what will we encrypt them with and where will we store that object? You can’t get away from the fact that at some point we need to store an unencrypted key, or password, or whatever, somewhere.

As a consequence, it’s remarkable easy to pick these root DPAPI keys up, though you need a few basic hacker skills/tools to be able to do anything with them.

Once you have the DPAPI keys, you can use those tools – in combination with a copy of the database files (or backup) from the Master database to obtain the SMK and thus break encryption all the way back down the hierarchy.

So what has just happened? A skilled and motivated hacker with mere read permissions to your server was able to steal your files and decrypt your data.

Just the scenario you implemented TDE to prevent.

They keys are held in the following directory on most Windows systems:

And by default, as long as you have read access to the file system you can read them.

Preventing this Hacking of TDE

Fortunately there is a basic action you can take to mitigate this. Strangely, this step isn’t mentioned (as far as I know) in any of the other documentation about setting up TDE.

The solution is that when you are setting up TDE you should take an additional step. Very simply, you need to secure this directory so that only Local Administrators to the server, the LOCAL SYSTEM account, and the SQL Server service account can read anything within it.

Once that’s done TDE should be offering you what you expect, i.e. casual read access to the server will not allow people to read your data. Only administrators would be able to gain access to the data, and as we’ve discussed already – Admins can access your TDE protected data anyway.

The only downside of doing this is that you are restricting access to the DPAPI (Data Protection API) from any other accounts. If you are running a pure SQL server then this should not be a problem – i.e. the server is for an installation of SQL and nothing else. If you also have application services running on the same box though, and these services wish to use encryption, then you may need to extend the permissions assigned to that directory. Though a better choice would be to move these services off the box.

SQL Puzzle 5: Prime Magic

Quite a few of you have read or attempted the previous puzzle SQL Puzzle 1: Magic Squares

As a quick reminder, when you have a 3 x 3 grid, a magic square is one where each row, column or diagonal add up to the same value. e.g.


All the rows, all the columns and both diagonals each add up to 15.

In the previous puzzle I asked you to write SQL to find all the possible 3 x 3 magic squares. After originally coming across that idea I discovered all sorts of variants on the magic square theme.

This time, I’ll ask you to find write SQL to find just one 3 x 3 magic square, but in this case one composed only of prime numbers? They don’t have to be consecutive primes (though they do all have to be different), and to make things easier you can treat the number 1 as a prime if you want.

Have fun!

Previous Puzzles:
SQL Puzzle 1: Magic Squares
SQL Puzzle 2: Eight Queens
SQL Puzzle 3: Knights and Queens
SQL Puzzle 4: The Beale Papers

Understanding Keys and Certificates with Transparent Data Encryption (TDE)

I’ve been taking a bit of a deep dive into understanding Transparent Data Encryption (TDE). As part of that I’ve been reading a lot of blog posts, stack overflow answers and technical documentation to try and deepen my knowledge.

Within that I’ve found a lot of contradiction I’ve needed to overcome. In particular this has been around what objects you need to recover an encrypted database to another SQL Server – be that when you’re doing a straight restore, working with log shipping, or using Availability Groups.

Most of the solutions offered work, but many describe additional steps that are not necessary. I feel that part of the problem is that people are misunderstanding the basics about encryption keys in SQL Server, so I thought it would be worthwhile going over that in a bit of detail before digging deeper into TDE in general.

I think it’s important to understand this stuff clearly, because then you have a clear view of when you’re protected and when you are vulnerable. If we’re engaging in encryption then we clearly have a desire for security – to be sure of that we have to be clear in our understanding.

Keys in SQL usually have three components (and this is the same for the Column Encryption Keys in Always Encrypted that I spoke about previously):
Understanding Keys and Certificates with Always Encrypted

The Key itself – Usually can be thought of a number expressed in binary format. Long and random enough to make it difficult to guess even by brute force attempts.

Another object that’s used to protect the key – This object might be another key, it might be a certificate, or it might just be a password. This object is used the encrypt the key.

The encrypted value of the key – Formed from the original value of the key, encrypted by the protecting object.

In SQL we rarely (maybe even never) see the actual value of the key. We have the encrypted version and we usually know what object was used to encrypt it. That second object may even be another key that is itself encrypted by a third object.

When it comes down to it though, the actual thing that is used to encrypt or decrypt data is the Key itself, not the encrypted value, and not the hierarchy of objects that may have been used to protect it.

So, all I really need to read your data, is your key.

Let’s look at that in the context of TDE. Here’s a standard diagram from books online that shows the hierarchy of encryption for TDE. There are other ways of working with TDE but this is the standard:


So, right down at the bottom of the diagram is the Database Encryption Key (DEK). That is what is used to encrypt/decrypt the data in the database. It sits in the database itself. So when you backup the database the DEK is held in the backup.

BUT – the DEK is itself encrypted by the certificate that sits in the master database, so even if someone has your backup they can’t access the key – and nor can any system they try and restore your backup to. So they also can’t access your data.

To be able to decrypt the DEK, the certificate is required. In fact, what is really required is the private key associated with the certificate as that’s what’s used for encrypting stuff. The private key however is itself encrypted by the Database Master Key (DMK) that sits in the master database.

So you’ll be forgiven at this stage for thinking that in order to read our backup of the database, we need the backup (containing the DEK), the Certificate (include the private key) and the DMK.

When you realise that the DMK is itself encrypted by the Service Master Key(SMK), and that the SMK is also encrypted then you might think you need to include those too – and whatever encrypted the SMK.

Where will it end!?

In reality we just go back to our certificate and its associated private key.
Let’s say I have a certificate called MyTDECert. I can (and must) backup this up outside of the database. If this is lost, then so is my data. The command for backing it up looks something like this:

TO FILE = 'D:\Temp\MattTest\MyTDECert'  
    FILE = 'D:\Temp\MattTest\SQLPrivateKeyFile',  
    ENCRYPTION BY PASSWORD = 'C0rrecth0rserbatterystab1e'  

When I back this up I specify a password to encrypt the Private Key. Remember that the Private Key was already encrypted by the DMK? Well, this backup certificate command doesn’t just encrypt it a second time – what would be the point of that? No, the reason I need to supply a password is that the command retrieves the unencrypted version of the private key, re-encrypts it with the password INSTEAD and then that is what gets save to disk.

Remember that at the point I run the command SQL has access to all the objects, all the way up the chain, that are used for the encryption. So it has no problem getting the actual value for the Private Key.

Now, when I restore that certificate to – let’s say another instance of SQL Server)- the command looks like this:

FROM FILE = 'D:\Temp\MattTest\MyTDECert'  
    FILE = 'D:\Temp\MattTest\SQLPrivateKeyFile',  
    DECRYPTION BY PASSWORD = 'C0rrecth0rserbatterystab1e' 

Considering what the BACKUP command did, you can intuit that the above command will do the opposite. It first of all decrypts the Private Key using the password supplied. Then it encrypts it again using the DMK for the local instance before saving it in the master database locally.

Remember it is the Actual Values of Keys that get used for encryption – not their encrypted value. So the private key for this certificate is exactly the same as it was where we backed it up from – though the encrypted value will be different as it has used a different DMK.

So if I now go to restore a copy of a database whose DEK has been encrypted with this certificate onto the new instance I can do so without any problem.

Many articles will tell you that you also need to migrate the DMK from your old instance, and some will say that you also need the SMK. This is simply not correct – I mean, it will work if you do that (in the right order) but it isn’t necessary.

And in some scenarios it may simply not be possible. Imagine that the instance you are migrating to is already using encryption based on its current SMK and DMK, if you replace those with the ones from the instance you are migrating from then you are going to break existing stuff.

So all you need to restore a database protected by TDE is the database backup, the certificate/private key, and the password specified when the certificate and key were backup up.

Equally that’s all someone else needs too, so make sure those things are protected.

In my next post we’ll look at a possible way of hacking TDE and the additional steps you need to protect yourself to make sure TDE is giving you the level of protection you expect. When that post is live you should be able to see a link to it in the comments.

Overview of Encryption Tools in SQL Server

More and more people are considering some level of encryption against their data stored in SQL Server. In many cases it might be considered that other measures such as firewalls, well defined access permissions and application code free of security flaws, already offer sufficient protection.

Anyone who follows the tech news though will know that attacks and data breaches are common. Encryption often provides the second line of defence, i.e. even if someone malicious gains access to your data, they may not be able to read what they find there.

Over the last 10 years, the number of features available in SQL Server for working with encryption has grown. It can be confusing to understand what the differences are, which you should implement and what exactly each feature protects you from. In most cases these are complementary technologies and if you are getting serious about encryption you may choose to implement more than one. In the latest versions of the SQL Upgrade Advisor you will be recommended by default to consider Transparent Data Encryption and Always Encrypted, and it is (or should be) standard practice to be using TLS.

Often it seems to be the case that people implement some form of encryption to “tick a box”. If you are really serious about protecting your data then you really need to understand what each feature does – and does not – protect you against.

One thing that may influence your decision on what to use is what is available in the versions and editions of SQL Server that you have in production. Here’s a quick comparison, I’ve missed out versions where no new features were added and for 2016 I’ve specified SP1 as a lot of features changed from Enterprise only to being available in Standard – so there’s no good reason for not being on SP1 if you’re using 2016:


We’ll quickly run through the various technologies listed above with brief detail on each.

Column Encryption (2005)

Uses the Encryption Hierarchy and built-in Cryptographic Functions to allow you to encrypt values and store them in the database. Ability to read the encrypted data is based on permission or access to the objects used to perform the encryption e.g. a pass phrase, a certificate or an encryption key.

It can be easy to misunderstand what this feature is, it is not that you configure a column as being encrypted, but rather that the Cryptographic Functions combined with the Encryption Hierarchy in SQL Server allow you to encrypt specific values, which can then be stored in the database.

TDE (2008) Enterprise Only

TDE (Transparent Data Encryption) is configured at the database level and is defined as “at-rest” encryption. This means that the data and log files for your database (as well as backups) are encrypted on disk. One of the key advantages to TDE is that it doesn’t require any code change to implement.

TDE doesn’t protect you against anyone who is able to read data directly in the database, and an administrator on the SQL Server instance or the server hosting it can fully circumvent it. It simply protects you against parties who may gain access to the files.

Without the certificate and key used for TDE they would not be able to restore a backup to another server, and they would not be able to directly read data in the database files – which otherwise are in a relatively readable format for someone sufficiently motivated.

One thing to note with TDE is that it is only available in the Enterprise Editions of SQL Server.

Those of you who follow the major SQL Blogs may have come across this article which points out a seemingly big flaw in TDE that allows someone with minimal privileges against your server (read is sufficient) and the right skillset, to be able to extract your keys and read the data:

It seems that you can mitigate this by making sure access is restricted to the directories holding the relevant keys:

If only administrators (and the SQL Server service account) can access this directory then you should be safe.

TDE does add extra processing overhead to SQL. 3-10% is often quoted but that will vary with your workload so if this is a concern you should test – some people consider that the overhead is not worth the benefits.

Backup Encryption (2014)

Backup Encryption is what it sounds like, just your database backups are encrypted. All types of backup are supported – Full, Differential and Log.
This isn’t a setting you turn on for your database – you have to specify when you make a backup that you want it protected. As such this doesn’t protect you against someone making an ad-hoc backup and storing it on a USB device which they then leave on a train – or any other similar human error.

One nice thing is that you don’t have to do anything special when restoring the database back to its original home. Backup encryption (generally) uses a certificate or key stored in the master database so as long as you haven’t lost that, the encryption part of a restore process is seamless. If you need to restore elsewhere then you’ll need a copy of the certificate or key.

Backup encryption will suffer from the same limitations as TDE (except that it will only add CPU overhead when taking backups), however if your intent is to make sure backups stored off the database server are protected then this should be sufficient.

Always Encrypted (2016)

Always Encrypted (AE) is implemented at the column level. Data is stored encrypted on the disk, in memory and when being passed to a client application. Encryption is based on the combination of a key and certificate, where the certificate is usually stored on a client machine and not stored on the database server. This means that access (even full sysadmin access) to the database server on its own is not sufficient to be able to read the data.

Implemented correctly AE can be very secure, however it also has a number of logical limitations that mean it can be challenging to interact with the encrypted data. For instance you can’t use LIKE comparisons against encrypted columns.

How big an impact this is for you depends on how your application is coded and the sort of columns you want to encrypt. It could be straightforward and require no code change – or it could entail major rework.

Here are my previous posts on AE:
Understanding Keys and Certificates with Always Encrypted
Working with Data in Always Encrypted

Dynamic Data Masking (2016)

Dynamic Data Making is not actually encryption but is another way that people may choose to protect data from prying eyes.

This provides a method of hiding data from non-privileged users without having to change any code.

Image storing a credit card number. We might want to remember a customer’s card details so that it is easy for them to make payments in the future. Equally though we don’t want people to be able to easily view it. Often we’ll see that just the last 4 digits are displayed so that we have enough to verify it is the correct card. Dynamic Data Masking provides a way of doing this. We can define a mask, so that for most users who have permission to view the data the number:

Would only appear to them as:

We can then also define specific roles that can access the full number, for instance one assigned to a service account used for processing payments.

That sounds great, but in reality the protection is not that great. If I have read access to the database, but am in a non-privileged role then it would still be possible for me to get around the masking with a bit of querying. This is due to the requirement that existing queries should continue working, so the SQL engine can see the underlying data even if I can’t and will respond to me attempting comparisons against the data.

Transport Level Security (TLS)

TLS is a protocol used to protect data as it travels across a network. It is fairly equivalent to SSL (Secure Sockets Layer) which is used to protect data between a web server and a browser. Without TLS, data from SQL is sent unprotected across a network and can be intercepted by what is known as a Man-in-the-Middle attack.

Such attacks function by impersonating the parties involved. Imagine if we have the SQL Server instance and a client application. A communication starts from the client to the server, the attacker impersonates the server and intercepts the data or request sent from the client, it then can alter the message before passing it on to the server – which believes it is receiving it directly from the client. Then it will intercept any data being send back, either just to capture it, or to alter it, before transmitting that back to the client which believes it is in direct and private communication with the server.
TLS creates an encrypted connection and encrypts the data sent across the network and so the information passed back and forth cannot be read by any party other than the sender or receiver.

TLS also allows both the sender and receiver to verify each other so the Man-in-the-Middle attack is doubly foiled.

Hashing and Salting

Hashing is a method of taking a value (usually a string such as a password) and transforming it via a Hashing function to a new value that in theory cannot be reverse engineered to find the original value.

When someone enters a password into the system that value can be put through the same hashing function and the resulting hashed value be checked against the value stored in the database to verify that they match.

In practice is would be possible for someone to maintain a list of what the hashed values are for a particular hashing function and thus be able to recognise where common values have been used.

This where salting comes in, a second value known as the salt is generated and combined with the value to be hashed. Then the salt is stored unencrypted alongside the resulting hashed value. When we then want to verify (for instance) a password, we simply combine the entered value with the salt before hashing and comparing. If the salt is unique for each value we hash there’s no practical way of maintaining a dictionary of hashed values.

SQL has a series of hashing functions but HASHBYTES is the one most commonly used for this process.

That’s the set of tools available. The below table shows a quick comparison of what each protects you from and any disadvantages:


No solution you pick is going to be secure and safe just by implementing one tool or another. Of key importance is process. Encryption generally involves keys of one form or another and these must be protected. Your house may be secure if you have good locks, but not if you leave the keys outside on the doorstep.

Equally you need to think about what you are attempting to achieve by implementing a particular tool and make sure that the way your organisational roles and processes are structured supports this. For instance Always Encrypted is often touted as even being able to protect against a rogue DBA. The DBA may have full access and be able to do whatever they like to the SQL instance, but for AE to read protected data you also need a certificate that is stored on application servers. If you want this level of protection then you need to make sure that you have sufficient role separation such that the DBA can’t access the certificate, and those who are admins on the application server can’t access the database. Or at least not without triggering auditing.

Finally, remember that any system is only as strong as its weakest link. There’s no point encrypting your data if the application accessing it is so full of holes it will leak it all out anyway, and it’s often said that the weakest part of any system in the human. The point of having data is that we want to be able to access it, so there must always be a point where the data is viewed/or interacted with in an unencrypted form.

At the end of the day the only way of absolutely ensuring there is no way for someone to access data they shouldn’t – is to have no data in the first place.

But in practice, you can at least try to make it hard for them.

Avoiding confusion with SQLCMD variables

SQLCMD variables can be a useful way of having changeable parameters for your SQL scripts, allowing you to specify the value from a command line, but also to control things you couldn’t manage through a SQL variable.

They can also be a little confusing the first time you see them.

Here’s an example:

SELECT * FROM dbo.Test WHERE TextValue='$(MySQLCMDVar)';

If you just run this then you might get the error:
Msg 102, Level 15, State 1, Line 1
Incorrect syntax near ‘:’.

The important thing is to understand that when you see a colon at the beginning of a statement in your T-SQL script, then this is a SQLCMD statement, and to parse these in SSMS you need to have SQLCMD mode enabled. You can do this from the “Query” menu:
Now I execute my script again, and it runs fine – though it returns no results.
Let’s look at another quick example:

:SETVAR TableName "Test"
SELECT * FROM dbo.$(TableName) WHERE TextValue='Wibble';

Here I’ve used the SQLCMD variable to define the name of the table in my query. This is the power of SQLCMD variables, you couldn’t do this with a normal SQL variable – the only way to do that would be to use dynamic SQL instead.

Let’s understand a little deeper what they are and how they work.

First of all, they don’t have a datatype, they are actually always text values. You can enclose them in double quotes or not –but I usually prefer to – although if you have spaces or other special characters then quotes are required.

You define them as follows:
:SETVAR SQLCMDVariableName “Whatever value you want”

And where you want to refer to them in your script you use a dollar sign and the variable name in brackets:

Rather than being a conventional form of variables, SQLCMD variables are actually tags for text replacement. It’s handy to understand this as it leads to some strange behaviours. What happens when you run a query with SQLCMD enabled, is that first of all the script is parsed and any SQLCMD statements are processed.

In the case of SQLCMD variables, first all the :SETVAR statements in the script are processed and each variable is assigned the correct value. Then all the references to each variable in the script are replaced with the literal value, it is then this modified version of your script (which you never get to see) which gets executed.

That’s why something like this doesn’t work:

:SETVAR TextVal "Hello There"
DECLARE @TextVal varchar(30);

SET @TextVal = $(TextVal);
SELECT @TextVal;

When I run this I get an error:
Msg 102, Level 15, State 1, Line 11
Incorrect syntax near ‘There’.

What’s going on? Both my SQL and SQLCMD variables are text aren’t they? Why doesn’t that work?

The answer lies in what I said before, the reference to a SQLCMD variable is just a tag to be replaced with the value defined elsewhere in the script. So in the above example what actually gets executed is:

DECLARE @TextVal varchar(30);

SET @TextVal = Hello There;
SELECT @TextVal;

Which isn’t valid SQL. What I should have done in my original SQL is to wrap the reference to the SQLCMD variable in single quotes:

:SETVAR TextVal "Hello There"
DECLARE @TextVal varchar(30);

SET @TextVal = '$(TextVal)';
SELECT @TextVal;

Now it works:


I mentioned you could pass SQLCMD variables from the command line – this can be handy if you’re executing scripts and you want to (for instance) specify the database name from outside. Watch out though, if you also assign a value in your script then it is the last value assigned that gets used.

I had a developer come to me complaining that SQL wasn’t picking up the SQLCMD variable he was passing through the command line, the answer was that he had another value assigned in the script. He thought that was dumb, so I asked the question “What would you expect to happen if you were writing C# code and assigned a value to a variable, and then assigned a new one to it – which would you expect it to hold – the first or the second?”

That doesn’t mean however that assignment of values to SQLCMD variables doesn’t display some counterintuitive behaviour. Look at the following query:

:SETVAR TextVal "Hello There"
SELECT '$(TextVal)';

:SETVAR TextVal "GoodBye"
SELECT '$(TextVal)';

So I set a value in my SQLCMD variable, output it with a select statement, then I change the value and output it again. Let’s look at the output:
What the…?! I’ve encountered issues before where I’ve tried to change the value of a variable and – having done something wrong – the value hasn’t updated. But here it looks like the first query is looking into the future!

This goes back to what I said earlier, first the :SETVAR statements are processed and the variable evaluated, only then are the references replaced in the script. This means you can’t have changing values for your SQLCMD variable throughout the execution of your script.

You can even see the same behaviour if you do this:

:SETVAR TextVal "Hello There"
SELECT '$(TextVal)';

:SETVAR TextVal "GoodBye"
SELECT '$(TextVal)';

:SETVAR TextVal "See you Later!"

I’ve said you can’t change the value of your SQLCMD variable through your script, technically it’s more accurate to say you can’t have different values within the same batch. So if you separate your script into separate batches using the GO statement, then you get a different result:

:SETVAR TextVal "Hello There"
SELECT '$(TextVal)';

:SETVAR TextVal "GoodBye"
SELECT '$(TextVal)';

You might therefore think that the SQLCMD variable is only valid in the context of the batch in which is defined. So if I remove the :SETVAR in the second batch my script will fail:

:SETVAR TextVal "Hello There"
SELECT '$(TextVal)';

SELECT '$(TextVal)';

Not so:
We see from this that a SQLCMD variable is not limited to the scope of a single batch – even though it gets re-evaluated on a batch by batch basis.

I’ll finish with something you might have attempted to do at some point. How about if I conditionally try to change a SQLCMD variable:

:SETVAR TextVal "Hello There"
IF 1=0
   PRINT 'Whoah!'
   :SETVAR TextVal "Maths is Broken"

SELECT '$(TextVal)';

If I’ve not confused you too much with the above examples you can probably figure out what the output will be. That’s right:

This has just reminded me of a quote from “The Hitchhikers Guide to the Galaxy“ about the babel fish, particularly the last line:
“Now it is such a bizarrely improbable coincidence that anything so mind-bogglingly useful could have evolved purely by chance that some thinkers have chosen to see it as the final and clinching proof of the non-existence of God.
The argument goes something like this: “I refuse to prove that I exist,'” says God, “for proof denies faith, and without faith I am nothing.”

“But,” says Man, “The Babel fish is a dead giveaway, isn’t it? It could not have evolved by chance. It proves you exist, and so therefore, by your own arguments, you don’t. QED.”

“Oh dear,” says God, “I hadn’t thought of that,” and promptly vanishes in a puff of logic.

“Oh, that was easy,” says Man, and for an encore goes on to prove that black is white and gets himself killed on the next zebra crossing.”


The main take home from all this should be to avoid trying to use a SQLCMD variable like a normal one. Assign it once, at the top of your script or in a command line – then leave it alone!

How does Query Store capture cross database queries?

When I was writing my post  Capture the most expensive queries across your SQL Server using Query Store a question crossed my mind:

Query Store is a configuration that is enabled per database, and the plans and stats for queries executed in that database are stored in the database itself. So what does query store do when a query spans more than one database?

Does it record the execution stats in all databases involved or does it store them in one based on some criteria (e.g. the one where the most work occurs)? Or does it somehow proportion them out between the databases?

This was relevant as it crossed my mind that if it records them in multiple database then my query in the above post could be double counting.

Time to test and find out.

I created three databases, Fred, Bert and Ernie. Then a table called Fred in database Fred, and a table called Bert in database Bert. In table Fred I created a bunch of records, then in table Bert I created a much bigger bunch of records:



USE Fred;

INSERT INTO dbo.Fred(FredText)
FROM sys.objects a, sys.objects b;

USE Bert;

INSERT INTO dbo.Bert(BertText)
FROM sys.objects a, sys.objects b, sys.objects c;

Then I turned on Query Store for all three databases:

ALTER DATABASE Fred SET query_store = ON;
ALTER DATABASE Bert SET query_store = ON;
ALTER DATABASE Ernie SET query_store = ON;

Once that was done I concocted a horrible query that was bound to be horrendously slow – so I knew it would be easy to find when I queried the Query Store runtime stats:


SELECT TOP 100000 *
FROM Fred.dbo.Fred f
INNER JOIN Bert.dbo.Bert b
   ON b.BertText LIKE  '%' + f.FredText + '%';

I turned STATISTICS IO on so I could see how much work was happening in each database.

I ran the query first in a query window pointing at the Fred database, then I ran my query store query from the previous post (Capture the most expensive queries across your SQL Server using Query Store) to see what had been captured. I made it slightly easier for myself by adding an additional where clause to the cursor so that it only looked at these databases:

--Cursor to step through the databases
SELECT [name]
FROM sys.databases 
WHERE is_query_store_on = 1
AND name IN ('Fred','Bert','Ernie');

I cleared down Query Store for all the databases:


Then I repeated these steps for Bert and Ernie.

The Statistics IO for the query (regardless of which database context I had set) was as follows:
Table ‘Bert’. Scan count 24, logical reads 5095742, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Fred’. Scan count 25, logical reads 50, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

So you can see most of the work occurs in the Bert database, a little in Fred, and none in Ernie.

Now let’s see what query store captured when I ran the query pointing at database Fred:

And pointing at database Bert:

And pointing at database Ernie:

You can see that the figures get recorded against whichever database you are pointing at – regardless of where the data being accessed resides. I left the “TotalLogicalReads %” in the above screen shots so you can see I’m not hiding anything.

This has a few implications. First, I’m happy because it means my “Expensive queries” script isn’t double counting.

Second though, as you can’t turn on query store on in any of the system databases, you won’t be able to capture details for any queries executed with those as the context. That includes ad-hoc queries where the user may connect to master, but execute queries against your other databases.

Fortunately (because Query Store would be pretty pointless if it did) this doesn’t apply to stored procedures.

I’m going to wrap my horrible query into a stored procedure, and deploy it into database Ernie:

USE Ernie;
   SELECT TOP 100000 *
   FROM Fred.dbo.Fred f
   INNER JOIN Bert.dbo.Bert b
      ON b.BertText LIKE  '%' + f.FredText + '%';

Now I clear my Query Stores for the three database one last time. Then I’ll called the stored procedure from database Fred:

USE Fred;
EXEC Ernie.dbo.Horrible;

Here’s what I get from query store now:

So, Query Store logs the execution against database Ernie – where the stored procedure resides, rather than Fred – where it was called from, or Bert – where most of the work was done.

I hope you’ll trust me enough on that that I don’t have to demonstrate all the other combinations!

SQL Puzzle 4: The Beale Papers

I’ve not done a SQL puzzle for a while so thought it was getting overdue…

I set this one for my workmates a while ago and people found it quite fun.

The Beale Papers are a set of three ciphertexts which allegedly reveal the details of a buried treasure. The tale is that the treasure was buried by Thomas J. Beale in Bedford County, Virginia, USA back in the 1820s. He entrusted a box containing the ciphered messages to the care of a local innkeeper and then disappeared never to be seen again – his intent was that the papers could be passed on to his descendants and they could decode the messages and claim their rightful inheritance. The innkeeper passed the ciphertexts on to a friend before he died, and said friend spent the rest of his life trying to decipher them. He succeeded on the second message but failed on the others.


Since the 19th century many attempts have been made to decipher the other messages and find the treasure. Even without cracking the code, treasure hunters have got out their shovels over the years to dig up parts of Bedford County without finding anything (as far as we know). Recently there are some claims to have solved the puzzle, but that the treasure was already gone.Of course there’s a good chance the whole thing is a hoax!

Your task is to repeat the deciphering of the second message, but to make it easier I’ll tell you the key.

Beale used an existing text as a key for the cipher in which he assigned a number to each word, 1 for the first word, two for the second and so on.

Then for each letter in the text he wanted to encode he randomly picked a word in the key that started with that letter, and then he enciphered the letter as the number for that word. Make sense?

For instance, if the key text was “now is the time” and the plain text is “tin”, then either (3 2 1) or (4 2 1) are valid encipherments.

The key text Beale used for encoding the second message was the United States declaration of independence. I’m going to leave you to source your own copy of this. Be aware that some of the real world issues with this task that you may come up include the fact that Beale may have made some inaccuracies in his ciphering, and the copy of the key text he used to code his text is unlikely to be exactly the same as the one you use. As such you may need to do some tweaking, spelling correction and perhaps logic to work out what some of the words are. Good luck if you choose to attempt this!

So, starting with the two variables below, write some SQL to decode the message (@CipherText ):


SET @KeyText = '<Insert your copy of the declaration of independence here>';

SET @CipherText = '115, 73, 24, 807, 37, 52, 49, 17, 31, 62, 647, 22, 7, 15, 140, 47, 29, 107, 79, 84, 56, 239, 10, 26, 811, 5, 196, 308, 85, 52, 160, 136, 59, 211, 36, 9, 46, 316, 554, 122, 106, 95, 53, 58, 2, 42, 7, 35, 122, 53, 31, 82, 77, 250, 196, 56, 96, 118, 71, 140, 287, 28, 353, 37, 1005, 65, 147, 807, 24, 3, 8, 12, 47, 43, 59, 807, 45, 316, 101, 41, 78, 154, 1005, 122, 138, 191, 16, 77, 49, 102, 57, 72, 34, 73, 85, 35, 371, 59, 196, 81, 92, 191, 106, 273, 60, 394, 620, 270, 220, 106, 388, 287, 63, 3, 6, 191, 122, 43, 234, 400, 106, 290, 314, 47, 48, 81, 96, 26, 115, 92, 158, 191, 110, 77, 85, 197, 46, 10, 113, 140, 353, 48, 120, 106, 2, 607, 61, 420, 811, 29, 125, 14, 20, 37, 105, 28, 248, 16, 159, 7, 35, 19, 301, 125, 110, 486, 287, 98, 117, 511, 62, 51, 220, 37, 113, 140, 807, 138, 540, 8, 44, 287, 388, 117, 18, 79, 344, 34, 20, 59, 511, 548, 107, 603, 220, 7, 66, 154, 41, 20, 50, 6, 575, 122, 154, 248, 110, 61, 52, 33, 30, 5, 38, 8, 14, 84, 57, 540, 217, 115, 71, 29, 84, 63, 43, 131, 29, 138, 47, 73, 239, 540, 52, 53, 79, 118, 51, 44, 63, 196, 12, 239, 112, 3, 49, 79, 353, 105, 56, 371, 557, 211, 505, 125, 360, 133, 143, 101, 15, 284, 540, 252, 14, 205, 140, 344, 26, 811, 138, 115, 48, 73, 34, 205, 316, 607, 63, 220, 7, 52, 150, 44, 52, 16, 40, 37, 158, 807, 37, 121, 12, 95, 10, 15, 35, 12, 131, 62, 115, 102, 807, 49, 53, 135, 138, 30, 31, 62, 67, 41, 85, 63, 10, 106, 807, 138, 8, 113, 20, 32, 33, 37, 353, 287, 140, 47, 85, 50, 37, 49, 47, 64, 6, 7, 71, 33, 4, 43, 47, 63, 1, 27, 600, 208, 230, 15, 191, 246, 85, 94, 511, 2, 270, 20, 39, 7, 33, 44, 22, 40, 7, 10, 3, 811, 106, 44, 486, 230, 353, 211, 200, 31, 10, 38, 140, 297, 61, 603, 320, 302, 666, 287, 2, 44, 33, 32, 511, 548, 10, 6, 250, 557, 246, 53, 37, 52, 83, 47, 320, 38, 33, 807, 7, 44, 30, 31, 250, 10, 15, 35, 106, 160, 113, 31, 102, 406, 230, 540, 320, 29, 66, 33, 101, 807, 138, 301, 316, 353, 320, 220, 37, 52, 28, 540, 320, 33, 8, 48, 107, 50, 811, 7, 2, 113, 73, 16, 125, 11, 110, 67, 102, 807, 33, 59, 81, 158, 38, 43, 581, 138, 19, 85, 400, 38, 43, 77, 14, 27, 8, 47, 138, 63, 140, 44, 35, 22, 177, 106, 250, 314, 217, 2, 10, 7, 1005, 4, 20, 25, 44, 48, 7, 26, 46, 110, 230, 807, 191, 34, 112, 147, 44, 110, 121, 125, 96, 41, 51, 50, 140, 56, 47, 152, 540, 63, 807, 28, 42, 250, 138, 582, 98, 643, 32, 107, 140, 112, 26, 85, 138, 540, 53, 20, 125, 371, 38, 36, 10, 52, 118, 136, 102, 420, 150, 112, 71, 14, 20, 7, 24, 18, 12, 807, 37, 67, 110, 62, 33, 21, 95, 220, 511, 102, 811, 30, 83, 84, 305, 620, 15, 2, 10, 8, 220, 106, 353, 105, 106, 60, 275, 72, 8, 50, 205, 185, 112, 125, 540, 65, 106, 807, 138, 96, 110, 16, 73, 33, 807, 150, 409, 400, 50, 154, 285, 96, 106, 316, 270, 205, 101, 811, 400, 8, 44, 37, 52, 40, 241, 34, 205, 38, 16, 46, 47, 85, 24, 44, 15, 64, 73, 138, 807, 85, 78, 110, 33, 420, 505, 53, 37, 38, 22, 31, 10, 110, 106, 101, 140, 15, 38, 3, 5, 44, 7, 98, 287, 135, 150, 96, 33, 84, 125, 807, 191, 96, 511, 118, 40, 370, 643, 466, 106, 41, 107, 603, 220, 275, 30, 150, 105, 49, 53, 287, 250, 208, 134, 7, 53, 12, 47, 85, 63, 138, 110, 21, 112, 140, 485, 486, 505, 14, 73, 84, 575, 1005, 150, 200, 16, 42, 5, 4, 25, 42, 8, 16, 811, 125, 160, 32, 205, 603, 807, 81, 96, 405, 41, 600, 136, 14, 20, 28, 26, 353, 302, 246, 8, 131, 160, 140, 84, 440, 42, 16, 811, 40, 67, 101, 102, 194, 138, 205, 51, 63, 241, 540, 122, 8, 10, 63, 140, 47, 48, 140, 288';


I’ll finish this post with an excerpt from the comments on a blog I found where someone claims to have decoded the other two texts as well, and know where the treasure is:

Have fun!

Previous puzzles on this blog:
SQL Puzzle 1: Magic Squares
SQL Puzzle 2: Eight QueensSQL Puzzle 3: Knights and Queens

Credit to Programming Praxis where I first saw a version of this puzzle – and where you can find many more!

Capture the most expensive queries across your SQL Server using Query Store

I’m a big fan of using queries based on the dynamic management view sys.dm_exec_query_stats to capture the most resource hungry queries across a SQL instance.

That approach has some drawbacks though. First, it is cleared out every time an instance restarts, and second it only keeps figures for currently cached plans, so when a query recompiles, data is lost.

It was some time ago I blogged about SQL Server Query Store but one of the great features of it is that performance stats are stored in the database, so they aren’t lost in either of the above scenarios.

So I wanted to write a query store equivalent that will produce a ranked list of your most expensive queries.

The only downside is that the DMVs for Query Store are per database whereas dm_exec_query_stats is a view across the whole instance. So I had to use a cursor and a temp table, populating the temp table for each database in turn.

Anyway, enough preamble – here’s the query:

--Gather and report on most resource hungry queries
DECLARE @Reportinginterval int;
DECLARE @Database sysname;
DECLARE @StartDateText varchar(30);
DECLARE @TotalExecutions decimal(20,3);
DECLARE @TotalDuration decimal(20,3);
DECLARE @TotalCPU decimal(20,3);
DECLARE @TotalLogicalReads decimal(20,3);
DECLARE @SQL varchar(MAX);

--Set Reporting interval in days
SET @Reportinginterval = 1;

SET @StartDateText = CAST(DATEADD(DAY, -@Reportinginterval, GETUTCDATE()) AS varchar(30));

--Cursor to step through the databases
SELECT [name]
FROM sys.databases 
WHERE is_query_store_on = 1;

--Temp table to store the results
   DatabaseName sysname,
   SchemaName sysname NULL,
   ObjectName sysname NULL,
   QueryText varchar(1000),
   TotalExecutions bigint,
   TotalDuration decimal(20,3),
   TotalCPU decimal(20,3),
   TotalLogicalReads bigint

OPEN curDatabases;
FETCH NEXT FROM curDatabases INTO @Database;

--Loop through the datbases and gather the stats
    SET @SQL = '
	   USE [' + @Database + ']
	   INSERT intO #Stats
		  DB_NAME(), AS SchemaName, AS ObjectName,
		  SUBSTRING(t.query_sql_text,1,1000) AS QueryText,
		  SUM(rs.count_executions) AS TotalExecutions,
		  SUM(rs.avg_duration * rs.count_executions) AS TotalDuration,
		  SUM(rs.avg_cpu_time * rs.count_executions) AS TotalCPU,
		  SUM(rs.avg_logical_io_reads * rs.count_executions) AS TotalLogicalReads
	   FROM sys.query_store_query q
	   INNER JOIN sys.query_store_query_text t
		  ON q.query_text_id = t.query_text_id
	   INNER JOIN sys.query_store_plan p
		  ON q.query_id = p.query_id
	   INNER JOIN sys.query_store_runtime_stats rs
		  ON p.plan_id = rs.plan_id
	   INNER JOIN sys.query_store_runtime_stats_interval rsi
		  ON rs.runtime_stats_interval_id = rsi.runtime_stats_interval_id
	   LEFT JOIN sys.objects o
	   LEFT JOIN sys.schemas s
		  ON o.schema_id = s.schema_id     
	   WHERE rsi.start_time > ''' + @StartDateText + '''
	   GROUP BY,, SUBSTRING(t.query_sql_text,1,1000)

    EXEC (@SQL);

    FETCH NEXT FROM curDatabases INTO @Database;

CLOSE curDatabases;
DEALLOCATE curDatabases;

--Aggregate some totals
    @TotalExecutions = SUM(TotalExecutions),
    @TotalDuration = SUM (TotalDuration),
    @TotalCPU  = SUM(TotalCPU),
    @TotalLogicalReads = SUM(TotalLogicalReads)
FROM #Stats

--Produce output
    CAST((TotalExecutions/@TotalExecutions)*100 AS decimal(5,2)) AS [TotalExecutions %],
    CAST(TotalDuration/1000000 AS decimal(19,2)) AS [TotalDuration(s)],
    CAST((TotalDuration/@TotalDuration)*100 AS decimal(5,2)) AS [TotalDuration %],
    CAST((TotalDuration/TotalExecutions)/1000 AS decimal(19,2)) AS [AverageDuration(ms)],
    CAST(TotalCPU/1000000  AS decimal(19,2)) [TotalCPU(s)],
    CAST((TotalCPU/@TotalCPU)*100 AS decimal(5,2)) AS [TotalCPU %],
    CAST((TotalCPU/TotalExecutions)/1000 AS decimal(19,2)) AS [AverageCPU(ms)],   
    CAST((TotalLogicalReads/@TotalLogicalReads)*100 AS decimal(5,2)) AS [TotalLogicalReads %],
  CAST((TotalLogicalReads/TotalExecutions) AS decimal(19,2)) AS [AverageLogicalReads]   
FROM #Stats
--Order by the resource you're most interested in

--ORDER BY TotalExecutions DESC
--ORDER BY TotalDuration DESC
--ORDER BY TotalLogicalReads DESC


The script limits itself to looking at databases where query store is enabled.

If you want to bring back more results you can just change the TOP statement, and if you want to look at the results ordered by a different resource (e.g. Reads) then just make sure the relevant ORDER BY clause is uncommented.

I consider this something that will evolve over time, so if you have any suggestions of things you think I should change or add then let me know.

Implementing Temporal Tables Where You Have Existing Data

In my post about auditing tools in SQL Server I mentioned a few tools we have at our disposal for capturing the change history of data.

One was Change Data Capture which I want into a bit more detail about in Setting up Change Data Capture (CDC)

I also referred to Temporal Tables which are available to us in the brave new world of SQL 2016.

Temporal tables aren’t just about monitoring change, they also provide really nice methods for being able to query historical data – to see what the values were at a particular point in time e.g.

SELECT * FROM dbo.SomeData FOR SYSTEM_TIME AS OF '1900-01-01';

This makes them easy to work with, but also opens up other applications for them over and above auditing purposes – for instance being able to maintain configuration data that changes over time, but where we need to reference what the value was historically.

I wanted to look in more detail about the process for changing an existing table so that it becomes a temporal one.

Let’s start by quickly looking at how we would define a temporal table for a fresh deployment:

CREATE TABLE dbo.SomeData(
   SomeText VARCHAR(1000) NULL,
   ModifiedBy VARCHAR(128) NULL,
   PERIOD FOR SYSTEM_TIME (StartDate, EndDate)

There are a few things here additional to a normal table definition. First thing to note is we need two columns to support the Temporal versioning. I’ve called my columns StartDate and EndDate, these both have to be of DATETIME2 datatype to the highest precision (100 nanoseconds). These are specified as GENERATED ALWAYS AS ROW START\END so SQL will take care of assigning values when new rows are inserted or updated.

I’ve specified the PERIOD FOR SYSTEM TIME, so SQL knows that these are the fields to use for the temporal versioning.

Finally I’ve specified SYSTEM_VERSIONING for the table and what the history table should be called.

When I execute this you can find the table and corresponding history table displayed in SSMS as below:
We have the table itself, and also the history table has been created for us. This is where the previous versions of rows of data will be stored.

So that’s pretty straightforward – but what if you want to change an existing table?

First of all let’s drop this one:

DROP TABLE dbo.SomeData;

Unfortunately that gives me an error message:
Msg 13552, Level 16, State 1, Line 90
Drop table operation failed on table ‘TestAudit.dbo.SomeData’ because it is not supported operation on system-versioned temporal tables.

As the message says, you can’t just drop Temporal tables. First you must turn system versioning off, then you can go ahead:

DROP TABLE dbo.SomeData;

Dropping the table though leaves the History table behind – now seen as a normal table: Tempopral2

That’s handy as it means you don’t lose the history when the main table is deleted. For now though let’s just delete that as well:

DROP TABLE dbo.SomeData_History;

Now let’s create a new table that will be our “existing table” we want to change. For this example I’ll create a test table with a trigger that maintains a modified date, and throw in 1000 rows:

CREATE TABLE dbo.SomeData(
   Id INT IDENTITY(1,1) 
   SomeText VARCHAR(1000) NULL,
   ModifiedBy VARCHAR(128) NULL,
   ModifiedDate DATETIME NOT NULL 

CREATE TRIGGER dbo.TR_SomeData_ModifiedDate
   ON dbo.SomeData 
   UPDATE dbo.SomeData 
   SET ModifiedDate = GETUTCDATE()
   FROM dbo.SomeData 
   INNER JOIN Inserted
      ON dbo.SomeData.Id = Inserted.Id;

INSERT INTO dbo.SomeData(SomeText, ModifiedBy)
SELECT TOP 1000 'Blah', 'Me'
FROM sys.objects a CROSS JOIN sys.objects b;

I’m going to complicate things and say that when I turn this into a temporal table I want to keep the existing ModifiedDate column and use that as the start date for my system versioning.

The first thing you might try is to alter the column:

ALTER TABLE dbo.SomeData

Msg 13589, Level 16, State 1, Line 44
Column ‘ModifiedDate’ in table ‘TestAudit.dbo.SomeData’ cannot be specified as ‘GENERATED ALWAYS’ in ALTER COLUMN statement.

So I can’t do this in an alter statement. What I’ll actually have to do is create a new column specified to work with the system versioning, copy the data over, drop the old column and rename the new one.

First let’s create the additional columns and define the PERIOD FOR SYSTEM_TIME:

ALTER TABLE dbo.SomeData
StartDate datetime2 GENERATED ALWAYS AS ROW start NOT NULL
    CONSTRAINT DF_SomeData_EndDate DEFAULT ('9999-12-31 23:59:59.9999999'),

Note that I’ve had to specify defaults for the new columns so that the existing data gets values populated (I didn’t have to do that when I simply defined the Temporal Table from scratch).

For the StartDate I’ve specified the current UTC date – this could be any date and time that’s not in the future, though note it should be a UTC one. If I’d tried to use GETDATE, as I’m currently on British Summer Time, I would get the following error:
Msg 13542, Level 16, State 0, Line 51
ADD PERIOD FOR SYSTEM_TIME on table ‘TestAudit.dbo.SomeData’ failed because there are open records with start of period set to a value in the future.

For the end date I have put the maximum value a DATETIME2 column can hold, this is required and is a way of saying that this record is (at the moment) valid indefinitely. If I try to put in an earlier date I receive this error:
Msg 13575, Level 16, State 0, Line 51
ADD PERIOD FOR SYSTEM_TIME failed because table ‘TestAudit.dbo.SomeData’ contains records where end of period is not equal to MAX datetime.

Now I’ve got the columns I could try copying my data over:

UPDATE dbo.SomeData SET StartDate = ModifiedDate;

I get another error:
Msg 13537, Level 16, State 1, Line 70
Cannot update GENERATED ALWAYS columns in table ‘TestAudit.dbo.SomeData’.

It turns out you can get around this by temporarily dropping the period for system time:

UPDATE dbo.SomeData SET StartDate = ModifiedDate;

Your logical thought might be here that I didn’t need to create the period for system time when I altered the table to add the new columns. I could have just added the columns, migrated the data, then only added the period afterwards. Trust me, I tried that too:
Msg 13509, Level 16, State 1, Line 51
Cannot create generated always column when SYSTEM_TIME period is not defined.

From here it turns out to be reasonable clear sailing.

Let’s run the final command to make the table Temporal:

ALTER TABLE dbo.SomeData 

Here’s what it looks like in SSMS:
You can see this is the same as the original fresh creation of a temporal table. The history table has been created for us. Just with an additional column in both tables.

Let’s remove the extra column and do the final rename:

ALTER TABLE dbo.SomeData DROP CONSTRAINT DF_SomeData_ModifiedDate;
ALTER TABLE dbo.SomeData DROP COLUMN ModifiedDate;
EXEC sp_rename 'dbo.SomeData.StartDate','ModifiedDate','COLUMN';

I was amazed the sp_rename didn’t give me an error. In fact if we look at the table again:
You can see the column has been renamed in both the main table and the history table. Pretty neat!

I have at this stage forgotten to do one thing. Let’s just try to update a record and see what happens:

UPDATE dbo.SomeData SET SomeText = 'Wibble' WHERE Id = 2;

Msg 13537, Level 16, State 1, Procedure TR_SomeData_ModifiedDate, Line 7 [Batch Start Line 185]
Cannot update GENERATED ALWAYS columns in table ‘TestAudit.dbo.SomeData’.
The statement has been terminated.

Yep, don’t forget to remove your trigger!

DROP TRIGGER dbo.TR_SomeData_ModifiedDate;

And now we’re all done.


What if I use SQL Server Data Tools (SSDT) to make my change?

If you’re working with Database Projects in Visual Studio then you will be used to publishing changes straight from there – or deploying them using a DACPAC file.

In that case SSDT handles all this for you – though it does use a slightly different method. Rather than creating a new column, copying that data over, dropping the old column and renaming the new one, it does it at the table level.
That’s the same way some other changes are made with SSDT (or if you do them through the GUI in SSMS). The steps are broadly the same, just with tables instead of columns:

1) Create new temporal table
3) Copy across the data from the old table to the new
5) Drop the old table
6) Rename the new table to the original name

If you’re doing this by “hand”, you could do it this way if you prefer, but it is probably a heavier operation that will require more down time. For instance you’ll need the whole table to be locked while the operation progresses as you don’t want to risk data changes once you’ve started copying the data over. The column method will also lock the table, but potentially there is less to update\copy so it should hopefully be quicker.

As a final point I should note that you may run into problems dropping the trigger with SSDT, it can be a little quirky around doing that. For the moment I’ll leave off from going into that in more detail – but if you get that issue feel free to give me a yell!


I love the STATISTICS IO and STATISTICS TIME commands. They are such a powerful and easy way to be able to measure improvements when performance tuning (Measuring SQL Query Performance).

You have to be a little bit wary though, there are a few places where they don’t report figures correctly such as this bug (fixed from 2014):

Also anything to do with a MSTVF (multi-statement table value function) is likely to be reported incorrectly (another reason not to use such functions!).

Today I was doing some testing to compare different methods of auditing data, and I came up against another (bug?) I hadn’t seen before. So I thought I’d share.

This is running on SQL 2016 – I haven’t checked it on earlier versions.

I’ll create two identical tables for comparison, identical except one has a trigger against it that inserts changes to an audit table.

USE StatisticsBug;
SomeText VARCHAR(1000),
ModifiedBy VARCHAR(128),
ModifiedDate DATETIME
CREATE TABLE TableWithTrigger
SomeText VARCHAR(1000),
ModifiedBy VARCHAR(128),
ModifiedDate DATETIME
CREATE TABLE TableWithTrigger_Audit
SomeText VARCHAR(1000),
ModifiedBy VARCHAR(128),
ModifiedDate DATETIME,
CREATE TRIGGER TableWithTrigger_trigger
ON dbo.TableWithTrigger
INSERT INTO dbo.TableWithTrigger_Audit(Id, SomeText, ModifiedBy, ModifiedDate, AuditDate)
SELECT Id, SomeText, ModifiedBy, ModifiedDate, GETDATE()
FROM deleted;
INSERT INTO dbo.TableWithTrigger_Audit(Id, SomeText, ModifiedBy, ModifiedDate, AuditDate)
SELECT Id, SomeText, ModifiedBy, ModifiedDate, GETDATE()
FROM inserted;

Now I’m going to insert a bunch of rows into each table and capture the CPU output so we can compare. Obviously I expect the one with the trigger to have to do more work, I’m just demonstrating both to show that the first example works correctly, but the second one seems a bit more dubious.

First the table without the trigger:

--Insert a million rows into table without trigger, capture CPU

INSERT INTO dbo.VanillaTable(SomeText, ModifiedBy, ModifiedDate)
SELECT TOP 1000000  'SomeTextSomeTextSomeTextSomeTextSomeTextSomeTextSomeTextSomeTextSomeTextSomeText', 'Matt', GETDATE()
FROM sys.objects a, sys.objects b, sys.objects c, sys.objects d;

Let’s look at the output:
SQL Server parse and compile time:
CPU time = 31 ms, elapsed time = 64 ms.
SQL Server Execution Times:
CPU time = 4766 ms, elapsed time = 5596 ms.

That all seems quite reasonable and what we’d expect to see.

Now let’s repeat the same test for our table with a trigger:

--Insert a million rows into table with trigger, capture CPU

INSERT INTO dbo.TableWithTrigger(SomeText, ModifiedBy, ModifiedDate)
SELECT TOP 1000000  'SomeTextSomeTextSomeTextSomeTextSomeTextSomeTextSomeTextSomeTextSomeTextSomeText', 'Matt', GETDATE()
FROM sys.objects a, sys.objects b, sys.objects c, sys.objects d

And look again at the CPU:
SQL Server parse and compile time:
CPU time = 31 ms, elapsed time = 31 ms.
SQL Server parse and compile time:
CPU time = 6031 ms, elapsed time = 8782 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 7266 ms, elapsed time = 9205 ms.

SQL Server Execution Times:
CPU time = 13297 ms, elapsed time = 17992 ms.

Overall the execution takes over twice as long, which I expect as – due to my trigger – I’m now inserting to two tables. This is represented by the last entry which is the total execution time.

I want to draw your attention though to the second set of parse and compile time figures (highlighted in bold). These seem to be saying that to compile the query took over 6 seconds of CPU. I find this highly suspect.

This is also highlighted by the fact that the final execution time is actually the sum of 6031 (which is reporting as parse and compile) and 7266.
So my belief was that the output was incorrectly reporting execution time as compile time.

I captured the actual execution plan for the query to check the compile time. In the plan XML you can find the element CompileCPU (measured in ms).

As there are three statements covered by the plan (the initial insert plus two more in the trigger) I find three values:

Statement 1:


Statement 2:


Statement 3:


28ms, 1ms and 1ms for a total of 30ms. That’s actually pretty close to the initial figure reported in the statistics output, any difference is probably just rounding:
SQL Server parse and compile time:
CPU time = 31 ms, elapsed time = 31 ms.

That seems double confirmation to me that the output should be reporting the 6031ms and 8782ms not as CPU and Elapsed for Parse and Compile, but rather CPU and Elapsed for Execution.

I’d be interested if anyone has any other explanation of this, or to know if you can repeat it. With a bit more confirmation I’ll raise a Connect item.

Thanks for reading!