When you first look at the encryption hierarchy for TDE in SQL Server it can be a bit daunting. There seem to be a lot of objects involved and it might not be clear why each is required. It can be tempting to skip a full understanding of all the objects and just get on with setting things up – which is relatively straightforward.
I’d encourage you not to do that and I’ll explain why. There are a lot of scenarios that might crop up in the lifecycle of a TDE protected database instance. Recovering a protected database from backup. Migrating database from one server to another. Managing high availability. The list goes on.
There are a lot of resources out there that will advise you on how to do these things – but many of them are inaccurate. Only by understanding how TDE works – and in particular how it uses the various encryption objects, can you be sure what approach is correct. And better than that – if you have a good understanding of the architecture of TDE you won’t need these resources, the correct approach will follow logically from your understanding of how things work.
TDE uses multiple keys and certificates in the process of protecting your database. This is something we’ll see in common with other forms of encryption. The reason for that is both about providing the best level of protection as well as supporting ease of management for your protected databases.
Keys in SQL Server often have at least three components:
- The Encryption Key itself – Usually can be thought of a number expressed in binary format. Long and random enough to make it difficult to guess even by brute force attempts. This is the key that is actually used to encrypt and decrypt your data.
- Another object that’s used to protect the key – This object might be another key, it might be a certificate, or it might just be a password. This object is used to encrypt the encryption key.
- The encrypted value of the encryption key – Formed from the original value of the key, encrypted by the protecting object.
In SQL we rarely (maybe even never) see the actual value of the key. We have the encrypted version and we usually know what object was used to encrypt it. That second object may even be another key that is itself encrypted by a third object.
When it comes down to it though, the actual thing that is used to encrypt or decrypt data is the key itself, not the encrypted value, and not the hierarchy of objects that may have been used to protect it.
So, all I really need to read your data, is your key.
Let’s look at that in the context of TDE. Here we see the hierarchy of encryption objects supporting TDE. There are other ways of working with TDE but the approach shown here is the standard.
Let’s look at the objects one at a time.
Database Encryption Key (DEK)
Right at the bottom of the hierarchy is the DEK. This is stored in the database itself and is what is actually used to encrypt and decrypt data in the database. The DEK is a symmetric key, which means the same key is used to both encrypt and decrypt data. Symmetric key encryption is much quicker than asymmetric key encryption which is why such a key is used in this case. We want the “transparent” encryption activities to have as little overhead as possible while keeping things secure.
The DEK is actually stored encrypted in the database. Encryption wouldn’t be very effective if the key was stored in plain sight for anyone to access and use. As it is stored encrypted, even if someone has your files, there is no way for them to access the unencrypted version of the DEK and use it to read your data.
The encrypted DEK is also stored in any database backups
Certificate and Associated Asymmetric Key Pair
At the next level up we have the Certificate, which contains a Public Key which can be used to encrypt data and has a reference to a Private Key which must be used to decrypt data. The two keys together are known as a Public\Private Key pair and are used to encrypt the DEK using asymmetric encryption.
The Public key can be held in plain sight as it can only be used to encrypt data. The Private Key is required to decrypt data. The Private Key is in turn encrypted to make sure it is protected.
There are other options for protecting your DEK which include using an asymmetric key that is stored externally to your SQL Server. This is managed through something called Extensible Key Management (EKM) and we’ll look at that in later posts.
Database Master Key (DMK)
The DMK is stored in the Master database and is what’s used to encrypt the private key. However, the DMK isn’t just used for your database, and indeed isn’t just used for TDE. It can support a number of activities. There can be only one DMK for your SQL instance. The DMK is in turn encrypted.
Service Master Key (SMK)
The SMK is created when you first install your SQL Server instance and is unique to the instance. The SMK is used to protect your DMK. It is in turn protected by the operating system Data Protection API (DPAPI).
Understanding the Need for the Hierarchy
That’s a lot of layers of encryption to deal with. It can be difficult to understand why we need so many.
Let’s look at one scenario to aid our understanding. We’ll look at how it works in practice and what the problems might be if we didn’t have all these different levels of encryption.
Let’s say you want to restore a TDE protected database to another server – a common requirement. The encrypted DEK is already stored in your backup file – so we have that. You would be forgiven however in thinking that we need to migrate copies of all the layers above it in order to be able to read our protected database.
We run into problems however as soon as we think about migrating the DMK. Remember there can only be one DMK per instance, what if the instance we are restoring to already has a DMK, and that DMK is already used to protect other objects. We can’t just replace it. That is why it is critical there must be an object between the DMK and the DEK, the DMK cannot be used to encrypt the DEK directly.
However, we still need the DMK to read the private key associated with the certificate. Don’t we?
In practice you are able – and pretty much required – to back up the certificate to disk and also the private key associated with it. If you don’t do this and it is lost (for instance if the server crashes and is unrecoverable) then your data is pretty much lost.
The command for backing up the certificate looks something like this:
BACKUP CERTIFICATE MyTDECert
TO FILE = 'C:\Test\MyTDECert.cer'
WITH PRIVATE KEY(
FILE = 'C:\Test\MyTDECert_PrivateKeyFile.pvk',
ENCRYPTION BY PASSWORD = 'C0rrecth0rserbatterystab1e'
You can see that when we backup the certificate we specify a password to encrypt the private key. Remember that the private key was already encrypted by the DMK? This BACKUP CERTIFICATE command doesn’t just encrypt it a second time – what would be the point of that? No, the reason we need to supply a password is that the command retrieves the unencrypted version of the private key, re-encrypts it with the password INSTEAD and then that is what gets saved to disk.
Remember that at the point I run the command SQL has access to all the objects, all the way up the chain, that are used for the encryption. So, it has no problem getting the actual value for the private key.
Now, when I restore that certificate to – let’s say another instance of SQL Server – the command looks like this:
CREATE CERTIFICATE MyTDECert
FROM FILE = 'C:\Test\MyTDECert.cer'
WITH PRIVATE KEY(
FILE = 'C:\Test\MyTDECert_PrivateKeyFile.pvk',
DECRYPTION BY PASSWORD = 'C0rrecth0rserbatterystab1e'
Considering what the BACKUP command did, you can figure out that the above command will do the opposite. It first of all decrypts the private key using the password supplied. Then it encrypts it again using the DMK for the local instance before saving it in the master database locally.
Remember it is the actual values of keys that get used for encryption – not their encrypted value. So, the private key for this certificate is exactly the same as it was where we backed it up from – though the encrypted value will be different as it is protected by a different DMK.
If I now go to restore a copy of a database whose DEK has been encrypted with this certificate onto the new instance I can do so without any problem.
Many articles will tell you that you also need to migrate the DMK from your old instance, and some will say that you also need the SMK. This is simply not correct – I mean, it will work if you do that, in the right order, but it isn’t necessary.
So all you need to have to restore a database protected by TDE, is the database backup, the certificate/private key backups, and the password specified when the certificate and key were backed up.
Equally that’s all someone else needs too, so make sure those things are protected.
There are other reasons it would not be desirable to have the DEK directly protected by the DMK. We can have a separate certificate for each database protected by TDE if we wish – and I would suggest that is good practice. Someone could steal the backup of that, not a big deal unless they have the password used to protect it also. If they do, then they can access our data. That’s not good, but at least we have minimized the attack surface and they have only accessed one database. If the DMK was used to protect the DEK directly then we would need to back that up in a similar manner – and if that was stolen in the same way then protection for all objects and databases protected by the DMK would be breached.
We’ll see other scenarios where the multiple levels in the encryption hierarchy for TDE help us in the later posts when we talk about managing TDE and its lifecycle with your databases.
This post is part of a comprehensive series on encryption in SQL Server. If you want it all in one go you can buy my book: