Encrypting an existing database with TDE

As mentioned previously, the process of setting up TDE (Setting up Transparent Data Encryption (TDE)) is the same whether you’ve just set up a new database, or whether you’re working with a live database. Once you turn encryption on SQL Server will begin the process of encrypting any data in your database. Be that 1 row, or be that terabytes of data.

I’m going to load up my database with about 1 GB of data so we can get an idea of how long this process takes.

CREATE DATABASE TestTDE;
USE TestTDE;
CREATE TABLE dbo.SomeData(Id INT IDENTITY(1,1), SomeText VARCHAR(255));

INSERT INTO dbo.SomeData (SomeText) 
SELECT TOP 1000000 
('XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')
FROM sys.objects a
CROSS JOIN sys.objects b
CROSS JOIN sys.objects c
CROSS JOIN sys.objects d;
GO 10

I then go through all the steps mentioned previously to set up encryption:

Setting up Transparent Data Encryption (TDE)

After the last step:

ALTER DATABASE TestTDE SET ENCRYPTION ON;

I run a query to report the progress:

DECLARE @state tinyint;
DECLARE @encyrption_progress 
    TABLE(sample_time DATETIME, percent_complete DECIMAL(5,2))

SELECT @state = k.encryption_state
FROM sys.dm_database_encryption_keys k
INNER JOIN sys.databases d
   ON k.database_id = d.database_id
WHERE d.name = 'TestTDE';

WHILE @state != 3
BEGIN
   INSERT INTO @encyrption_progress(sample_time, percent_complete)
   SELECT GETDATE(), percent_complete
   FROM sys.dm_database_encryption_keys k
   INNER JOIN sys.databases d
      ON k.database_id = d.database_id
   WHERE d.name = 'TestTDE';


   WAITFOR delay '00:00:01';

   SELECT @state = k.encryption_state
   FROM sys.dm_database_encryption_keys k
   INNER JOIN sys.databases d
      ON k.database_id = d.database_id
   WHERE d.name = 'TestTDE'; 
END

SELECT * FROM @encyrption_progress;

I’m sampling every second, let’s look at the results:

TDE_Progress

You can see that the encryption took about 30 seconds to complete for (just under) 1GB of data. This isn’t intended to be a formal benchmark, but rather just to give you an idea of the order of magnitude of time this might take to encrypt your own databases.

Let’s say – just for the same of argument – that this scaled up linearly on a given system and you wanted to encrypt a 1TB database. Then it might take as long as 500 minutes (8 hours and 20 minutes).

The encryption occurs as a background process, but it will take some resources while it runs, so if you are implementing this against a system with large databases where performance is critical then you will want to either run it in a period of quiet – or down time, or you will want to monitor to check that encryption isn’t impacting your system too much. Experience suggests that it shouldn’t be a problem unless your server is already under strain.

There are a few things to look out for if you are monitoring during the background encryption process:

  • CPU and IO, both these could take a hit
  • You may want to look out for blocking caused by encryption, you can do this by checking the sys.dm_tran_locks view where the resource_subtype is “ENCRYPTION_SCAN”
  • Monitor transaction log usage with DBCC LOGINFO. While the encryption scanner is running the transaction log can’t be truncated and VLFs marked for re-use. This could mean the transaction log might grow larger than normal, so you need to watch out if you are constrained for disk space.

What if you run into any performance problems during the scan?

First things first – DON’T TURN ENCRYPTION OFF!

If you turn encryption off, i.e.

ALTER DATABASE TestTDE SET ENCRYPTION OFF;

This isn’t going to stop the encryption scan, rather it’s just going to change direction. So now it will be decrypting everything it’s just encrypted, that’s likely to have just as much impact, and then sooner or you’re going to have to start again.

There is no “ALTER DATABASE TestTDE SET ENCRYPTION PAUSE;” command. There is however a trace flag (5004) that achieves the same thing.

If you enable the trace flag it will disable the encryption scanner:

DBCC TRACEON(5004);

When you do so, if you then query dm_database_encryption_keys you will see the database is set to a state of 2 (Encryption in progress) and the percent_complete column will show zero – as nothing is actually currently in progress.

When you wish to begin the scan again, you need to disable the traceflag and then set encryption on again (even though it’s not actually off):

DBCC TRACEOFF(5004,-1);
ALTER DATABASE TestTDE SET ENCRYPTION ON;

When you do this the scanner will pick up where it left off, i.e. if it had got to 50%, it will then continue from there.

It’s important to note that the traceflag doesn’t actually turn TDE off. In fact if you add new data, or update data and it get’s written to disk while the traceflag is enabled, the data will still become encrypted. It just stops the background process of converting all the data already on disk.

If performance impact is an issue for you with the scanning process, you can use the traceflag to incrementally run the scan out of hours, pausing it when you want to ensure maximum performance of your systems.

The traceflag seems to exist for internal use by SQL if the encryption scanner runs into a problem, for instance if it encounters corruption in your database files then it will halt and the traceflag will be enabled.

This could be useful to know if you find you tried to turn TDE on for a database but it’s stuck at “Encryption in Progress” but the percent_complete remains set to Zero. Check to see if the trace flag is enabled, and if you didn’t enable it then you may want to check for corruption in your database.

More Articles on TDE

What is Transparent Data Encryption?

Setting up Transparent Data Encryption (TDE)

Understanding Keys and Certificates with Transparent Data Encryption (TDE)

How Secure is Transparent Data Encryption (TDE) – and How to Prevent Hacking

Setting up Transparent Data Encryption (TDE)

You can set up Transparent Data Encryption (TDE) when you first create a database, or you can apply it to an existing database. In the latter case, once TDE has been enabled it will set to work encrypting your existing data in the background.

In either case the steps are the same. We’ll run through those quickly before going into more detail.

First you must have a Database Master Key (DMK) in the Master database, and a certificate that will be used by TDE:

USE MASTER;

CREATE MASTER KEY
ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHere!£$7';
CREATE CERTIFICATE MyTDECert 
WITH SUBJECT = 'Certificate used for TDE in the TestTDE database';

This certificate is critical to you being able to access data encrypted by TDE, so you should make sure you back it up:

BACKUP CERTIFICATE MyTDECert   
TO FILE = 'C:\MyTDECert'  
WITH PRIVATE KEY   
(  
    FILE = 'C:\MyTDECert_PrivateKeyFile',  
    ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHereToo!£$7'  
);

Then, in the database being encrypted with TDE you must create a Database Encryption Key (DEK) and specify the certificate:

USE TestTDE;

CREATE DATABASE ENCRYPTION KEY WITH ALGORITHM = AES_256
ENCRYPTION BY SERVER CERTIFICATE MyTDECert;

Finally, you turn encryption on:

ALTER DATABASE TestTDE SET ENCRYPTION ON;

And that’s all there is to it in practice.

A potential problem is that it is easy to set this up without really understanding it. Maybe that’s fine in many cases, but can you be sure that there’s nothing that can go wrong and have confidence that whatever the scenario, you will be able to get your data back? And can you be sure that your data is properly protected.

To gain that level of surety and to have confidence, I think it’s best to understand this in a bit more detail. In particular, I think it’s good to understand why each step is required and how the objects created are used.

So let’s go through those steps again in more detail.

Creating the Database Master Key (DMK)

CREATE MASTER KEY
ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHere!£$7';

A DMK is used to protect other keys that are created in the database. It does this by encrypting them and only the encrypted value is stored. You can only have one DMK in a database.

The DMK itself is also stored encrypted, you can see when we created it we specified a password to encrypt it by. SQL Server also makes separate copy of the key encrypted by the Service Master Key (SMK). The SMK is the root level Key in SQL Server. This additional copy of the DMK means that SQL can access the actual value of your DMK without you having to specify the password again.

Creating the Certificate

CREATE CERTIFICATE MyTDECert 
WITH SUBJECT = 'Certificate used for TDE in the TestTDE database';

The certificate is going to be used in the next step down – to protect the Database Encryption Key (DEK) in your TDE enabled database. When you create a certificate, it contains an asymmetric key that can be used for encryption. An asymmetric key includes a public key that can be used to encrypt data and a private key that must be used if you want to decrypt data – that private key gets automatically protected (encrypted) by the DMK.

A logical question to ask is why we need the certificate? Why couldn’t we just protect the DEK in our TDE enabled database with the DMK from the master database directly?

Imagine the scenario that you need to migrate your database to another SQL Server instance. We can do this but we will need also to migrate the object that was used to protect/encrypt the DEK – which is itself stored in the database.

If TDE used the DMK to protect that then we would need to migrate the DMK to the new instance. But what if the new instance already had a DMK in the master database and objects that it was used to protect – such as other databases using TDE. At this point we would be stuck, we can’t migrate our DMK without overwriting the one that’s there, so we would have a choice, enable encryption for the migrated database but break it fore the existing ones, or vice versa.

Neither is a good option, but by having a certificate we can migrate that happily as we can have as many certificates as we want.

This also gives us the option that where we have multiple databases encrypted by TDE we could use a separate certificate for each. That means if one certificate is breached the others could remain protected.

This does raise a good point though that you may want one day to migrate your certificate, so call it something more meaningful and unique than “MyTDECert”.

Backing up the certificate

BACKUP CERTIFICATE MyTDECert   
TO FILE = 'C:\MyTDECert'  
WITH PRIVATE KEY   
(  
    FILE = 'C:\MyTDECert_PrivateKeyFile',  
    ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHereToo!£$7'  
);

We backup the certificate in case we ever need to move or restore our database to a different server or SQL instance. It’s critical if we want to be able to access our encrypted data.

When you back it up, you specify a password to encrypt the key. What happens is SQL grabs the stored version of the Private Key (which is encrypted by the DMK) decrypts it, then re-encrypts it with the password. This means that you would be able to restore it to a different SQL instance where the DMK didn’t exist.

This covers us against the scenarios explained above regarding why we use a certificate rather than just relying on the DMK. It should also make it clear that if we need to migrate or recover the database all we need is:

  • The database backup
  • The certificate and key backups and the password used when creating them

Creating the Database Encryption Key (DEK)

CREATE DATABASE ENCRYPTION KEY WITH ALGORITHM = AES_256
ENCRYPTION BY SERVER CERTIFICATE MyTDECert;

In this, almost the final step, we create the actual key that will be used to encrypt the data. It’s stored in the database, but encrypted by the key associated with the certificate created in the previous step. The DEK is a symmetric key, i.e. the same value is used to encrypt or decrypt data.

It’s logical to ask why we don’t just use the private key from the certificate to encrypt the data directly. This is a bit more difficult to justify than the previous scenario with the DMK versus use of a certificate. It is certainly feasible that TDE could have been implemented this way.

One consideration is that the certificate is created with an asymmetric key – these are easier to work with in some ways as we only need the public key to encrypt data so can keep the private key concealed most of the time. Asymmetric encryption however is slower that symmetric encryption so to reduce the performance impact of TDE we want to use a symmetric key.

The concept of a DEK was new in SQL 2008 and created specifically for TDE. It makes sense that if we are to have a separate DEK then it should be stored in the database itself. That way migration/recovery is eased as it minimises the number of objects required.

It’s worth noting that you can only have one DEK in each database.

Enabling Encryption

ALTER DATABASE TestTDE SET ENCRYPTION ON;

In sys.databases you can see which databases have TDE turned on by looking at the is_encrypted column:

SELECT name
FROM sys.databases
WHERE is_encrypted = 1;

TDE sys_databases

We can find more details in the sys.dm_database_encryption_keys server view. Let’s query looking at some particular columns of interest:

SELECT
   d.name,
   k.encryption_state,
   k.encryptor_type,
   k.key_algorithm,
   k.key_length,
   k.percent_complete
FROM sys.dm_database_encryption_keys k
INNER JOIN sys.databases d
   ON k.database_id = d.database_id;

Here’s what I see after I created my DEK but before I enable encryption:

TDE_State1

We can see information about the DEK. We also see encryption state which describes the current state of the database. The main values you’ll see are:
1 = Unencrypted
2 = Encryption in Progress
3 = Encrypted

If I now enable encryption on this database and run the query again:

TDE_State2

We see that both my database and the TempDB database are now encrypted.

We also see the percent_complete column, which confusingly says zero. This column only has meaning when a state change is occurring. So, if the encryption state was 2 – then we would see a value here whilst the database was in the process of being encrypted. Here my database only had one row, so it was fairly instantaneous to flip encryption on.

This column becomes relevant when we are encrypting an existing database that has a reasonable amount of data, we’ll look at that next:

Encrypting an existing database with TDE

 

More articles about TDE

What is Transparent Data Encryption?

Understanding Keys and Certificates with Transparent Data Encryption (TDE)

How Secure is Transparent Data Encryption (TDE) – and How to Prevent Hacking

What is Transparent Data Encryption?

Transparent Data Encryption (TDE) was introduced in SQL 2008 as a way of protecting “at rest” data. It continues to be available in all versions of SQL right up until the present, though only in the Enterprise editions of SQL Server (though as with all other Enterprise only features, you can also work with it using Developer edition).

When we talk about “at rest” data we are referring to data that has been written to disk. In terms of our SQL databases that includes:

  • Any data files for our database
  • Any log files for our database
  • All backup files for the database, be they Full, Log or Differential backups
  • Database snapshot files
  • Any data written to disk in the TempDB database

The last item in that list, TempDB, needs to be included for completeness. Imagine that you query your database and as part of the query execution TempDB is used. If that data was written to disk then that creates a hole in our protection, someone could potentially read or copy the TempDB files and might see some of the data we are trying to protect. As a result when you enable TDE against any database on your SQL Server instance, the TempDB database is automatically encrypted as well to prevent this from happening.

Data “at rest” of course doesn’t include the following things:

  • Data loaded/stored in memory (buffer pool)
  • Data returned from a query and being passed across the network
  • Data received by a client as a result of a query

If you want to cover those scenarios as well then you need to look at other forms of encryption e.g. TLS and Always Encrypted.

There are also some less obvious exceptions which occur where SQL doesn’t use the buffer pool – and therefore there isn’t an in-memory version of the data:

  • Filestream data
  • Data persisted to disk using Buffer Pool Extensions

And there are a couple of other exceptions that can occur in certain circumstances:

  • Where the buffer pool gets paged to disk due to memory pressure
  • SQL dump files when there is a crash

That’s summarised in the below diagram:

TDE Encrypted vs Unencrypted

TDE mainly uses standard encryption protocols based on AES (Advanced Encryption Standard). When you set up TDE you can specify which AES algorithm you wish to use, AES_128, AES_192 or AES_256. In each case the number specifies the length of the key to be used for encryption in bits.

Obviously the longer your key, the harder the encryption should be to crack, however even for AES_128, estimations of how long it would take to break down the key by brute force vary between a thousand years, to numbers many times greater than the age of the universe – trillions of years. The difference is based on how we anticipate processing power to grow in the future. Even with the lowest estimates AES_128 should be sufficient in most scenarios but most people seem to go for AES-256 which should take the same time squared to be beaten.

Up to 2016, SQL also supported the TRIPLE_DES_3KEY encryption protocol. This is now generally not considered to be as secure as AES, and from SQL 2016 its use is deprecated. So, it is best if you stick to AES even if you are on a SQL version where DES is an option.

Let’s have a look at contents of some SQL data files so you can see the difference with and without TDE. I’ve created a database with a single table and inserted a row of data:

CREATE DATABASE TestTDE;
USE TestTDE;
CREATE TABLE dbo.SomeData(Id INT IDENTITY(1,1), SomeText VARCHAR(255));
INSERT INTO dbo.SomeData (SomeText) VALUES('This is my data');

I’ll close my connection from the database, and detach it so I can open the files in a Hex Editor. Then I search for my text in the data file:

TDE_MDF_File_Unencrypted

As you can see the data is stored clear as day in the data file.

Now let’s look at the same data file once TDE has been enabled. This time if I search for my string it’s not found and my data looks like this:

TDE_MDF_File_Encrypted

Even where the previous file was all zeros where there was free space at the end, the encrypted version also has those encrypted:

TDE_MDF_File_Encrypted_End

TDE Works by using an encryption key that is stored in the database being encrypted – but that key is itself stored encrypted by an object outside of the database. We’ll see all the various objects involved when we look at setting up TDE next:

Setting up Transparent Data Encryption (TDE)

 

More articles on TDE:

Encrypting an existing database with TDE
Understanding Keys and Certificates with Transparent Data Encryption (TDE)
How Secure is Transparent Data Encryption (TDE) – and How to Prevent Hacking

How Secure is Transparent Data Encryption (TDE) – and How to Prevent Hacking

TDE is commonly described as “at-rest” encryption, i.e. it protects your data wherever it is stored on disk. This includes the database files, any backups taken (including Log and Differential), and any data that may get temporarily persisted to TempDB (when you use TDE to encrypt any database on an instance TempDB will get automatically encrypted also).

TDE does not however give any additional protection against those accessing data by querying the database. If you have access to the database then TDE transparently allows that to continue without any change in the functionality.
Understanding that, if we are implementing TDE we can only expect it to fully protect us from people who have access to the file system but do not have access to the data stored through any other avenue – for instance by querying SQL Server.

In most scenarios we probably don’t plan on there being such a person, i.e. we don’t grant the world read access to our database servers and say “go on, feel free, have a play”. In most production systems access is tightly tied down, and if you work in an environment where this is not the case then you probably have things you should address before you start looking at implementing encryption.

In reality though, mistakes or oversights happen, or sometimes malicious parties are able to gain access they should not have. What TDE offers us is, like many other forms of encryption, an additional layer or protection. The first line of defence gets breached, but there is still the next one to get past.

There is also the scenario of database backups, where these might be stored off the database server and even offsite. TDE gives us some assurance that in these scenarios the data is protected irrespective of the access restrictions in force in those other places. We’ve all read of breaches which were enabled by a database backup being stored on a location that was easier to access than it should have been.

Of course this protection of the backups could simply be achieved by using backup encryption – which is a separate feature of SQL Server and does not have the performance overhead associated with TDE.

So, if we are specifically implementing TDE rather than just backups it might be that we are specifically trying to protect ourselves from someone gaining direct access to the database files. That could be seen as quite an edge case – but that doesn’t mean it isn’t worth protecting against.

One thing to remember with TDE is that it is not (on its own) going to protect you against someone who has Admin right either over the box where SQL Server is installed, or over SQL itself. A local Admin can add themselves as an administrator on SQL – though they will need to restart the instance to do so. And an administrator on SQL can do pretty much whatever they like, including exporting a copy of the certificate and key used by TDE so that they can use that to read stolen data.

Still, there is some level of protection gained from TDE and if you combine it with tight access controls, and auditing, then it can help you reduce the scenarios where your data is at risk.

Hacking TDE

There is a great post and accompanying video out there by Simon McAuliffe that shows a method that can be used to break TDE:
https://simonmcauliffe.com/technology/tde/

Simon’s not a big fan of TDE and he makes a good argument, the reality is that a lot of the time people are using it to tick a box “yes, all our data in encrypted at rest” and no-one questions the particular scenarios where it is – and is not – offering protection.

While I agree with Simon’s points, I don’t feel as strongly as he does. You just need to be making an informed decision to implement TDE based on a full understanding of what it can and cannot protect you from. It’s not a magic bullet, but it’s not totally useless either.

Nonetheless the method Simon outlines for breaking TDE is concerning as it only requires basic level read access over the server to be able to grab all the objects you need to crack a TDE protected database. Sure, you need to have a certain knowledge base to be able to do that, but protection that is based on an attacker not having the right skillset, is not of the greatest value.

I won’t go into Simon’s method in full detail – you can check out his post for that. But I’ll give you enough of a summary so you can hopefully understand the principles involved.

What he details is an inevitable vulnerability that he specifies would not be unique to TDE – but common to any similar technology.

It’s back to the “T” of TDE again, “Transparent”. The concept of TDE is that it is fully self-contained and managed in the background for you by SQL Server. It doesn’t require you to supply additional passwords or keys when querying data, and it doesn’t need to access any objects stored outside of the SQL instance.
Let’s look again at the diagram with the encryption key hierarchy:

TDE_Key_Hierarchy

Starting at the bottom, each key is protected by the one above it. But what happens when we get to the top? There is nothing left to protect the uppermost DPAPI keys, so they must be stored unencrypted somewhere on the system.

You simply cannot get around that, SQL needs to be able to – all on its own – encrypt and decrypt data. Even if we say we’ll encrypt the DPAPI keys, what will we encrypt them with and where will we store that object? You can’t get away from the fact that at some point we need to store an unencrypted key, or password, or whatever, somewhere.

As a consequence, it’s remarkable easy to pick these root DPAPI keys up, though you need a few basic hacker skills/tools to be able to do anything with them.

Once you have the DPAPI keys, you can use those tools – in combination with a copy of the database files (or backup) from the Master database to obtain the SMK and thus break encryption all the way back down the hierarchy.

So what has just happened? A skilled and motivated hacker with mere read permissions to your server was able to steal your files and decrypt your data.

Just the scenario you implemented TDE to prevent.

They keys are held in the following directory on most Windows systems:
C:\Windows\System32\Microsoft\Protect\S-1-5-18

And by default, as long as you have read access to the file system you can read them.

Preventing this Hacking of TDE

Fortunately there is a basic action you can take to mitigate this. Strangely, this step isn’t mentioned (as far as I know) in any of the other documentation about setting up TDE.

The solution is that when you are setting up TDE you should take an additional step. Very simply, you need to secure this directory so that only Local Administrators to the server, the LOCAL SYSTEM account, and the SQL Server service account can read anything within it.

Once that’s done TDE should be offering you what you expect, i.e. casual read access to the server will not allow people to read your data. Only administrators would be able to gain access to the data, and as we’ve discussed already – Admins can access your TDE protected data anyway.

The only downside of doing this is that you are restricting access to the DPAPI (Data Protection API) from any other accounts. If you are running a pure SQL server then this should not be a problem – i.e. the server is for an installation of SQL and nothing else. If you also have application services running on the same box though, and these services wish to use encryption, then you may need to extend the permissions assigned to that directory. Though a better choice would be to move these services off the box.