As a quick reminder, when you have a 3 x 3 grid, a magic square is one where each row, column or diagonal add up to the same value. e.g.
All the rows, all the columns and both diagonals each add up to 15.
In the previous puzzle I asked you to write SQL to find all the possible 3 x 3 magic squares. After originally coming across that idea I discovered all sorts of variants on the magic square theme.
This time, I’ll ask you to find write SQL to find just one 3 x 3 magic square, but in this case one composed only of prime numbers? They don’t have to be consecutive primes (though they do all have to be different), and to make things easier you can treat the number 1 as a prime if you want.
I’ve been taking a bit of a deep dive into understanding Transparent Data Encryption (TDE). As part of that I’ve been reading a lot of blog posts, stack overflow answers and technical documentation to try and deepen my knowledge.
Within that I’ve found a lot of contradiction I’ve needed to overcome. In particular this has been around what objects you need to recover an encrypted database to another SQL Server – be that when you’re doing a straight restore, working with log shipping, or using Availability Groups.
Most of the solutions offered work, but many describe additional steps that are not necessary. I feel that part of the problem is that people are misunderstanding the basics about encryption keys in SQL Server, so I thought it would be worthwhile going over that in a bit of detail before digging deeper into TDE in general.
I think it’s important to understand this stuff clearly, because then you have a clear view of when you’re protected and when you are vulnerable. If we’re engaging in encryption then we clearly have a desire for security – to be sure of that we have to be clear in our understanding.
The Key itself – Usually can be thought of a number expressed in binary format. Long and random enough to make it difficult to guess even by brute force attempts.
Another object that’s used to protect the key – This object might be another key, it might be a certificate, or it might just be a password. This object is used the encrypt the key.
The encrypted value of the key – Formed from the original value of the key, encrypted by the protecting object.
In SQL we rarely (maybe even never) see the actual value of the key. We have the encrypted version and we usually know what object was used to encrypt it. That second object may even be another key that is itself encrypted by a third object.
When it comes down to it though, the actual thing that is used to encrypt or decrypt data is the Key itself, not the encrypted value, and not the hierarchy of objects that may have been used to protect it.
So, all I really need to read your data, is your key.
Let’s look at that in the context of TDE. Here’s a standard diagram from books online that shows the hierarchy of encryption for TDE. There are other ways of working with TDE but this is the standard:
So, right down at the bottom of the diagram is the Database Encryption Key (DEK). That is what is used to encrypt/decrypt the data in the database. It sits in the database itself. So when you backup the database the DEK is held in the backup.
BUT – the DEK is itself encrypted by the certificate that sits in the master database, so even if someone has your backup they can’t access the key – and nor can any system they try and restore your backup to. So they also can’t access your data.
To be able to decrypt the DEK, the certificate is required. In fact, what is really required is the private key associated with the certificate as that’s what’s used for encrypting stuff. The private key however is itself encrypted by the Database Master Key (DMK) that sits in the master database.
So you’ll be forgiven at this stage for thinking that in order to read our backup of the database, we need the backup (containing the DEK), the Certificate (include the private key) and the DMK.
When you realise that the DMK is itself encrypted by the Service Master Key(SMK), and that the SMK is also encrypted then you might think you need to include those too – and whatever encrypted the SMK.
Where will it end!?
In reality we just go back to our certificate and its associated private key.
Let’s say I have a certificate called MyTDECert. I can (and must) backup this up outside of the database. If this is lost, then so is my data. The command for backing it up looks something like this:
BACKUP CERTIFICATE MyTDECert
TO FILE = 'D:\Temp\MattTest\MyTDECert'
WITH PRIVATE KEY
FILE = 'D:\Temp\MattTest\SQLPrivateKeyFile',
ENCRYPTION BY PASSWORD = 'C0rrecth0rserbatterystab1e'
When I back this up I specify a password to encrypt the Private Key. Remember that the Private Key was already encrypted by the DMK? Well, this backup certificate command doesn’t just encrypt it a second time – what would be the point of that? No, the reason I need to supply a password is that the command retrieves the unencrypted version of the private key, re-encrypts it with the password INSTEAD and then that is what gets save to disk.
Remember that at the point I run the command SQL has access to all the objects, all the way up the chain, that are used for the encryption. So it has no problem getting the actual value for the Private Key.
Now, when I restore that certificate to – let’s say another instance of SQL Server)- the command looks like this:
CREATE CERTIFICATE TestSQLServerCert
FROM FILE = 'D:\Temp\MattTest\MyTDECert'
WITH PRIVATE KEY
FILE = 'D:\Temp\MattTest\SQLPrivateKeyFile',
DECRYPTION BY PASSWORD = 'C0rrecth0rserbatterystab1e'
Considering what the BACKUP command did, you can intuit that the above command will do the opposite. It first of all decrypts the Private Key using the password supplied. Then it encrypts it again using the DMK for the local instance before saving it in the master database locally.
Remember it is the Actual Values of Keys that get used for encryption – not their encrypted value. So the private key for this certificate is exactly the same as it was where we backed it up from – though the encrypted value will be different as it has used a different DMK.
So if I now go to restore a copy of a database whose DEK has been encrypted with this certificate onto the new instance I can do so without any problem.
Many articles will tell you that you also need to migrate the DMK from your old instance, and some will say that you also need the SMK. This is simply not correct – I mean, it will work if you do that (in the right order) but it isn’t necessary.
And in some scenarios it may simply not be possible. Imagine that the instance you are migrating to is already using encryption based on its current SMK and DMK, if you replace those with the ones from the instance you are migrating from then you are going to break existing stuff.
So all you need to restore a database protected by TDE is the database backup, the certificate/private key, and the password specified when the certificate and key were backup up.
Equally that’s all someone else needs too, so make sure those things are protected.
In my next post we’ll look at a possible way of hacking TDE and the additional steps you need to protect yourself to make sure TDE is giving you the level of protection you expect. When that post is live you should be able to see a link to it in the comments.
More and more people are considering some level of encryption against their data stored in SQL Server. In many cases it might be considered that other measures such as firewalls, well defined access permissions and application code free of security flaws, already offer sufficient protection.
Anyone who follows the tech news though will know that attacks and data breaches are common. Encryption often provides the second line of defence, i.e. even if someone malicious gains access to your data, they may not be able to read what they find there.
Over the last 10 years, the number of features available in SQL Server for working with encryption has grown. It can be confusing to understand what the differences are, which you should implement and what exactly each feature protects you from. In most cases these are complementary technologies and if you are getting serious about encryption you may choose to implement more than one. In the latest versions of the SQL Upgrade Advisor you will be recommended by default to consider Transparent Data Encryption and Always Encrypted, and it is (or should be) standard practice to be using TLS.
Often it seems to be the case that people implement some form of encryption to “tick a box”. If you are really serious about protecting your data then you really need to understand what each feature does – and does not – protect you against.
One thing that may influence your decision on what to use is what is available in the versions and editions of SQL Server that you have in production. Here’s a quick comparison, I’ve missed out versions where no new features were added and for 2016 I’ve specified SP1 as a lot of features changed from Enterprise only to being available in Standard – so there’s no good reason for not being on SP1 if you’re using 2016:
We’ll quickly run through the various technologies listed above with brief detail on each.
Column Encryption (2005)
Uses the Encryption Hierarchy and built-in Cryptographic Functions to allow you to encrypt values and store them in the database. Ability to read the encrypted data is based on permission or access to the objects used to perform the encryption e.g. a pass phrase, a certificate or an encryption key.
It can be easy to misunderstand what this feature is, it is not that you configure a column as being encrypted, but rather that the Cryptographic Functions combined with the Encryption Hierarchy in SQL Server allow you to encrypt specific values, which can then be stored in the database.
TDE (2008) Enterprise Only
TDE (Transparent Data Encryption) is configured at the database level and is defined as “at-rest” encryption. This means that the data and log files for your database (as well as backups) are encrypted on disk. One of the key advantages to TDE is that it doesn’t require any code change to implement.
TDE doesn’t protect you against anyone who is able to read data directly in the database, and an administrator on the SQL Server instance or the server hosting it can fully circumvent it. It simply protects you against parties who may gain access to the files.
Without the certificate and key used for TDE they would not be able to restore a backup to another server, and they would not be able to directly read data in the database files – which otherwise are in a relatively readable format for someone sufficiently motivated.
One thing to note with TDE is that it is only available in the Enterprise Editions of SQL Server.
Those of you who follow the major SQL Blogs may have come across this article which points out a seemingly big flaw in TDE that allows someone with minimal privileges against your server (read is sufficient) and the right skillset, to be able to extract your keys and read the data: https://simonmcauliffe.com/technology/tde/
It seems that you can mitigate this by making sure access is restricted to the directories holding the relevant keys:
If only administrators (and the SQL Server service account) can access this directory then you should be safe.
TDE does add extra processing overhead to SQL. 3-10% is often quoted but that will vary with your workload so if this is a concern you should test – some people consider that the overhead is not worth the benefits.
Backup Encryption (2014)
Backup Encryption is what it sounds like, just your database backups are encrypted. All types of backup are supported – Full, Differential and Log.
This isn’t a setting you turn on for your database – you have to specify when you make a backup that you want it protected. As such this doesn’t protect you against someone making an ad-hoc backup and storing it on a USB device which they then leave on a train – or any other similar human error.
One nice thing is that you don’t have to do anything special when restoring the database back to its original home. Backup encryption (generally) uses a certificate or key stored in the master database so as long as you haven’t lost that, the encryption part of a restore process is seamless. If you need to restore elsewhere then you’ll need a copy of the certificate or key.
Backup encryption will suffer from the same limitations as TDE (except that it will only add CPU overhead when taking backups), however if your intent is to make sure backups stored off the database server are protected then this should be sufficient.
Always Encrypted (2016)
Always Encrypted (AE) is implemented at the column level. Data is stored encrypted on the disk, in memory and when being passed to a client application. Encryption is based on the combination of a key and certificate, where the certificate is usually stored on a client machine and not stored on the database server. This means that access (even full sysadmin access) to the database server on its own is not sufficient to be able to read the data.
Implemented correctly AE can be very secure, however it also has a number of logical limitations that mean it can be challenging to interact with the encrypted data. For instance you can’t use LIKE comparisons against encrypted columns.
How big an impact this is for you depends on how your application is coded and the sort of columns you want to encrypt. It could be straightforward and require no code change – or it could entail major rework.
Dynamic Data Making is not actually encryption but is another way that people may choose to protect data from prying eyes.
This provides a method of hiding data from non-privileged users without having to change any code.
Image storing a credit card number. We might want to remember a customer’s card details so that it is easy for them to make payments in the future. Equally though we don’t want people to be able to easily view it. Often we’ll see that just the last 4 digits are displayed so that we have enough to verify it is the correct card. Dynamic Data Masking provides a way of doing this. We can define a mask, so that for most users who have permission to view the data the number:
Would only appear to them as:
We can then also define specific roles that can access the full number, for instance one assigned to a service account used for processing payments.
That sounds great, but in reality the protection is not that great. If I have read access to the database, but am in a non-privileged role then it would still be possible for me to get around the masking with a bit of querying. This is due to the requirement that existing queries should continue working, so the SQL engine can see the underlying data even if I can’t and will respond to me attempting comparisons against the data.
Transport Level Security (TLS)
TLS is a protocol used to protect data as it travels across a network. It is fairly equivalent to SSL (Secure Sockets Layer) which is used to protect data between a web server and a browser. Without TLS, data from SQL is sent unprotected across a network and can be intercepted by what is known as a Man-in-the-Middle attack.
Such attacks function by impersonating the parties involved. Imagine if we have the SQL Server instance and a client application. A communication starts from the client to the server, the attacker impersonates the server and intercepts the data or request sent from the client, it then can alter the message before passing it on to the server – which believes it is receiving it directly from the client. Then it will intercept any data being send back, either just to capture it, or to alter it, before transmitting that back to the client which believes it is in direct and private communication with the server.
TLS creates an encrypted connection and encrypts the data sent across the network and so the information passed back and forth cannot be read by any party other than the sender or receiver.
TLS also allows both the sender and receiver to verify each other so the Man-in-the-Middle attack is doubly foiled.
Hashing and Salting
Hashing is a method of taking a value (usually a string such as a password) and transforming it via a Hashing function to a new value that in theory cannot be reverse engineered to find the original value.
When someone enters a password into the system that value can be put through the same hashing function and the resulting hashed value be checked against the value stored in the database to verify that they match.
In practice is would be possible for someone to maintain a list of what the hashed values are for a particular hashing function and thus be able to recognise where common values have been used.
This where salting comes in, a second value known as the salt is generated and combined with the value to be hashed. Then the salt is stored unencrypted alongside the resulting hashed value. When we then want to verify (for instance) a password, we simply combine the entered value with the salt before hashing and comparing. If the salt is unique for each value we hash there’s no practical way of maintaining a dictionary of hashed values.
SQL has a series of hashing functions but HASHBYTES is the one most commonly used for this process.
That’s the set of tools available. The below table shows a quick comparison of what each protects you from and any disadvantages:
No solution you pick is going to be secure and safe just by implementing one tool or another. Of key importance is process. Encryption generally involves keys of one form or another and these must be protected. Your house may be secure if you have good locks, but not if you leave the keys outside on the doorstep.
Equally you need to think about what you are attempting to achieve by implementing a particular tool and make sure that the way your organisational roles and processes are structured supports this. For instance Always Encrypted is often touted as even being able to protect against a rogue DBA. The DBA may have full access and be able to do whatever they like to the SQL instance, but for AE to read protected data you also need a certificate that is stored on application servers. If you want this level of protection then you need to make sure that you have sufficient role separation such that the DBA can’t access the certificate, and those who are admins on the application server can’t access the database. Or at least not without triggering auditing.
Finally, remember that any system is only as strong as its weakest link. There’s no point encrypting your data if the application accessing it is so full of holes it will leak it all out anyway, and it’s often said that the weakest part of any system in the human. The point of having data is that we want to be able to access it, so there must always be a point where the data is viewed/or interacted with in an unencrypted form.
At the end of the day the only way of absolutely ensuring there is no way for someone to access data they shouldn’t – is to have no data in the first place.
But in practice, you can at least try to make it hard for them.