Indexes and Statistics with Always Encrypted

In a previous post we looked at executing queries against columns encrypted using Always Encrypted. In this short post we look at the ability to be able to have (or not) indexes on those columns. This information is relevant to the case where you are using Always Encrypted without enclaves, we’ll look at working with enclaves later on.

Where we want to query with a predicate, it is often useful to have an index on the column being searched. You can only have indexes on columns using deterministic encryption – not randomized. This makes sense as a column with randomized encryption is going to have different values stored even when the underlying plaintext value is the same, so an index would be meaningless. You also can’t perform comparisons on such columns so an index wouldn’t be useful. An index on a column deterministically encrypted with Always Encrypted will simply be an index of the encrypted values and as such is only useful for equality comparisons, not for sorting or range queries – though neither of these functions are possible against our deterministically encrypted column anyway.

To create an index on a column using deterministic encryption is the same as if the column is not encrypted. Here is an example using the database and table created in the previous posts Setting Up Always Encrypted and Executing Queries Using Always Encrypted. In our table dbo.EncryptedTable we have the columns LastName (which is encrypted using deterministic encryption), and FirstName (which is encrypted using randomized encryption):

CREATE NONCLUSTERED INDEX IX_LastName
ON dbo.EncryptedTable(LastName);

While we can’t index on our columns using randomized encryption, we can include them in the leaf pages of an index so that their values do not have to be retrieved using a Key Lookup operation. So the following SQL would also work:

CREATE NONCLUSTERED INDEX IX_LastName_Include_FirstName
ON dbo.EncryptedTable(LastName) INCLUDE(FirstName);

In common with any other index, a statistics object is created that maintains information about the distribution of data in the index. We can view the data held in the statistics object with the following command:

DBCC SHOW_STATISTICS('dbo.EncryptedTable','IX_LastName');

Here we see the output:

We only have a couple of rows in our table, but if you’re familiar with statistics objects, you’ll see this is pretty much the same as we would see where there is no encryption. What is interesting is to look at the last result set. Normally this would show actual data values in the RANGE_HI_KEY column; here we can see that we have an encrypted value. From this, we can understand that the statistics are based on the encrypted values. This underlines the point already made, that our index is an index of the encrypted values of our data. That makes sense as SQL Server is never aware of the unencrypted values, so it wouldn’t be possible for it to create or maintain indexes (or statistics) of the plaintext values.

This is different when using always encrypted with enclaves where indexes on columns with randomized encryption are allowed. In that case, those indexes, and associated statistics objects, are actually based on the plaintext values. As mentioned, we’ll talk about how that works in a later post.

This post is part of a comprehensive series on SQL Server Encryption. Subscribe to my blog for updates as new posts are published or you can buy my book through the link in the sidebar to get it all in one go.

And if you’re embarking on an encryption project (or anything else to do with SQL Server) and want some help, I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

One thought on “Indexes and Statistics with Always Encrypted

Leave a Reply