Parameter Sensitive Plan Optimization in SQL Server 2022. What is it and is it any good?

June 8, 2023June 15, 2023 ~ Matthew McGiffen ~ Leave a comment

What is it?

Parameter Sensitive Plan (PSP) optimization is a new feature in SQL Server 2022 that aims to improve the performance of parameterized queries. It is part of the Intelligent Query Processing family of features and addresses the scenario where a single cached plan for a parameterized query isn’t optimal for all possible incoming parameter values. This is the case with non-uniform data distributions.

PSP optimization automatically enables multiple, active cached plans for a single parameterized statement. Cached execution plans will accommodate different data sizes based on the actual provided runtime parameter value(s).

During the initial compilation, column statistics histograms identify non-uniform distributions and evaluate the most at-risk parameterized predicates, up to three out of all available predicates. In other words, if multiple predicates within the same query meet the criteria, PSP optimization chooses the top three.

For eligible plans, the initial compilation produces a dispatcher plan that contains the PSP optimization logic called a dispatcher expression. A dispatcher plan maps to query variants based on the cardinality range boundary values predicates.

Here’s an example to illustrate how PSP optimization works:

Let’s say we have a table Orders with columns OrderID, CustomerID, OrderDate, and TotalAmount. The table has a non-uniform distribution of data, with some customers having many orders while others have only a few.

We have a stored procedure that retrieves all orders for a given customer:

CREATE PROCEDURE dbo.GetCustomerOrders (@CustomerID int)
AS
BEGIN
   SELECT * FROM dbo.Orders WHERE CustomerID = @CustomerID;
END

Without PSP optimization, SQL Server would generate a single execution plan for this query, based on the parameters supplied the first time it is called, and cache it for reuse. However, this plan may not be optimal for all possible values of @CustomerID.

With PSP optimization enabled, SQL Server will generate multiple execution plans for this query – one for low cardinality ranges, one for medium cardinality ranges, and one for high cardinality ranges. At runtime, SQL Server will choose the optimal execution plan based on the value of @CustomerID.

For example, if @CustomerID corresponds to a customer with many orders (high cardinality), SQL Server will choose the high cardinality range execution plan. If @CustomerID corresponds to a customer with few orders (low cardinality), SQL Server will choose the low cardinality range execution plan.

This allows SQL Server to determine the optimal execution plan based on the parameter value(s), improving query performance.

PSP aims to reduce parameter sniffing, a situation where SQL Server generates an execution plan based on the first parameter value it encounters and then reuses that plan for all subsequent parameter values. This can lead to suboptimal performance for some parameter values.

In SQL Server 2022, PSP optimization is automatically enabled by default. However, it can be disabled at the database or query level using the following T-SQL statement:

ALTER DATABASE [database_name] SET PARAMETER_SENSITIVE_PLAN_OPTIMIZATION = OFF;

Is it any good?

The biggest problem in terms of fixing performance is that cardinality estimation is based on statistics objects which are effectively a histogram (with up to 200 steps) of data distribution across a certain column. In the example above, if we have statistics on CustomerID then we may have accurate figures if we have less than 200 customers, but if we have a lot more then they will be grouped in ranges and we may have a lot of data skew within a range, e.g. CustomerID 1 may have 10 orders where CustomerID 2 has 10,000. That means PSP optimization may still not always choose the correct plan.

Also, there are lots of parameter sniffing scenarios which can depend on a lot more than 3 parameters – such as “kitchen-sink” search procedures where you can search data based on one or more of many different parameters.

That said, there will be many scenarios where it does help – it’s just not a magic bullet to fix parameter sniffing.

Brent Ozar talks about another issue in his post How SQL Server 2022 Tries to Fix Parameter Sniffing. That is the issue around monitoring and performance troubleshooting. When you try to view the execution plan for a stored procedure where the optimizer has decided to use PSP optimization, you only see the dispatcher plan, not the actual plans that will be used in practice. Also, should your monitoring or performance troubleshooting tools flag up a query that is causing problems, where that query is using PSP optimization, it is difficult to link that query back to the stored procedure it belongs to. Not impossible, but it is a pain.

I attended a talk by Conor Cunningham, Principal Software Architect at Microsoft on SQL Server, a few years back at SQL Bits, where he was presenting the original set of features collectively known as Intelligent Query Processing. He pointed out that a lot of people running SQL Server, especially on Azure with Azure SQL Database and Azure SQL Managed Instance, don’t have SQL Server performance tuning expertise in house and these sorts of features were targeted at these people most of all. Features that will not solve all performance issues of a given type, but will certainly help in a number of circumstances.

From that point of view PSP optimization is a good addition to the set of features, even if it doesn’t do as good a job as someone with real expertise might, and even if it does come with it’s own problems.

Got a problem or embarking on a SQL Server project and want some help and advice? I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

What is Always Encrypted and how does it work?

June 5, 2023June 5, 2023 ~ Matthew McGiffen ~ 1 Comment

Always Encrypted was a new encryption feature added to SQL Server with the 2016 version of the product. Initially it was just available in enterprise edition, but from SQL Server 2016, SP1 was made available in standard edition also.

Unlike TDE which encrypts the whole database, Always Encrypted is a form of column encryption that means you choose which columns of data you want to encrypt. The “Always” part of Always Encrypted refers to the fact that data is encrypted at rest, in memory, and as it is transmitted across the network. That means that it provides the highest level of protection possible for your data.

The beauty of Always Encrypted, and what makes it a great feature, is that you don’t necessarily need to make any code changes to use it. Neither your application side code nor your database side code – stored procedures/functions – needs to know which columns are encrypted or not. Encryption and decryption of data is carried out automatically for you. It does however come with limitations on how you can interact with data.

A key difference between Always Encrypted and existing forms of column encryption available in SQL Server is the encryption hierarchy used. Previous methods would rely on the encryption hierarchy within SQL Server. That would mean that a user with elevated permissions on the database server would be able to read encrypted data. Always Encrypted relies on a key stored in the database server as well as a certificate and associated asymmetric key pair that are only stored on the application server. That means you need elevated access on both servers to be able to read encrypted data.

A key concept with Always Encrypted is the idea of role separation. That refers to the idea that you should ensure there is a separation of roles between your system administrators. If different individuals (or teams) manage your application and database servers, and no individuals have admin rights on both, then no one can access encrypted data outside of the context it should be accessed. Even without role separation, the access requirements to read encrypted data make it much harder for an external attacker.

SQL Server 2016 vs. SQL Server 2019 and Beyond

In SQL Server 2019 new functionality was added to Always Encrypted that addressed many of the limitations in the previous version. To access the extra functionality however requires additional setup and management. I find it’s worth thinking about these as almost two separate versions of the same feature, and you may choose which to use depending on your requirements and circumstances. We can think of these as “Basic Always Encrypted” which is mainly what we got with SQL 2016 and “Always Encrypted with Enclaves” which was the main addition with SQL 2019.

The difference arises from the reason why we have limitations with Always Encrypted at all. A key thing to understand is the “Always” part. SQL Server never sees the unencrypted value of your data; encryption and decryption is actually carried out on the client side – on the application servers. As SQL Server is unaware of the unencrypted values of data, it is limited in the questions it can answer about your data.

With SQL 2019 the use of enclaves was added. An enclave is a secure partition within memory where protected operations can take place securely. Always Encrypted with Enclaves allows for decryption and encryption of data within a secure enclave on the SQL Server box which can therefore support a wider range of functionality.

In this post and the following ones, we’re going to first focus on basic Always Encrypted and then move onto the version with enclaves later on.

How Does Always Encrypted Work?

As mentioned, we’re going to focus on the basic version to begin with. The concepts around Always Encrypted are reasonably straightforward, but you’ll find yourself asking questions that seem tricky. Will Always Encrypted work if I try to do this or that? Why are you getting a particular error? I’ve blogged before about Always Encrypted and some of the specific questions readers ask me aren’t covered in the documentation; having a knowledge of how it all works generally enables me to answer those questions. If you understand the mechanics of encrypting and decrypting data, then any limitations make sense. As such, I’m going to spend quite a bit of time in this post explaining how it all works.

Encryption Hierarchy

SQL Server has a standard encryption hierarchy that we’ve seen when we looked at TDE and Backup Encryption. Always Encrypted doesn’t use that, mainly because the encryption activities don’t occur on the database server, as they are done on the client machine.

On the SQL Server side we have a Column Encryption Key (CEK) that is used to encrypt and decrypt data. That is stored encrypted in the database. The CEK is encrypted by the Column Master Key (CMK) which is actually a certificate and key pair that commonly sit on the client machine – usually an application server.

On SQL Server we do have a CMK object, but this is just a pointer to the location of the actual CMK.

You can have one CEK for all the encrypted columns in your database, or you could have one for each separate column – or something in between. The CMK can be shared by multiple CEKs so again you could have one or multiple CMKs.

Encryption in Practice

Let’s go through the process of how Always Encrypted works. The below image shows all the steps that occur when you issue a query.

At first glance it seems like a lot is going on, but we’ll go through each step in turn and you’ll see it is quite simple.

Issue Query
The application will have created a connection to the database server. It has to specify that the connection will use Always Encrypted. Potentially, that is the only change you need to make to your application – if you’re lucky. It depends whether you end up needing to do work to get around the natural limitations of the technology – but we’ll get to that later. Once the connection is in place, then the application simply issues the query in the normal manner.
Request Encryption Metadata
When issuing a query, any plaintext values that target encrypted columns must be passed as parameters and must be encrypted before being sent to the database. Where such a parameterized query is being executed, the client driver requests the relevant encryption metadata from SQL Server to understand what encryption activities must be carried out before the query is sent to be executed.
Return Encryption Metadata
SQL Server parses the query text and identifies if there are any columns targeted by parameters that are the subject of column encryption using Always Encrypted. This information is sent back to the client, along with the encrypted values of any Column Encryption Keys (CEKs) used and in each case the location of the Column Master Key (CMK). This encryption metadata is cached locally on the client machine so that repeated calls to get the metadata do not need to be made for the same query.
Request CMK
Using the details provided in the encryption metadata, the client driver makes calls to access the certificates and keys for the CMKs it requires. Usually these are stored in the certificate store on the application server. Where there are multiple application servers, then they need to be stored on each. You can also store your CMKs in an external store such as Azure Key Vault, we’ll cover that in a later post.
Return CMK
The CMKs are received by the client driver.
Encrypt Parameters
Any parameters that target encrypted columns need to be encrypted. The CEK that was returned by SQL Server for a given column is decrypted using the associated CMK; the parameter value can then be encrypted using the underlying key value. Only parameters will be encrypted, so if you have literal values in your query, then these will be a problem if they target encrypted columns. Update and insert queries must specify the values using parameters for them to function when targeting tables with Always Encrypted columns.
Issue Query
The query with its parameters encrypted by the last step is sent to SQL Server to be executed.
Return Encrypted Results
Where the query has results, such as with a select query, the results are returned. Where these include encrypted columns, then it is the encrypted values that get passed back. Encryption metadata is sent back alongside the result set which supplies the encrypted CEKs and CMK locations for any encrypted columns in the results.
Decrypt Data
Where we have results containing encrypted columns, these are decrypted by the client driver before being returned to the application. This is done using the encryption metadata supplied alongside the results. As with the parameters, the CEK is decrypted using the associated CMK – sometimes this has already retrieved from the certificate store, and if not then there is an additional step to retrieve the CMKs. The unencrypted value of the CEK can then be used to decrypt the data in the column or columns that it is related to.
Return Results
Finally, the results are returned to the application with values that were stored encrypted returned as the plain text version.

Summary

That can feel like a lot of steps to get to your data, but it all happens in the background and generally with little overhead. Each step is itself quite simple. It’s good to understand the overall flow so you can understand what will happen in different querying scenarios.

We’ll go over some of this again in a bit more detail when we have Always Encrypted set up and start looking at how you execute queries. Having seen the process though, we can now start to understand some of the key things about Always Encrypted:

Data only exists unencrypted after it hits the client.
SQL Server does not encrypt or decrypt data. That is all handled in the client driver. As such there is no way for SQL to know what the unencrypted values of your data are – it just sees an encrypted string.

This post is part of a comprehensive series on SQL Server Encryption. Subscribe to my blog for updates as new posts are published or you can buy my book through the link in the sidebar to get it all in one go.

And if you’re embarking on an encryption project (or anything else to do with SQL Server) and want some help, I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

What is TDE (Transparent Data Encryption)?

December 19, 2022May 10, 2023 ~ Matthew McGiffen ~ 2 Comments

TDE is referred to as a “transparent” form of encryption. What that means is that the process of encrypting and decrypting data is fully performed in the background. The queries we write to access data are unchanged whether TDE is enabled or not. So, enabling TDE has no impact on application functionality, does not require refactoring of code, and is therefore relatively easy to implement. TDE encrypts all the data in a database, so you don’t need to choose which data items to encrypt.

TDE allows us to encrypt “at rest” data. When we talk about “at rest” data we are referring to data that has been written to disk. TDE works at the IO level, encrypting data automatically as it is written to disk, and decrypting it as it is read from disk.

In terms of our SQL databases the assets that are protected include:

Any data files for our database.
Any log files for our database.
All backup files for the database, Full, Log or Differential backups.
Database snapshot files.
Also the TempDB database data and log files are encrypted.

The last item in that list, TempDB, needs to be encrypted for completeness of protection. Imagine that you query your database and as part of the query execution TempDB is used. If that data were written to disk, then that creates a hole in our protection, someone could potentially read or copy the TempDB files and might see some of the data we are trying to protect. As a result, when you enable TDE against any database on your SQL Server instance, the TempDB database is automatically encrypted as well to prevent this from happening.

It’s reasonably obvious to state that data “at rest” doesn’t include the following things:

Data loaded/stored in memory (buffer pool).
Data returned from a query and being passed across the network.
Data received by a client as a result of a query.

If you want to cover those scenarios as well then you need to look at other forms of encryption such as TLS and Always Encrypted.

There are also some less obvious exceptions:

Filestream data.
Data persisted to disk using Buffer Pool Extensions.

And there are a couple of other exceptions that can occur in particular circumstances:

Where the buffer pool gets paged to disk due to memory pressure.
SQL dump files when there is a crash.

What does and doesn’t get encrypted by TDE is summarized in the below diagram:

Let’s have a look at the contents of some SQL data files so you can see the difference with and without TDE. I’ve created a database with a single table and inserted a row of data with the following code:

CREATE DATABASE TestTDE; GO USE TestTDE; GO CREATE TABLE dbo.SomeData (Id INT IDENTITY(1,1), SomeText VARCHAR(255)); GO INSERT INTO dbo.SomeData (SomeText) VALUES('This is my data'); GO

I’ll close my connection from the database, and detach it so I can open the files in a Hex Editor. You can detach the database with the following SQL:

USE master; GO EXEC master.dbo.sp_detach_db @dbname = N'TestTDE';

Then I open the file in my Hex Editor and search for the text “This is my data” in the data file:

As you can see the data is stored as clear as day in the data file.

Now let’s look at the same data file once TDE has been enabled (we will look at enabling TDE in later posts). This time if I search for the same text it’s not found, and my data looks like that shown below.

It’s interesting to also look at the end of the database file where there is free space. In the unencrypted version that free space would have simply been represented by zeros. In the encrypted version that free space too has been encrypted, so an attacker cannot even see where your data ends (Figure 2-4).

TDE works by using an encryption key that is stored in the database being encrypted – but that key is itself stored encrypted by an object outside of the database. We’ll look at the various objects involved in the next post.

Recommended Approach to Encryption in SQL Server

December 14, 2022May 10, 2023 ~ Matthew McGiffen ~ 2 Comments

We looked earlier at the various places our data lives – on disk, in memory, in-transit across the network. A good strategy will protect all of these locations – sometimes with multiple layers. Exactly how you go about doing that for your applications may vary – but as long as you are on SQL Server 2016 or higher there is a default strategy that you should consider. This combines a number of the available SQL Server features to provide the best protection.

TDE. For at-rest protection of all your data.
Always Encrypted. To encrypt all (or most) columns that contain personal identifiable, or sensitive, information.
TLS. To make sure network communications between the application and server are encrypted.
Hashing and Salting of passwords. To make sure passwords are secure – and we never need to store the actual password in the database.

On top of that you should definitely consider using EKM. In previously times, when we had to have dedicated hardware to support it, there was a fair barrier to entry. Now with easily usable and cost-effective cloud solutions it’s easy to get started with EKM, and certainly has many advantages.

Encryption is easiest to build in “by-design” when developing new applications, but more often than not we are implementing or enhancing encryption against our existing applications. It’s great if you are in that situation and have the scope to implement a full encryption strategy as outlined above. Such projects often have time and budget constraints though, or need to be delivered incrementally. If that’s where you are at, then you may want to focus first on where you can achieve the most with the least effort. You need to assess whether implementing any of the features above is going to have a performance impact that worries you, once you’ve dealt with that you might want to look at things in this order:

Hashing and Salting of passwords. I put this item first because you should never be storing passwords in plaintext in a database. If you are encrypting passwords before storing them then that’s better than plaintext but still you should be looking at a hashing method.
TLS. You really should have this on for all your connections between your applications and SQL Server that could contain data or other information you care about protecting. TLS is very easy to set up and can be done very quickly.
TDE. TDE is again very easy to setup, and it comes for free with the standard edition of SQL Server 2016 SP1 onward. It used to require you to be on enterprise edition so that was a good reason why many people didn’t use it. TDE is only going to protect you against a limited number of scenarios, but you still get that for not much more than the flick of a switch.
Always Encrypted. Column encryption with Always Encrypted is a little harder to understand and there are limitations on the ways you can work with encrypted data. It is however the best tool for protecting your personal and sensitive data. If your project has limited resource (what project doesn’t) focus on encrypting first the most sensitive items, and those where you won’t need to make code changes to work around the fact that data is encrypted. Code changes are likely to be required where you need to search against, or perform calculations on, data that you wish to encrypt.

Overview of SQL Server Encryption Features

December 12, 2022May 10, 2023 ~ Matthew McGiffen ~ 1 Comment

SQL Server has had the native ability to encrypt data since SQL Server 2005. This included functionality that could be used to encrypt individual items and columns of data as well as the Transparent Data Encryption (TDE) feature which was available with the enterprise edition of SQL Server and could be used to encrypt all data where it is stored on disk. We then didn’t see significant additions to the features available for encryption until 2016 when Microsoft added Always Encrypted for column encryption. In SQL Server 2019 Microsoft made TDE available in standard edition, and also in SQL Server 2019 the ability to use enclaves was added to Always Encrypted to improve the available functionality for interacting with encrypted data. Finally in SQL Server 2022 further enhancements were made to the set of functionality available when working with Always Encrypted with enclaves.

We can see Always Encrypted (which was introduced in 2016) as the successor to previous methods of column encryption, and in later posts we will go into a high level of detail on Always Encrypted, although we’ll also cover the older methods in brief.

The table below shows the list of encryption features in SQL Server and what each tool is intended to protect.

Encryption features available in SQL Server

Feature	What is Protected
Transparent Data Encryption (TDE)	Data saved to disk. This includes data files, transaction log files, backup files and database snapshots.
Backup Encryption	Backup files.
Always Encrypted	Data stored in columns. With Always Encrypted the data is protected on disk, in memory and in-transit across the network.
Transport Layer Security (TLS)	Network traffic. TLS protects data in-transit across the network as well as commands executed against the database server.
Hashing and Salting	This is not strictly encryption, but we generally use it to protect passwords.
Encryption Functions	Data stored in columns. Here we are referring to the encryption functions introduced in SQL 2005 that pre-date Always Encrypted.
Extensible Key Management (EKM)	This provides extra protection and ease of management for encryption keys by enabling them to be stored with an external provider.

Let’s take a brief look at each of these features in turn.

TDE

TDE protects our data stored on disk, what we often refer to as “at-rest” data. It offers good protection against the scenario where the file system is accessed, and an attacker might attempt to retrieve data directly from the database files themselves – or copy the backup files so they can be restored to another SQL Server to access the data. It doesn’t protect us at all though where an attacker may have access to query the database directly. The “transparent” part of the name refers to the fact that TDE works transparently in the background with no impact on our queries or other application functionality. TDE protects all of the data in a database, unlike methods of column encryption which usually target specific types of information to encrypt.

Backup Encryption

Backup encryption just encrypts our backup files. This includes full backups, differential backups and log backups. This is particularly useful where we might store backups, possibly on tape, off site and want to make sure they are inaccessible if stolen. TDE also does this for us, so we only consider using backup encryption where we can’t use TDE for some reason.

Always Encrypted

Always Encrypted is a form of column encryption. It works hand in hand with the client driver that your application uses to connect to and query the database to ensure that data remains encrypted all the way to the point it reaches your application. That’s what the “always” part of the name refers to. Data is protected at-rest, in-memory and in-transit across the network. Encryption and decryption actually take place within the client driver rather than within SQL Server.

On this blog we will look in depth at two flavors of Always Encrypted. We have the basic version that was introduced in SQL Server 2016, and Always Encrypted with Secure Enclaves that was added in SQL Server 2019. What’s nice about Always Encrypted is that encryption and decryption are carried out automatically for you by the client driver so in many cases you may not even have to make code changes. There are limitations on how you can interact with encrypted data though. The version with enclaves removes some of those restrictions by allowing certain activities to place in a secure portion of memory (called an enclave) on the database server. The use of enclaves does however come with an extra overhead in setup and management.

TLS

TLS is used to encrypt network traffic. That means that data and queries sent between the application and database server are all encrypted. This is similar to SSL, which most people are familiar with for encrypting internet traffic (SSL in most cases actually uses the TLS protocol).

Hashing and Salting

Hashing and salting isn’t actually encryption because it is a one-way process. Hashing is where we run a value through a function that produces a seemingly random output. That output will always be the same for the same input value, but cannot be reverse-engineered to find the original value. Salting is a method to provide extra security for hashed values. Hashing and salting is considered the best practice for storing passwords as it means we don’t even need store actual passwords – so there should be no way for an attacker to access them.

Encryption Functions

Here we refer to the set of encryption functions that SQL Server implements to allow you to encrypt your own data. I see Always Encrypted as the successor to these functions and would recommend you use that where possible. Encryption using the functions is a bit more limiting, a bit less secure and a bit harder to implement than with Always Encrypted. There may be some scenarios where you want to use them though, so we’ll cover them in brief later on – though hopefully in enough detail that it tells you everything you need to know.

EKM

Most encryption is based on keys, and we need to think about how and where we manage them over time. EKM is functionality that allows you to store them outside of your server, either on a piece of kit that sits in a rack in your server room called a Hardware Security Module (HSM), or more commonly these days, using a cloud service like Azure Key Vault. You don’t need to use EKM in order to implement a secure encryption strategy but it’s certainly worth considering due to the ease of management that comes from having all of your keys in one place. It’s also easier to manage policies such as access control when you take a centralized approach to storing your keys.

What do we want to achieve by encrypting our data?

December 5, 2022May 10, 2023 ~ Matthew McGiffen ~ 1 Comment

On the face of it, this is a very obvious question with a very obvious answer. We want to prevent data from falling into the wrong hands. In practice, it gets a little more complicated.

Exactly what types of attacks do you wish to be protected against? It’s good if we make sure our data is encrypted where it is stored on the disk, but that doesn’t help us if an attacker gains direct access to write queries against the database. We might encrypt data held in columns, but does that still protect us if the unencrypted data is being passed back across the network to our application and an attacker is intercepting our network traffic?

Another question is why are you considering encryption in the first place? Often projects consider encryption because relevant regulation, or client requirements, demand it. All too often in these cases, encryption is considered as a binary option, is data encrypted or not. Often what happens is that the bare minimum is done to tick the checkbox and move on. Data might be encrypted, but the protection offered is of value in only limited scenarios.

When we think about what scenarios we wish to be protected against it makes sense to consider where data exists and might therefore be vulnerable. By that I’m not talking about where specific data is held, but rather the types of locations:

In memory on the database server
In transit across the network
In your application
Files stored outside of the database, perhaps on a file share

In subsequent posts we’ll look at how the tools available in SQL Server can protect the first three. I won’t however look at it once it reaches your application; that’s for your application developers to consider. I also don’t talk about files stored outside of the database – but you should think about how you want to protect such items if you have them.

Encryption is only one line of defence and should go hand in hand with a well-defined and implemented approach to security. Your first line of defence is always going to be access controls, making sure that only the right users and applications can access your data and servers in the first place. Some might say that if you have access controls in place, then why do you need encryption at all. The answer is that there is always the potential for access controls to be breached. The best approaches to security are always multi-layered, and on top of access control and encryption, it is good to have auditing in place so you can see who is accessing your systems and what they are doing, as well as having alerting in place for suspicious activity.

Even though we have a good toolset for encryption available to us in SQL Server, it doesn’t come totally for free. Encrypting and decrypting data requires CPU, and so it does have some performance overhead; we’ll discuss that when talking about each tool, and in many cases, we’ll try to look at how you can quantify what level of overhead you might be looking at.

We also will generally have an increase in management overhead, for instance, where we need to manage encryption keys. When implementing encryption it is important to consider how you will manage it on an ongoing basis. One of the worst scenarios you can encounter with encryption is where an individual sets up encryption without telling anyone else where the keys are backed up and then that individual leaves the organization, and if we have a server failure, we may never be able to recover our encrypted data.

The last impact is on functionality; we’ll talk about this a lot when we look at column encryption: where we only store encrypted values in the database that limits how you can interact with them, for instance, searching against such columns or performing calculations. Due to this, an important part of your encryption strategy will be deciding what data to encrypt and how you’re going to work with it once it is encrypted.

When choosing what to encrypt we are usually most focused on personally identifiable information (PII) as well as items deemed particularly sensitive. In considering your approach I’d recommend that something is always better than nothing. If you have a security breach and your list of users is accessed, then that is bad, but it is much better if you are able to say that passwords, credit card information, social security numbers, and other information were not accessed due to the extra encryption on these items.

Query Store Hints in SQL Server 2022

November 30, 2022May 10, 2023 ~ Matthew McGiffen ~ 1 Comment

Another neat little feature in SQL Server 2022 is Query Store Hints. This is the ability to apply a query hint through Query Store rather than having to modify existing code or fiddle around with plan guides.

Query hints are a way to influence optimizer behaviour towards generating desired execution plans for a given query. The word “hint” is a bit of a misnomer as usually they mandate what you wish to happen.

I’m not a big fan of leaping straight away to use query hints as they force the behaviour rather than letting the optimizer work out the best plan for your query. This means you may implement hints that you see as desirable for the current issue you are trying to fix, but they create problems for different executions of the query, or if the data profile changes. In general it’s better to see if you can rewrite the SQL code to achieve what you want, but sometimes that’s not possible and a hint ends up being the best way to go – or at least a good short term solution.

The great thing about applying hints through query store is that it can be done without altering application code. That means the DBA can apply a temporary hint while waiting for code to be fixed – or in extremis use them against third party applications where there is not the ability to get code changed.

I first read about them in Bob Ward’s book SQL Server 2022 Revealed. You can also read about them in Grant Fritchey’s excellent (and long!) book SQL Server 2022 Query Performance Tuning.

Applying a hint is simple as long as the query has a plan stored in query store. First you need to obtain the query_id for the query. One method for doing that is to use the following query I got from this stackexchange post:

SELECT qsq.query_id, qsq.last_execution_time, qsqt.query_sql_text FROM sys.query_store_query qsq INNER JOIN sys.query_store_query_text qsqt ON qsq.query_text_id = qsqt.query_text_id WHERE qsqt.query_sql_text LIKE '%your query text%';

Armed with the query_id, you can then apply the hint using the sys.sp_query_store_set_hints stored procedure. For example, if my query_id was 100 and I want to apply the RECOMPILE hint then I could do that with this SQL:

EXEC sys.sp_query_store_set_hints 100, N'OPTION(RECOMPILE)';

You can also view which queries have hints applied using the sys.query_store_hints view.

Finally if you want to clear a hint you have previously applied (such as the one above), you can do that as follows:

EXEC sp_query_store_clear_hints @query_id = 100;

Query store does not support all query hints. You see the full list of those supported (and not) at:

https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sys-sp-query-store-set-hints-transact-sql?view=sql-server-ver16

And that’s it – it’s that simple. Thanks for reading!

My SQL Server Encryption book has just been published!

November 23, 2022May 10, 2023 ~ Matthew McGiffen ~ 6 Comments

It’s been a long journey, but I got the email from Apress yesterday to say that my book “Pro Encryption is SQL Server 2022” has now been published.

You can buy a copy ~~at a massively reduced introductory price (nearly 80% off)~~ through the following link:

http://www.dpbolvw.net/click-100730415-12898971?url=https%3A%2F%2Flink.springer.com%2Fbook%2F10.1007%2F978-1-4842-8664-7

If I’ve acheived my aims then this is the most comprehensive resource out there covering the tools available in SQL Server for encrypting your data. I hope to supplement it with a video course covering the same topics sometime in early 2023.

The book covers the following topics:

Part I: “Understanding the Landscape”
Discusses what we are hoping to achieve through encryption and why. We then look in brief at the tools available and how they should fit into your overall strategy.

Part II: At-Rest Encryption
Here we look at the tools you can use to encrypt your data where it is stored on the disk. We take a deep dive into Transparent Data Encryption (TDE) as well as covering the separate Backup Encryption feature.

Part III: Column Encryption Using Always Encrypted
This part is a comprehensive look at the Always Encrypted feature, introduced in SQL Server 2016, that allows you to encrypt your columns of data with the maximum level of security and the minimum amount of application rework.

Part IV: Column Encryption Using Always Encrypted with Enclaves
Having looked at the “basic” version of Always Encrypted, we now take an in-depth look at how the version with enclaves differs and how you work with it.

Part V: Completing the Picture
In addition to the big features like TDE and Always Encrypted, there are other items that should be included in a comprehensive encryption strategy such as TLS and EKM. We cover those in this part as well as looking at other methods that are available for encrypting data.

Appendixes
In the appendixes I look at how encryption differs (or not) if you are running on a cloud platform as well as exploring encryption algorithms in a little more depth.

Optimized Plan Forcing in SQL Server 2022

November 16, 2022May 10, 2023 ~ Matthew McGiffen ~ 2 Comments

I’ve been reading Bob Ward’s book “SQL Server 2022 Revealed” and one of the new features in Intelligent Query Processing (IQP) jumped out at me. Not because it’s going to rock my world but rather it reminded me of an idea I had when I first started playing with Query Store when it came out.

I thought it would be cool if the execution plans cached in Query Store could be used to populate the plan cache after an instance restart, removing the need for all queries to be recompiled. I even raised a suggestion on Microsoft Connect for it to be considered as a new feature.

I won’t claim that anyone at Microsoft saw my post and thought “What a great idea!”, but by coincidence they have implemented in SQL 2022 something similar that works in specific scenarios – Optimized Plan Forcing.

The underlying idea behind it is to store – in query store – something called a compilation script that can be used to shorten significantly the compile time for a given query. Any query that runs when query store is enabled will get a compilation script stored if it has a significant compile time (the actual threshold is not disclosed). We can see can see which queries have compilation scripts by looking at the has_compile_replay_script in the sys.query_store_plan view.

If the query plan is forced in Query Store then the next time it needs to be compiled the optimization script will be used to build the plan instead of going through the usual query optimizer , greatly reducing the compile time.

Of course this only works for forced plans, but that makes sense as for other queries we often want a recompile to come up with a new and hopefully better plan. Still it’s an interesting feature. Bob talks about how it was designed in response to a pattern of performance problems they were calling a compile storm. This would be where the plan cache was emptied and then a lot of queries would have to be compiled concurrently causing CPU spikes. By shortening the compile time for some queries – and of course we DBAs can use Query Store to force as many plans as we like (with caution) it’s hoped to reduce the occurrence of such events.

I’m not going to go into trying to demonstrate this in action, but Bob has a full demo in his book which you can buy here:

The feature is turned on by default for databases on SQL Server 2022 as long as they have query store enabled – even if they’re not on the latest compatibility level. You can however turn it off for a given database using:

ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZED_PLAN_FORCING = ON;

You can also turn it off at the level of an individual query. Either using the query hint DISABLE_OPTIMIZED_PLAN_FORCING, or if you force the plan using the sp.query_store_force_plan system stored procedure then you can disable it using the @disable_optimized_plan_forcing parameter.

SQL Server 2022 is Coming

November 15, 2022May 10, 2023 ~ Matthew McGiffen ~ Leave a comment

We still don’t have an exact release date for SQL Server 2022 but at this stage we can have strong confidence that it will be with us before the end of the year – rather than slipping and becoming SQL Server 2023.

My first intimation that this was so, was earlier this year. I’ve been writing a book for Apress – “Pro Encryption in SQL Server 2022” that’s due to be released in a couple of weeks. At some point I questioned my editor about whether we should hold off as I didn’t want to be the author of a book referencing a version of a product that never existed. His answer was that Bob Ward says it’s happening, we’re releasing his book too, and it’s also got SQL Server 2022 in the title.

Given that Bob’s a Principal Architect for SQL Server with Microsoft, that gave me the confidence I needed. Bob’s book “SQL Server 2022 Revealed” was released a couple of weeks ago and goes over all the new features and enhancements in depth. I’ve started working my way through it and can highly recommend you get yourselves a copy. You get it currently at nearly a 70% discount through the following link:

As well as being a great technologist, Bob is an excellent and engaging writer. In addition to talking about the what the new features are, he gives insights into the design thinking that went into creating them which makes for interesting reading.
Here I reproduce Bob’s diagram that he calls the “Wheel of Power” showing the major new capabilities of SQL Server 2022:

I plan to blog about some of the new features I find most interesting and exciting over the next couple of weeks. I can’t wait to get dug in with the enhancements to Built-in Query Intelligence and Query Store. There are also enhancements to Always Encrypted which I’ve written about in my own book.

The cloud connected features are also very interesting and create new options for us in term of how we manage High Availability and Disaster Recovery as well as being able to offload parts of our workloads to the cloud.

Stay tuned, or if you can’t wait then grab a copy of Bob’s book and start working through the demos – there’s some great stuff. Like most releases on SQL Server, it’s not a quantum leap in computing, but there’s certainly enough for us DBAs to get our teeth stuck into and start thinking about what it offers us.