Parameter Sensitive Plan Optimization in SQL Server 2022. What is it and is it any good?

June 8, 2023 (updated June 15, 2023) ~ Matthew McGiffen

What is it?

Parameter Sensitive Plan (PSP) optimization is a new feature in SQL Server 2022 that aims to improve the performance of parameterized queries. It is part of the Intelligent Query Processing family of features and addresses the scenario where a single cached plan for a parameterized query isn’t optimal for all possible incoming parameter values. This is the case with non-uniform data distributions.

PSP optimization automatically enables multiple, active cached plans for a single parameterized statement. Cached execution plans will accommodate different data sizes based on the actual provided runtime parameter value(s).

During the initial compilation, column statistics histograms identify non-uniform distributions and evaluate the most at-risk parameterized predicates, up to three out of all available predicates. In other words, if multiple predicates within the same query meet the criteria, PSP optimization chooses the top three.

For eligible plans, the initial compilation produces a dispatcher plan that contains the PSP optimization logic, called a dispatcher expression. The dispatcher plan maps each execution to a query variant based on the cardinality range boundary values of the predicates.

Here’s an example to illustrate how PSP optimization works:

Let’s say we have a table Orders with columns OrderID, CustomerID, OrderDate, and TotalAmount. The table has a non-uniform distribution of data, with some customers having many orders while others have only a few.

We have a stored procedure that retrieves all orders for a given customer:

CREATE PROCEDURE dbo.GetCustomerOrders (@CustomerID int)
AS
BEGIN
   SELECT * FROM dbo.Orders WHERE CustomerID = @CustomerID;
END

Without PSP optimization, SQL Server would generate a single execution plan for this query, based on the parameters supplied the first time it is called, and cache it for reuse. However, this plan may not be optimal for all possible values of @CustomerID.

With PSP optimization enabled, SQL Server will generate multiple execution plans for this query – one for low cardinality ranges, one for medium cardinality ranges, and one for high cardinality ranges. At runtime, SQL Server will choose the optimal execution plan based on the value of @CustomerID.

For example, if @CustomerID corresponds to a customer with many orders (high cardinality), SQL Server will choose the high cardinality range execution plan. If @CustomerID corresponds to a customer with few orders (low cardinality), SQL Server will choose the low cardinality range execution plan.

This allows SQL Server to determine the optimal execution plan based on the parameter value(s), improving query performance.

PSP optimization aims to reduce the problems caused by parameter sniffing, where SQL Server generates an execution plan based on the first parameter value it encounters and then reuses that plan for all subsequent parameter values. This can lead to suboptimal performance for some parameter values.

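In SQL Server 2022, PSP optimization is enabled by default for databases running at compatibility level 160. It can be disabled for an individual query with the DISABLE_PARAMETER_SENSITIVE_PLAN_OPTIMIZATION query hint, or at the database level by running the following T-SQL statement in the context of that database:

ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SENSITIVE_PLAN_OPTIMIZATION = OFF;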
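
As a minimal sketch of the query-level option, the batch below first checks the current database setting and then opts a single statement out using the documented query hint. It reuses the hypothetical dbo.Orders table from the example above, and the parameter value is made up purely so the batch runs on its own:

```sql
-- Check the current PSP optimization setting for the database you are connected to.
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = N'PARAMETER_SENSITIVE_PLAN_OPTIMIZATION';

-- Hypothetical value, declared only so this batch is self-contained.
DECLARE @CustomerID int = 42;

-- Opt this one statement out of PSP optimization; the database setting is untouched.
SELECT OrderID, CustomerID, OrderDate, TotalAmount
FROM dbo.Orders
WHERE CustomerID = @CustomerID
OPTION (USE HINT('DISABLE_PARAMETER_SENSITIVE_PLAN_OPTIMIZATION'));
```

Inside a stored procedure the hint goes on the individual statement in exactly the same way.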

Is it any good?

The biggest problem in terms of fixing performance is that cardinality estimation is based on statistics objects, which are effectively a histogram (with up to 200 steps) of data distribution across a certain column. In the example above, if we have statistics on CustomerID then we may have accurate figures if we have fewer than 200 customers. If we have a lot more, they will be grouped into ranges and we may have a lot of data skew within a range, e.g. CustomerID 1 may have 10 orders where CustomerID 2 has 10,000. That means PSP optimization may still not always choose the correct plan.
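
If you want to see how your own data has been bucketed, one way is to read the histogram steps directly. This is a sketch only, assuming the hypothetical dbo.Orders table from the example above and whatever statistics happen to exist on it:

```sql
-- List the histogram steps for every statistics object on dbo.Orders.
-- range_rows / distinct_range_rows hint at how much skew can hide inside one step.
SELECT s.name AS stats_name,
       h.step_number,
       h.range_high_key,
       h.range_rows,
       h.equal_rows,
       h.distinct_range_rows,
       h.average_range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = OBJECT_ID(N'dbo.Orders')
ORDER BY s.name, h.step_number;
```

A step with a large range_rows and many distinct values is exactly the kind of range where individual customers can differ wildly from the average.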

Also, there are lots of parameter sniffing scenarios which can depend on a lot more than 3 parameters – such as “kitchen-sink” search procedures where you can search data based on one or more of many different parameters.

That said, there will be many scenarios where it does help – it’s just not a magic bullet to fix parameter sniffing.

Brent Ozar talks about another issue in his post How SQL Server 2022 Tries to Fix Parameter Sniffing. That is the issue around monitoring and performance troubleshooting. When you try to view the execution plan for a stored procedure where the optimizer has decided to use PSP optimization, you only see the dispatcher plan, not the actual plans that will be used in practice. Also, should your monitoring or performance troubleshooting tools flag up a query that is causing problems, where that query is using PSP optimization, it is difficult to link that query back to the stored procedure it belongs to. Not impossible, but it is a pain.
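
One way to make that link, assuming Query Store is enabled on the database, is the sys.query_store_query_variant catalog view added in SQL Server 2022. The following is a rough sketch rather than a polished monitoring query:

```sql
-- Map each PSP query variant back to its parent query and owning object.
SELECT v.query_variant_query_id,
       v.parent_query_id,
       v.dispatcher_plan_id,
       OBJECT_NAME(pq.object_id) AS parent_object_name, -- NULL for ad hoc queries
       qt.query_sql_text AS variant_query_text
FROM sys.query_store_query_variant AS v
JOIN sys.query_store_query AS pq
    ON pq.query_id = v.parent_query_id
JOIN sys.query_store_query AS vq
    ON vq.query_id = v.query_variant_query_id
JOIN sys.query_store_query_text AS qt
    ON qt.query_text_id = vq.query_text_id
ORDER BY v.parent_query_id, v.query_variant_query_id;
```

The object name is taken from the parent query on purpose, since that is the row that belongs to the stored procedure.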

I attended a talk by Conor Cunningham, Principal Software Architect at Microsoft on SQL Server, a few years back at SQL Bits, where he was presenting the original set of features collectively known as Intelligent Query Processing. He pointed out that a lot of people running SQL Server, especially on Azure with Azure SQL Database and Azure SQL Managed Instance, don’t have SQL Server performance tuning expertise in house and these sorts of features were targeted at these people most of all. Features that will not solve all performance issues of a given type, but will certainly help in a number of circumstances.

From that point of view PSP optimization is a good addition to the set of features, even if it doesn’t do as good a job as someone with real expertise might, and even if it does come with its own problems.

Got a problem or embarking on a SQL Server project and want some help and advice? I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.

Why you still need DBA resource in the cloud

May 19, 2023 (updated June 15, 2023) ~ Matthew McGiffen

I was chatting with a cloud consultant who was advising on a large scale migration to AWS. He told me that one of the advantages of going for a PaaS offering (Platform as a Service) was that DBAs were no longer required as backups and restores were handled for you. PaaS services for SQL Server include AWS RDS and Azure SQL Database or Azure SQL Managed Instance.

I found it quite a funny conversation, partly as I don’t think he realised being a DBA was part of my job role, but also because I don’t know a single DBA who spends a significant amount of their time doing backups and restores.

It’s true that the services mentioned handle a lot of the routine tasks for you, and that is great. It’s also reasonable to hope that they manage databases with greater reliability. Rather than replacing the DBA though, they free them up to achieve more.

One common pitfall I’ve seen with cloud implementations is to assume that everything is done for you and that can lead to poorly configured services or ones with missing maintenance. It remains important to have access to someone who has a deep understanding of SQL Server when provisioning a cloud service to host your databases. It’s also important they understand the cloud options available and can make sure you have the right settings and instance type etc.

And once things are running you still need someone to turn to when there are problems. These problems are less likely to be hardware related, but you are still going to see occasional performance (or other) issues, just as you would when you were running on-premises hardware.

When those things happen you need access to someone with the right skills to diagnose the problem and perform analysis to define the right solution. Enter the cloud DBA.

As mentioned in my previous post The Importance of SQL Server Performance Tuning in the Cloud, ongoing performance tuning becomes even more important in the cloud because database services are expensive. A performance-optimized database is also a cost-optimized one, and you can realise savings immediately.

Brent Ozar talks about the importance of DBAs in the cloud in this post https://www.brentozar.com/archive/2019/06/what-happens-to-dbas-when-we-move-to-the-cloud/. He tells a story that I’m going to quote here:

One of my favorite recent examples was a company who came to me saying, “We’re spending about $2M per year in the cloud just on our databases alone. Can you help us reduce those costs?” Absolutely: with just a couple of days spent query & index tuning, we chopped their biggest database expenses in half while increasing performance.

At the end of that engagement, the CTO told me, “I thought I’d save money in the cloud by not having a DBA, but what I’m learning is that in the cloud, I actually get a return on my DBA investments.”

And it’s not all about performance. Someone recently reached out to me to say that they had mysteriously “lost” about 400GB of free space overnight on an AWS RDS instance running SQL Server, and wanted help identifying the cause. That led to identifying an issue relating to a recent code change that needed fixing.

Of course it’s possible you might not need a full-time DBA in the cloud – or otherwise. That’s where it’s useful to have access to someone who can jump in to perform that role as required. Ideally as they work with you over time they will understand your systems and be able to talk in the context of your applications when discussing problems and solutions.

I can help you with all that. You can find more details of my services here or simply get in touch to discuss what I can do for you.

The Importance of SQL Server Performance Tuning in the Cloud

May 12, 2023 (updated June 9, 2023) ~ Matthew McGiffen

SQL Server Performance tuning is always important and I can list off reasons why:

Improved application response.
Identifying issues before they become outages.
Potential cost savings.

In terms of the cost savings however, that is always a little speculative when running on premises. I can take your 8 core box running at 70% CPU peak and get that down to 35%. Then you could potentially migrate to a smaller box with 4 cores and save up to $7,000 a year for standard edition or about $26,000 for enterprise. If you’re running on bigger kit then the potential savings are much larger. Often though SQL Server runs on physical machines so downsizing isn’t that straightforward and there isn’t the appetite for migrating to different kit.

In the cloud it’s different: you can change the size of the instance running your SQL Server quickly and easily with no real risk, so cost savings can be realised immediately. That applies whether you’re running SQL on cloud Virtual Machines or using a Platform as a Service (PaaS) offering such as Azure SQL Database or Amazon RDS. If you’re running the serverless option on Azure SQL Database then you can start seeing savings without even having to change configuration of the instance.

Let’s look at some figures showing the potential cost savings. The table below shows the cost of running 4, 8 or 16 vCPU instances in Azure and AWS. For Azure I’ve chosen provisioned compute on the General Purpose Service Tier running on Standard hardware. For AWS I’ve chosen M5 as the instance type with SQL Server Standard edition and Multi-AZ for high availability. The figures shown are on-demand costs per year:

Number of vCPUs	Azure SQLDB	AWS RDS
16	$35,346	$88,423
8	$17,673	$44,641
4	$8,836	$21,444

Cost per year of running SQL Server in Azure and AWS

You can see the pricing calculators I’ve based these figures on here:

https://azure.microsoft.com/en-gb/pricing/details/azure-sql-database/single/

https://aws.amazon.com/rds/sqlserver/pricing/?nc=sn&loc=4

As a side note, we can see that AWS is a lot more expensive. There are a number of reasons why I see Azure as a better fit for running SQL Server workloads and price is one. That said, the database isn’t the only factor in choosing your cloud provider so it’s perfectly legitimate that you might be running on AWS.

When I’m asked to look at performance of a SQL instance, it’s rare that I can’t cut resource usage in half and often we can achieve much more. You can see the potential cost savings are significant. One large project I’ve worked on recently called me in where the application was struggling to perform running on 32 vCPUs in AWS RDS. I identified changes that meant it was performing perfectly running on 16 vCPUs, and soon after we were able to move it down to 8. A few more rounds of optimisation and now it can run on 4. That represented a saving of over $150,000 annually.
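
To put numbers on each downsizing step, here is a small sketch that uses only the yearly figures from the table above (the column names are my own) and lists the saving for every move to a smaller instance:

```sql
-- Yearly on-demand costs copied from the table above (USD).
WITH Pricing (vCPUs, AzureSqlDb, AwsRds) AS
(
    SELECT v.vCPUs, v.AzureSqlDb, v.AwsRds
    FROM (VALUES (16, 35346, 88423),
                 (8,  17673, 44641),
                 (4,   8836, 21444)) AS v (vCPUs, AzureSqlDb, AwsRds)
)
-- Pair every instance size with every smaller one and subtract the costs.
SELECT big.vCPUs   AS FromvCPUs,
       small.vCPUs AS TovCPUs,
       big.AzureSqlDb - small.AzureSqlDb AS AzureSavingPerYear,
       big.AwsRds - small.AwsRds         AS AwsSavingPerYear
FROM Pricing AS big
JOIN Pricing AS small
    ON small.vCPUs < big.vCPUs
ORDER BY big.vCPUs DESC, small.vCPUs DESC;
```

For instance, going from 16 to 4 vCPUs on AWS RDS works out as $88,423 minus $21,444, or $66,979 a year.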

Get in touch today to discuss how I can help you with your cloud SQL Server performance and save you a lot of money year on year.

My SQL Server Encryption book has just been published!

November 23, 2022 (updated May 10, 2023) ~ Matthew McGiffen

It’s been a long journey, but I got the email from Apress yesterday to say that my book “Pro Encryption in SQL Server 2022” has now been published.

You can buy a copy through the following link:

http://www.dpbolvw.net/click-100730415-12898971?url=https%3A%2F%2Flink.springer.com%2Fbook%2F10.1007%2F978-1-4842-8664-7

If I’ve achieved my aims then this is the most comprehensive resource out there covering the tools available in SQL Server for encrypting your data. I hope to supplement it with a video course covering the same topics sometime in early 2023.

The book covers the following topics:

Part I: Understanding the Landscape
Discusses what we are hoping to achieve through encryption and why. We then look in brief at the tools available and how they should fit into your overall strategy.

Part II: At-Rest Encryption
Here we look at the tools you can use to encrypt your data where it is stored on the disk. We take a deep dive into Transparent Data Encryption (TDE) as well as covering the separate Backup Encryption feature.

Part III: Column Encryption Using Always Encrypted
This part is a comprehensive look at the Always Encrypted feature, introduced in SQL Server 2016, that allows you to encrypt your columns of data with the maximum level of security and the minimum amount of application rework.

Part IV: Column Encryption Using Always Encrypted with Enclaves
Having looked at the “basic” version of Always Encrypted, we now take an in-depth look at how the version with enclaves differs and how you work with it.

Part V: Completing the Picture
In addition to the big features like TDE and Always Encrypted, there are other items that should be included in a comprehensive encryption strategy such as TLS and EKM. We cover those in this part as well as looking at other methods that are available for encrypting data.

Appendixes
In the appendixes I look at how encryption differs (or not) if you are running on a cloud platform as well as exploring encryption algorithms in a little more depth.

The 99 Flavours of SQL Server are all Vanilla

August 22, 2019 (updated May 10, 2023) ~ Matthew McGiffen

There are so many options for running SQL Server now. It can feel a little intimidating.

You’ve spent years gaining experience with SQL running on Windows Server. On bare metal or in a VM. Locally or in a data centre.

Now people are talking about SQL on Linux. SQL on VMs in the cloud. SQL Server Platform as a Service (PaaS) offerings such as Azure SQL DB, Azure Managed Instance, or Amazon RDS.

Running SQL in a container on Docker or Kubernetes.

It’s easy to feel like you’re at risk of getting left behind. Feel like you’ve got to start learning from the beginning all over again.

The good news is that whatever the flavour – it’s still SQL Server. For everything that’s different there are a hundred things that are still the same.

Database design, writing and executing queries and stored procedures – it’s still the same.

Query tuning, indexing, looking at execution plans. All the same.

Even administration, many of the core concepts are the same, understanding how security works, backups, high-availability. The main difference is often that some of these might be taken care of for you and you don’t need to worry about them any more.

Caveat – you still need to worry about them a bit!

The point is, most of what you already know, the experience you have gained over the years, is still totally valid. Learning about SQL Server on a new platform may feel like a big learning curve, but in reality, the new stuff you need to get to grips with is small compared to all the stuff you already know.

And in some cases, the skills you already have become even more valuable. People might not care if your query tuning on physical kit takes your CPU down from 50% to 10%. But tell them you’ve just reduced their cloud bill by 80% and they really care!

So don’t be intimidated, and don’t feel you need to learn every flavour. Have a play with SQL Server in the cloud, have a play with containers, set up SQL on Linux. You’ll quickly find it’s not that hard, and once it’s running – it’s pretty much the same as ever.

And remember, if someone comes to you with a question about why SQL is running slow, or why a query isn’t doing what they want – on RDS, Docker, Linux, or whatever. You don’t need to know that platform inside out to be able to help, you already know SQL Server and that’s the important bit.

To paraphrase a popular lyric:

If you’ve got SQL problems I can help you son. I’ve got 99 problems but SQL ain’t one.

MAXDOP, Parallelism and the Cloud

April 3, 2019 (updated May 10, 2023) ~ Matthew McGiffen

Parallelism and MAXDOP

The pros and cons of parallelism have always been with us in SQL Server and I blogged about this a couple of years ago. This is an updated version of that post to include details of the new wait stat related to parallelism that was added in 2017 (CXCONSUMER), as well as to discuss the options available for cloud based SQL Server solutions.

There’s no doubt that parallelism in SQL is a great thing. It enables large queries to share the load across multiple processors and get the job done quicker.

However it’s important to understand that it has an overhead. There is extra effort involved in managing the separate streams of work and synchronising them back together to – for instance – present the results.

That can mean in some cases that adding more threads to a process doesn’t actually benefit us and in some cases it can slow down the overall execution.

We refer to the number of threads used in a query as the DOP (Degree of Parallelism) and in SQL Server we have the setting MAXDOP (Maximum Degree of Parallelism) which is the maximum DOP that will be used in executing a single query.

Microsoft generally recommend caution setting MAXDOP above 8:

https://support.microsoft.com/en-gb/help/2806535/recommendations-and-guidelines-for-the-max-degree-of-parallelism-configuration-option-in-sql-server

Here’s a nice post from Kendra Little talking about how having higher settings can actually slow down your query execution time:

https://www.brentozar.com/archive/2013/12/q-can-high-maxdop-make-a-query-slower/

Out of the box, MAXDOP is set to 0, which means there is no limit to the DOP for an individual query. It is almost always worth changing this to a more optimal setting for your workload.
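
As a minimal sketch, this is how the server-level setting is changed on a regular SQL Server instance. The value 8 is only a placeholder taken from the guidance above, not a recommendation for your workload:

```sql
-- 'max degree of parallelism' is an advanced option, so expose those first.
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

-- Placeholder value: pick a MAXDOP that suits your core count and workload.
EXEC sys.sp_configure N'max degree of parallelism', 8;
RECONFIGURE;

-- Confirm the configured value and the value currently in use.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'max degree of parallelism';
```

The change takes effect without a restart.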

Cost Threshold for Parallelism

This is another setting available to us in SQL Server and defines the cost level at which SQL will consider a parallel execution for a query. Out of the box this is set to 5 which is actually a pretty low number. Query costing is based on algorithms from “Nick’s machine”, the box used by the original developer who benchmarked queries for Microsoft.

(Nick’s Machine)

Compared to modern servers Nick’s machine was pretty slow and as the Cost Threshold hasn’t changed for many years, it’s now generally considered too low for modern workloads/hardware. In reality we don’t want all our tiny queries to go parallel as the benefit is negligible and can even be negative, so it’s worth upping this number. Advice varies but generally recommendations say to set this somewhere in the range from 30 to 50 (and then tuning up and down based on your production workload).
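
A sketch of changing the threshold at the server level follows. The value 50 is just the top of the range mentioned above and should be treated as a starting point to tune from:

```sql
-- 'cost threshold for parallelism' is also an advanced option.
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

-- Placeholder starting point; adjust after observing your production workload.
EXEC sys.sp_configure N'cost threshold for parallelism', 50;
RECONFIGURE;
```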

There are many articles in the SQL Server community about how the out of the box setting is too low, and asking Microsoft to change it. Here’s a recent one:

http://www.scarydba.com/2017/03/13/change-the-cost-threshold-for-parallelism/

CXPACKET and CXCONSUMER waits

Often in tuning a SQL Server instance we will look at wait stats – which tell us what queries have been waiting for when they run. CXPACKET waits are usually associated with parallelism and particularly the case where multi-threaded queries have been stuck waiting for one or more of the threads to complete – i.e. the threads are taking different lengths of time because the load hasn’t been split evenly. Brent Ozar talks about that here:

https://www.brentozar.com/archive/2013/08/what-is-the-cxpacket-wait-type-and-how-do-you-reduce-it/

High CXPACKET waits can be – but aren’t necessarily – a problem. You can cure CXPACKET waits by simply setting MAXDOP to 1 at a server level (thus preventing parallelism) – but this isn’t necessarily the right solution. In some cases it can be, though: SharePoint, for instance, is best run with MAXDOP set to 1.

What you can definitely deduce from high CXPACKET waits however is that there is a lot of parallelism going on and that it is worth looking at your settings.

To make it easier to identify issues with parallelism, with SQL Server 2017 CU3 Microsoft added a second wait type related to parallelism – CXCONSUMER. This wait type was also added to SQL Server 2016 in SP2.

Waits related to parallelism are now split between CXPACKET and CXCONSUMER.

Here’s the original announcement from Microsoft regarding the change and giving more details:

https://blogs.msdn.microsoft.com/sql_server_team/making-parallelism-waits-actionable/

In brief, moving forward CXPACKET waits are the ones you might want to worry about, and CXCONSUMER waits are generally benign, encountered as a normal part of parallel execution.
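
To see how the two wait types compare on your own instance, a simple sketch is to read them from the wait stats DMV. The figures are cumulative since the last restart (or since the stats were last cleared), so compare them over time rather than in isolation:

```sql
-- Cumulative parallelism-related waits for the instance.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'CXPACKET', N'CXCONSUMER')
ORDER BY wait_time_ms DESC;
```

On Azure SQL Database the database-level equivalent is sys.dm_db_wait_stats.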

Tuning Parallelism

In tuning parallelism we need to think about how we want different sized queries to act on our server.

Small Queries

In general we don’t want these to go parallel so we up the Cost Threshold to an appropriate number to avoid this. As discussed above 30 is a good number to start with. You can also query your plan cache and look at the estimated costs of queries that have been executed on your SQL Instance to get a more accurate idea of where you want to set this. Grant Fritchey has an example of how to do that here:

http://www.scarydba.com/2017/02/20/estimated-costs-queries/

As he mentions in the post, this assessment can be quite expensive to run – so do it when things are quiet.
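
For a rough idea of what such a query looks like, here is my own simplified sketch (not the one from Grant’s post). It only reads the estimated cost of the first statement in each cached plan, which is a deliberate shortcut:

```sql
-- Read the estimated subtree cost of the first statement in each cached plan.
-- Shredding plan XML is expensive, so TOP keeps the output manageable.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (50)
       qp.query_plan.value('(//StmtSimple/@StatementSubTreeCost)[1]', 'float') AS EstimatedCost,
       cp.usecounts,
       st.text AS QueryText
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE qp.query_plan IS NOT NULL
ORDER BY EstimatedCost DESC;
```

Looking at where the bulk of your frequently run queries sit in that cost range gives a more informed starting value than a blanket 30 or 50.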

Medium to Large Queries

This is where we want to take advantage of parallelism, and do so by setting MAXDOP to an appropriate level. Follow the guidelines from the Microsoft article referenced above. Here it is again:

https://support.microsoft.com/en-gb/help/2806535/recommendations-and-guidelines-for-the-max-degree-of-parallelism-configuration-option-in-sql-server

Often the answer is going to be simply to set it to 8 – but then experiment with tuning it up and down slightly to see whether that makes things better or worse.

Very Large Queries

If we have a mixed workload on our server which includes some very expensive queries – possibly for reporting purposes – then we may want to look at upping the MAXDOP for these queries to allow them to take advantage of more processors. One thing to consider though is – do we really want these queries running during the day when things are busy? Ideally they should run in quieter times. If they must run during the day, then do we want to avoid them taking over all the server power and blocking our production workload? In which case we might just let them run at the MAXDOP defined above.

If we decide we want to let them have the extra power then we can override the server MAXDOP setting with a query hint OPTION(MAXDOP n):

https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query

You will want to experiment to find the “best” value for the given query. As discussed above and as shown in Kendra Little’s article, just setting it to the maximum number of cores available isn’t necessarily going to be the fastest option.
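
As an illustration, the hint is added to the end of the statement. The table is hypothetical and the value 16 is only an example to experiment from:

```sql
-- Hypothetical reporting query allowed to use more threads than the server default.
SELECT CustomerID,
       COUNT(*)         AS OrderCount,
       SUM(TotalAmount) AS TotalSpend
FROM dbo.Orders
GROUP BY CustomerID
OPTION (MAXDOP 16);
```

The same hint with a lower number, including OPTION (MAXDOP 1), is how you would rein in a query that does not benefit from parallelism.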

Exceptions to the Rule

Regardless of the size, there are some queries that just don’t benefit from parallelism so you may need to assess them on an individual basis to find the right degree of parallelism to use.

With SQL Server you can specify the MAXDOP at the server level, but also override it at the database level using a SCOPED CONFIGURATION or for individual queries using a query hint. There are even other ways you can control this:

https://www.brentozar.com/archive/2016/12/ten-ways-set-maxdop/

Options in the Cloud

If your SQL Server is hosted in the cloud, then most of this still applies. You still need to think about tuning parallelism – it isn’t done for you, and the defaults are the same – so probably not optimal for most workloads.

There are in general two flavours of cloud implementation. The first is Infrastructure as a Service (IaaS) where you simply have a VM provided by your cloud provider and run an OS with SQL Server on top of it in that VM. Regardless of your cloud provider (e.g. Azure, AWS etc.), if you’re using IaaS for SQL Server then the same rules apply, and you go about tuning parallelism in exactly the same way.

The other type of cloud approach is Platform as a Service (PaaS). This is where you use a managed service for SQL Server. This would include Azure SQL Database, Azure SQL Database Managed Instance, and Amazon RDS for SQL Server. In these cases, the rules still apply, but how you manage these settings may differ. Let’s look at that for the three PaaS options mentioned above.

Azure SQL Database

This is a single SQL Server database hosted in Azure. You don’t have access to server level settings, so you can’t change MAXDOP or the cost threshold at the server level. You can however specify MAXDOP at the database level e.g.

ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 4;

Cost Threshold for Parallelism, however, cannot be changed in Azure SQL Database.
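
To check what a database is currently using, a quick sketch is to read the scoped configuration back, run in the context of the database in question:

```sql
-- Show the database-level MAXDOP, plus the value applied on readable secondaries.
SELECT name, value, value_for_secondary
FROM sys.database_scoped_configurations
WHERE name = N'MAXDOP';
```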

Azure SQL Database Managed Instance

This presents you with something that looks very much like the SQL Server you are used to, you just can’t access the box behind it. And similar to your regular SQL instance, you can set MAXDOP and the Cost threshold as normal.

Amazon RDS for SQL Server

This is similar to managed instance. It looks and acts like SQL Server but you can’t access the machine or OS. You access your RDS instance through an account that has permissions that are more limited than your usual sa account or sysadmin role allows. And one of the things you can’t do with your limited permissions is to change the parallelism settings.

Amazon have provided a way around this though and you can change both settings using something called a parameter group:

https://www.mssqltips.com/sqlservertip/5329/setting-sql-server-configuration-options-with-aws-rds-parameter-groups/

Closing Thoughts

Parallelism is a powerful tool at our disposal, but like all tools it should be used wisely and not thrown at every query to its maximum – and this is often what happens with the out of the box settings on SQL Server. Tuning parallelism is not a knee-jerk reaction to high CXPACKET waits, but something we should be considering carefully in all our SQL Server implementations.

Acknowledgements

I wanted to update my original article to include the cloud options noted above, but didn’t have access to an Azure SQL Database Managed Instance to check the state of play. Thanks to Travis Garland via Twitter (@RockyTopDBA) and Chrissy LeMaire via the SQL community slack (@cl) for checking this and letting me know!

Checking Progress of the Creation of your Azure Resources

February 26, 2019 (updated May 10, 2023) ~ Matthew McGiffen

When I’ve created resources in Azure it’s usually taken anywhere from a few minutes up to a quarter of an hour – though sometimes longer.

When you’re new to this stuff, you can be uncertain and wonder, “Is it really creating it?”, “Did I hit the right buttons?”. As a result it can be handy to know where to check to see what’s going on.

Sometimes after creating the resource you are taken to a screen that will show you what’s going on:

And usually you can see something is occurring from the bar at the top:

If you click on the alarm icon you can see more details:

You can then click to see “More events in the activity log” to dig deeper:

This is all fairly intuitive, but earlier I was trying to create a SQL Database Managed Instance for the first time. It showed some activity in the items above for a few minutes, but after that nothing happened. Had it failed? Had I done something wrong? Should I start again and try to create a new one?

The answer was to select resource groups from the blades on the left, and select the resource group that I had created the item in:

On the right hand side I can see an item saying “Deployments” and I can see that one is in the process of deploying. I can click the hyperlink for more details:

The third item in the list was the one I was looking for:

Okay, so it is in the process of being created. There’s no way to tell how long it will take but at least I now know it’s happening.

While searching for it I did notice a warning on the create screen for the resource that I hadn’t seen when I first whizzed through the creation:

Looks like I might be waiting a while…

Some differences with SQL Server when running on AWS RDS

December 12, 2018 (updated May 10, 2023) ~ Matthew McGiffen

If you plan on using Amazon Web Services (AWS) to host your SQL Server based applications in the cloud, then you have a couple of options.

One is just to have an EC2 instance (a VM) and install the versions of the OS and SQL Server you want. There are also images you can use that will have these pre-installed. This is what’s known as the IaaS option (Infrastructure as a Service). If you take this option, then SQL Server is exactly the same as it would be if you had it on-prem.

Or you can go with Amazon RDS (Relational Database Service). This is more of a managed service where Amazon looks after some aspects of your database server for you. In return you give up some of the control you would have with your own server or VM. You can still pick the version of SQL Server you want installed, usually down to which cumulative update you want – though note that RDS normally lags behind the latest box version of SQL by 3 months or so. RDS is what’s known as a PaaS offering (Platform as a Service).

So, what do you give up and what do you gain? Here’s a quick summary of a few things I’ve noticed. This is not intended to be comprehensive and please bear in mind that AWS is a fast-moving beast – changes happen regularly.

What you can access

You can still connect to and manage your instance using SSMS, but you have no direct access to the server hosting it, such as configuration of the OS, or access to the disks. Through SSMS you have an access level slightly below sa, so you can only do what RDS allows.

High Availability

In RDS you don’t use Availability Groups, Log Shipping or any of that stuff. Instead, HA is achieved using multiple Availability Zones (AZ). With this enabled, writes are performed synchronously to a replica in a second availability zone (usually a separate physical data centre in the same region). If the primary AZ goes down, then failover will happen automatically. You can also choose to replicate to other regions to be protected against failure of an entire AWS region – though often data protection laws mean that might not be an option for you if you hold personal data and there is only one AWS region in your country.

AWS RDS Multi-AZ promises a monthly up-time of 99.95% (which allows for roughly 22 minutes of downtime per month). Full details of the SLA are here:

https://aws.amazon.com/rds/sla/
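
As a quick sanity check on the downtime figure, the arithmetic can be run in T-SQL. This is only a sketch: it assumes a 30-day month and an uptime commitment of 99.95%, which is the percentage that corresponds to roughly 22 minutes. Check the SLA page for the current commitment.

```sql
-- Assumptions: 30-day month, 99.95% monthly uptime commitment.
DECLARE @uptime_pct decimal(6,3) = 99.95;
DECLARE @minutes_in_month int = 30 * 24 * 60;   -- 43,200 minutes

-- Minutes of downtime permitted before the commitment is breached.
SELECT @minutes_in_month * (100 - @uptime_pct) / 100 AS allowed_downtime_minutes;
```

That comes out at about 21.6 minutes, or a little under 22 for a month of average length.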

Disaster Recovery

RDS performs automated backups of your whole instance, including the equivalent of log backups every 5 minutes. That means that in case of a disaster, where Multi-AZ failover fails, or where someone deletes data they shouldn’t, the maximum data loss (RPO) is 5 minutes. The built-in functionality allows you to restore your instance to a point in time, which is implemented by restoring to a new instance. It is not possible to recover individual databases. Backup retention can be set to a maximum of 35 days.

If you need more than that then you can still take native SQL backups, but this has to be enabled specifically and because you can’t access the underlying disks you have to use a stored procedure specific to RDS:

https://aws.amazon.com/premiumsupport/knowledge-center/native-backup-rds-sql-server/
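
For illustration, a native backup call looks something like the sketch below. The database name and S3 bucket ARN are hypothetical, and it assumes native backup and restore has already been enabled for the instance (through an option group with an IAM role that can write to the bucket).

```sql
-- Hypothetical database name and S3 location: substitute your own.
EXEC msdb.dbo.rds_backup_database
     @source_db_name      = 'SalesDb',
     @s3_arn_to_backup_to = 'arn:aws:s3:::my-backup-bucket/SalesDb.bak';

-- The backup runs as an asynchronous task, so check on its progress.
EXEC msdb.dbo.rds_task_status @db_name = 'SalesDb';
```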

There are also other options – such as taking extra snapshots and storing them in a separate region.

Patching

AWS handles patching of minor versions to your instance for you. You can choose for this to be done automatically or triggered by manual intervention through the AWS console.

Encryption

RDS has its own at-rest encryption similar to TDE. This is available for all instances hosted on RDS so unlike TDE you don’t need to be on an Enterprise Edition of SQL Server.

Collation

All SQL Server instances on RDS are set up with a server collation of SQL_Latin1_General_CP1_CI_AS and you can’t change this. You can specify a different collation for your databases, but this might mean you run into collation issues if you use temporary tables and compare string-based columns with those in your databases – as temp tables are created with the collation of the instance.
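
One common way to sidestep the temp table problem is to declare string columns with COLLATE DATABASE_DEFAULT, so they pick up the collation of the current database rather than that of tempdb. A minimal sketch, with hypothetical table and column names:

```sql
-- Hypothetical names, for illustration only.
-- DATABASE_DEFAULT resolves to the collation of the database you are
-- connected to, not the instance/tempdb collation.
CREATE TABLE #CustomerNames
(
    CustomerName varchar(100) COLLATE DATABASE_DEFAULT NOT NULL
);

-- The comparison now uses matching collations on both sides.
SELECT c.CustomerID
FROM dbo.Customers AS c
JOIN #CustomerNames AS t
    ON c.CustomerName = t.CustomerName;
```

If you can’t change how the temp table is created, the same clause can be applied to the comparison itself instead (ON c.CustomerName = t.CustomerName COLLATE DATABASE_DEFAULT).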

Server Memory

In an on-premise version of SQL Server we would carefully provision a proportion of the overall server memory for SQL Server. On RDS this is not an available setting; the amount of memory is fixed by the instance size you choose. However, Amazon do make recommendations for tuning the amount of memory given to an RDS instance – which would mean resizing the instance if necessary.

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_BestPractices.html#CHAP_BestPractices.Performance.RAM

The recommendation is that there should be enough memory that the “working set” of data is retained in memory. Memory should be tuned so that the ReadIOPS metric is “small and stable”.

Recovery Model

This is worth mentioning even though it is not entirely configurable in RDS. If you have backup retention set to longer than 0 days (i.e. take backups) then the recovery model will be set to FULL. If you set backup retention to zero (which disables backups) then recovery model will be set to SIMPLE. If you manually change the recovery model, RDS will automatically change it back within 5 minutes. This applies to all databases.
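
If you want to see what RDS has set, the recovery model of each database is visible in sys.databases. A small sketch:

```sql
-- Show the current recovery model for each user database.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE database_id > 4      -- skip master, tempdb, model and msdb
ORDER BY name;
```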

MAXDOP and parallelism

In RDS you cannot change the instance level MAXDOP, or the “Cost Threshold for Parallelism” through SQL. Instead these must be configured through a parameter group:

https://www.mssqltips.com/sqlservertip/5329/setting-sql-server-configuration-options-with-aws-rds-parameter-groups/

Optimize for Ad-hoc Workloads

Like the parallelism settings this cannot be modified through SQL. So, if you want this setting enabled you must use a parameter group:

https://www.mssqltips.com/sqlservertip/5329/setting-sql-server-configuration-options-with-aws-rds-parameter-groups/
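
Although you can’t change these settings through SQL, you can still read them, which is a handy way to confirm that a parameter group change has taken effect. A sketch covering this setting and the two parallelism ones:

```sql
-- value is the configured value; value_in_use is what is currently active.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name IN (N'max degree of parallelism',
               N'cost threshold for parallelism',
               N'optimize for ad hoc workloads');
```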

Instant File Initialization

This setting means that when files grow the new space can be allocated immediately without taking time to fill the space with zeros.

This cannot be enabled in RDS, so it is of extra importance to size databases appropriately to avoid auto-growth where possible.
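
As a rough sketch of what sizing appropriately might look like, you can review the current size and growth settings of a database’s files and then pre-size them. The database and logical file names here are hypothetical, and I’m assuming your login is permitted to run ALTER DATABASE ... MODIFY FILE on the instance.

```sql
-- Run in the database you want to check. size and growth are stored
-- in 8 KB pages, unless growth is a percentage.
SELECT name,
       type_desc,
       CAST(size AS bigint) * 8 / 1024 AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(10)) + '%'
            ELSE CAST(CAST(growth AS bigint) * 8 / 1024 AS varchar(20)) + ' MB'
       END AS growth_setting
FROM sys.database_files;

-- Hypothetical names and sizes: grow the data file once, up front,
-- rather than relying on many small auto-growth events.
ALTER DATABASE [SalesDb]
MODIFY FILE (NAME = N'SalesDb_Data', SIZE = 20GB, FILEGROWTH = 1GB);
```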

Extended Events vs Profiler

Unfortunately, Extended Events is not available in RDS, so if you wish to trace events you must use traditional traces/profiler.

Note that when setting up a trace against a busy instance you should create a server-side trace to minimize the impact on performance.

That’s the list of key differences I’ve noticed so far in working with RDS to provision new SQL instances.

What stays the same

There are a few other things you may be wondering if you still have control over (I know I was). So just to confirm, you can still do the following:

Configure multiple files for TempDB
Use either SQL or Windows authentication (or mixed)
Schedule jobs with SQL Server Agent
Run DBCC CHECKDB
Rebuild indexes and update statistics

Please comment on this post if you notice any other significant differences you think people should be aware of – or if you notice updates to AWS that make any of these points invalid.

RDS encryption vs TDE

April 17, 2018 / February 18, 2021 ~ Matthew McGiffen

If you’re starting to use cloud services, the number of options available can be confusing. Particularly when they may seem to do the same thing.

If you’re on AWS, and you want to create a SQL Server instance on RDS (Relational Database Service), then you potentially have a couple of different options for enabling encryption at rest.

If you’re deploying an Enterprise Edition SQL Server instance then you could use TDE (Transparent Data Encryption), the technology most of us in the SQL Server world already have some awareness of.

RDS also has its own at-rest encryption though, so what’s the difference?

The answer is that (at least in terms of what they protect) they are pretty much equivalent. RDS encryption can be used with whichever database platform you choose to use – Aurora, MySQL, MariaDB, PostgreSQL, Oracle or SQL Server. As many people want this functionality it made sense for Amazon to provide it.

Only Oracle and SQL Server have their own built-in equivalent – TDE – so in these cases you have a choice of which one you want to use. You may prefer to use the one you are familiar with from your on-premise deployments, or you may prefer to go with the RDS one.

A key difference is going to be database backups. RDS handles backups for you, and of course these backups will be encrypted whichever option you choose. However, you would be wise to also take your own set of backups which you store outside the RDS instance. With RDS encryption these backups would not be encrypted, however with TDE they would be.

Even this though is not a killer point. With SQL Server (from 2014 onward) you have backup encryption, so even if you were using RDS encryption, you could use this to make sure that externally stored backups were also encrypted.

A big reason you might want to use the RDS encryption is price. TDE is only available on SQL Server Enterprise Edition, whereas you can use RDS encryption on Standard Edition also.

UPDATE (03/12/2019): From SQL Server 2019 TDE is available in standard edition, so price is no longer a factor.

The difference in cost for SQL Server Standard vs Enterprise licences is significant. Last time I checked, the list price was around $2,000 per CPU core for Standard, but $7,000 for Enterprise.

The price difference in RDS is also large. If I look at a “db.m4.xlarge” instance, which is 4 virtual CPUs and 16GB RAM, then the price quoted is roughly $750 a month for SQL Server Standard Edition and $1,650 for Enterprise.

Of course, there are differences between each technology in terms of how you set it up and manage it, how and where the keys are stored etc. But if you’re confronted with the choice, then mostly you can just pick the one you prefer; it doesn’t really matter.

What does matter is that if you prefer not to pay for Enterprise Edition, then you still have the option of at-rest encryption. Which is great news.