Statistics are vital in allowing SQL Server to execute your queries in the most performant manner. Having a deep understanding of how the SQL Server Optimizer interacts with statistics really helps when you are performance tuning.
One thing that can be useful when looking at an execution plan is to understand which statistics objects the optimizer used to come up with the plan. In this post we look at how that can be achieved using the undocumented trace flag 8666, which can be used to save internal debugging information into the plan XML – including details of the statistics objects used.
There are actually a few other trace flags that do similar things but this seems the simplest and the one that works across the most versions of SQL Server. It looks like it was introduced in SQL 2008 and works on all versions up to and including SQL 2016 – but it doesn’t work on SQL 2005.
Note that this technique only works for plans generated once the trace flag is on, so you can't view the additional information for existing plans in your cache.
Here’s a quick example using it against the AdventureWorks2012 database. You need to make sure you’ve selected the option to show the actual plan, then run the query as below:
--Turn the trace flag on
DBCC TRACEON (8666);

--Run the query
SELECT *
FROM Person.Person
WHERE LastName LIKE 'Smith%';

--Turn the trace flag off again
DBCC TRACEOFF (8666);
This query produces the following plan:
If you right-click over the SELECT operator and select properties you see the following:
You’ll notice an extra field “Internal Debugging Information” which contains a bunch of unformatted XML. Rather than grabbing that string and formatting it to make it readable, you can right-click back on the plan itself and select “Show Execution Plan XML…” to view the same information in a more friendly format.
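If you prefer to grab the plan XML programmatically rather than through SSMS, you can pull it from the plan cache DMVs. This is just a sketch – the LIKE filter on the query text is an assumption to narrow the results to our example query, and remember you will only see the extra debugging information for plans compiled while the trace flag was on:

```sql
--Retrieve cached plan XML for queries referencing Person.Person
SELECT st.text, qp.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.text LIKE '%Person.Person%';
```

Clicking the query_plan value in the results grid opens the XML, which you can then search for the ModTrackingInfo nodes shown below.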
If you scroll down through the XML generated you will get to the following section:
<ModTrackingInfo>
<Field FieldName="wszStatName" FieldValue="_WA_Sys_00000007_693CA210" />
<Field FieldName="wszColName" FieldValue="LastName" />
<Field FieldName="m_cCols" FieldValue="1" />
<Field FieldName="m_idIS" FieldValue="2" />
<Field FieldName="m_ullSnapShotModCtr" FieldValue="19972" />
<Field FieldName="m_ullRowCount" FieldValue="19972" />
<Field FieldName="ullThreshold" FieldValue="4494" />
<Field FieldName="wszReason" FieldValue="heuristic" />
</ModTrackingInfo>
<ModTrackingInfo>
<Field FieldName="wszStatName" FieldValue="IX_Person_LastName_FirstName_MiddleName" />
<Field FieldName="wszColName" FieldValue="LastName" />
<Field FieldName="m_cCols" FieldValue="1" />
<Field FieldName="m_idIS" FieldValue="7" />
<Field FieldName="m_ullSnapShotModCtr" FieldValue="19972" />
<Field FieldName="m_ullRowCount" FieldValue="19972" />
<Field FieldName="ullThreshold" FieldValue="4494" />
<Field FieldName="wszReason" FieldValue="heuristic" />
</ModTrackingInfo>
Each ModTrackingInfo node displays the information about one statistics object that has been referenced. The wszStatName shows us the name of the Statistics object. We have two in this case, _WA_Sys_00000007_693CA210 and IX_Person_LastName_FirstName_MiddleName. The first is an auto-generated one on the LastName column (I happen to know this was generated before I created the index referenced). Interestingly SQL has looked at both – even though they contain pretty much the same info and one isn’t really required.
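Once you have the names from wszStatName, you can inspect the statistics objects themselves. A quick way is DBCC SHOW_STATISTICS, which shows the header (including when the statistics were last updated and the sample size), the density vector, and the histogram:

```sql
--Inspect the two statistics objects referenced in the plan
DBCC SHOW_STATISTICS ('Person.Person', _WA_Sys_00000007_693CA210);
DBCC SHOW_STATISTICS ('Person.Person', IX_Person_LastName_FirstName_MiddleName);
```

The histogram in particular lets you check whether the distribution of values the optimizer is working from still reflects your data.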
Why is this useful?
Well, apart from simply the interesting aspect of seeing some of what is going on in the background when SQL decides what plan to use for your query, it is often the case with poorly performing queries that there is bad cardinality estimation going on. Sometimes this might be because the statistics aren't accurate and would benefit from being updated. You can use this technique to see which statistics objects are being used for the estimation (where it is not just obvious), and then you can look at the objects themselves and see whether they might want refreshing, or whether there may be other issues with your data – such as a skewed distribution – that make estimates difficult.
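To judge whether a statistics object is stale, you can compare its modification counter against the row count – much like the m_ullSnapShotModCtr and ullThreshold values in the plan XML above. A sketch using sys.dm_db_stats_properties (available from SQL Server 2008 R2 SP2 and SQL Server 2012 SP1 onwards, I believe):

```sql
--Check last update time and modifications since, for all stats on the table
SELECT s.name AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.modification_counter
FROM sys.stats s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE s.object_id = OBJECT_ID('Person.Person');
```

A large modification_counter relative to rows is a good hint that the statistics no longer describe the current data.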
I see a growing opinion in the SQL world that the refreshing of statistics is often more important than rebuilding indexes to reduce fragmentation – the latter operation will also refresh the statistics but is a lot more resource intensive. You will also commonly have statistics on columns that are not indexed, and the threshold for them to be refreshed can be pretty large when you have a lot of data. You can see more information about manually refreshing statistics in my previous posts on the subject:
When do Statistics Get Updated?
Automatic Sample Sizes for Statistics Updates
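If you do decide a refresh is warranted, it can be done for the whole table or for a single statistics object. A minimal sketch (FULLSCAN is more expensive but gives the most accurate histogram; the default is a sampled scan):

```sql
--Refresh all statistics on the table with a full scan
UPDATE STATISTICS Person.Person WITH FULLSCAN;

--Or refresh just one statistics object, using the default sample rate
UPDATE STATISTICS Person.Person IX_Person_LastName_FirstName_MiddleName;
```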
Got a problem or embarking on a SQL Server project and want some help and advice? I’m available for consulting – please get in touch or check out my services page to find out what I can do for you.