This article is contributed. See the original author and article here.
Create/Update statistics allows users to speicfy the sample rate. However, the sample rate may not work as you expected in some scenarios.
1.Tables that have not greater than 1024 pages
I’m going to use the table Production.Product in AdventureWorks2019 to demonstrate
create statistics IProductID on Production.Product(ProductID) with sample 20 percent
In this script, Sample rate is set to 20%. However, the ‘DBCC how_statistics’ shows that the ‘Rows’ equals to ‘Rows Sampled’, which means it’s 100% sampled.
Because for table with less than 1024 pages in the clustered index(if the table is heap, we count the indexid 0) of table, SQL Server ignores the sample specified and always use 100% sampled.
In this case, the Prodcution.Product only has 15 pages, hence it’s always 100% sampled.
Please note, sample 0 is an exception. If you specify 0, SQL Server does not create histogram.
2.Tables that have more than 1024 pages
SQL Server guarantee that at least 1024 pages will be sampled. If the sample rate specified is less than 1024, SQL Server will replace it with 1024 pages. If it’s greater than 1024 pages, SQL Server will use following formula as sample rate:1024/Total Pages.
3.What if sample rate is not specified?
If the pages is greater than 1024, SQL Server picks up from smaller one from following two
For large table, the rows in the 1024 pages can be ignored.
A table has 1,000,000, then 15*power(1000000.0,0.55) =29929 rows will be sampled, almost 29929/1000000=2.9%
A table has 10,000,000, then 15*power(10000000.0,0.55) =106192 rows will be sampled, almost 106192/1000000=1.06%
Brought to you by Dr. Ware, Microsoft Office 365 Silver Partner, Charleston SC.