Hi Team, Eric Jansen here – I’m a Platforms Customer Engineer with some interesting scenarios that I’ll be sharing over the coming months, mostly revolving around the topic of On-Premises DNS. Today’s topic will be a very specific scenario regarding DNS Policy. DNS Policy has been around since the debut of Windows Server 2016 and it was a massive leap forward in functionality for Windows On-Prem DNS. There’s plenty of content regarding that topic, so that’s not what I’ll be talking about today, but if you want an overview, check out the following:
https://docs.microsoft.com/en-us/windows-server/networking/dns/deploy/dns-policy-scenario-guide
More specifically, today’s topic is the removal of DNS Query Resolution Policies (from here forward ‘QRPs’, since ‘Query Resolution Policies’ is a lot of typing) at large scale. To provide some context, a number of my customers use DNS as one of their methods for blocking clients from reaching unwanted domains, whether those domains have been identified as malicious, go against corporate policy, or are blocked for some other reason. Some of these customers have hundreds of thousands of policies to block hundreds of thousands of domains. So when they first try to remove all of those policies during their initial testing, the inevitable question comes up: why is it so ridiculously slow to remove my Query Resolution Policies!?
The answer to that is… Well, it depends. It depends on how you attempted to remove them.
With that said, let’s look at a scenario in the lab using a domain joined member server that has the DNS Role installed, where customer X is trying to remove, let’s say 50,000 QRPs. Below are some options and we’ll measure the time it takes for each option to complete, but first a Pro Tip:
Pro Tip: In case this is your first rodeo – always test in a lab first and have a thorough understanding of what you’re doing before making production changes. When you do make your changes, in your lab and in production, have a backout plan.
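For anyone who wants to time the options below in their own lab, here is a minimal sketch built on Measure-Command. The article doesn't say how the timings were captured, so treat this as one possible approach only. The Measure-Option helper is a hypothetical name, not something from the original post, and the demo workload is a stand-in for a real removal pipeline.

```powershell
# Hypothetical helper (not from the original article): times any script block.
function Measure-Option {
    param(
        [Parameter(Mandatory)] [string]      $Name,
        [Parameter(Mandatory)] [scriptblock] $Action
    )
    # Measure-Command runs the block and returns the elapsed time as a TimeSpan
    $elapsed = Measure-Command -Expression $Action
    [pscustomobject]@{
        Option  = $Name
        Elapsed = $elapsed
    }
}

# Stand-in workload so the sketch runs anywhere; in a lab, the script block
# would hold one of the removal pipelines from this article instead.
$result = Measure-Option -Name 'Demo' -Action { Start-Sleep -Milliseconds 200 }
'{0}: {1}' -f $result.Option, $result.Elapsed
```

Wrapping each option in the same helper keeps the comparison consistent, since every run is measured the same way.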
Option 1:
Get-DNSServerQueryResolutionPolicy | Remove-DNSServerQueryResolutionPolicy -Force
Sounds logical to me…but unfortunately this doesn’t work for all scenarios, especially if you have a very large number of domains.
Eventually, at least in all of my testing, it’ll fail and throw an exception ID of Win32 167. If you dig into the exception further ($Error[0].Exception.ErrorData), it translates to “Unable to lock a region of a file.” This can happen because a timeout thread that re-arranges the policies can block the addition or creation of policies with a read lock when too many changes are being made at once.
Timer: N/A – Fail..
Yeah, so it sounds like more than one thing is trying to be changed at the same time and it’s not happy. I know, this is not fun for anyone. OK, let’s throttle it back a bit, quite literally.
Option 2:
Get-DNSServerQueryResolutionPolicy | Remove-DNSServerQueryResolutionPolicy -Force -ThrottleLimit 1
OK, so we start to look at additional parameters for Remove-DNSServerQueryResolutionPolicy and find the -ThrottleLimit parameter that shows the following description:
“Specifies the maximum number of concurrent operations that can be established to run the cmdlet. If this parameter is omitted or a value of 0 is entered, then Windows PowerShell® calculates an optimum throttle limit for the cmdlet based on the number of CIM cmdlets that are running on the computer. The throttle limit applies only to the current cmdlet, not to the session or to the computer.”
So, if you have concurrent operations happening, but one change needs to complete before another operation can occur, maybe this will help? Yep. With a ThrottleLimit value of 1, the policies are processed in the order they enter the pipeline, so the removals are essentially serialized. There’s no need to collect the data first and then loop over it (for example, getting all the policies and removing each one in a foreach loop).
OK, so, let’s see what happens.
Well played…. It now works with no exceptions being thrown! But the excitement wears off pretty quickly, and you’re learning the hard way that unfortunately, now it takes forever. So now instead of throwing exceptions you’re ready to throw your keyboard in frustration. It took 31 hours, 24 minutes and 28 seconds to complete…. That’s no good, so let’s see if we can figure something else out.
Option 3:
Get-DNSServerQueryResolutionPolicy | Sort-Object ProcessingOrder -Descending |
Remove-DNSServerQueryResolutionPolicy -Force
We have now outsmarted the system! But why is this faster??
Well, consider the following. When you do a Get-DNSServerQueryResolutionPolicy you’ll notice that it returns the list sorted by processing order. OK, so what happens if I remove the QRP that is assigned to ‘ProcessingOrder’ 1? Yep, the other 49,999 policies are now moved up in processing order, and then the next policy is removed, which now has ‘ProcessingOrder’ 1, so the remaining 49,998 policies have to have their ‘ProcessingOrder’ modified, and so on, until they’re all gone. OK, so let’s do the math on that for 50,000 QRPs that need to get removed using ‘Option 2’.
After about six hours of using calc.exe, my fingers started to get tired of entering the numbers into the calculator (49,999 + 49,998 + 49,997 + 49,996 + 49,995, etc.), so I decided that I’d figure it out with PowerShell instead. (OK, I may have lied – the thought of entering 50,000 values into the calculator never once crossed my mind.)
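# Count how many ProcessingOrder updates 'Option 2' triggers for 50,000 QRPs
$ProcessingOrder = 1..50000
$i = 0   # policies removed so far
$j = 0   # running total of ProcessingOrder changes
foreach ($Order in $ProcessingOrder) {
    $i++
    # Removing the first policy renumbers every policy that still remains
    $j += $ProcessingOrder.Count - $i
}

Write-Host "Number of Processing Order Changes that will need to take place with 'Option 2':"
Write-Host "$('{0:N0}' -f $j)" -ForegroundColor Red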
Running the snippet of code above reports 1,249,975,000 processing order changes.
So, if the calculation that my code does above is correct, then…that’s a lot of changes that need to take place, and that is the answer to why it’s so ridiculously slow.
Option 3 on the other hand removes the policies in reverse processing order, so that no other policies need to have the ‘ProcessingOrder’ value modified for them – I’m sure most admins would prefer this methodology, of only having to make 50 thousand changes (just deletes), vs. just shy of 1.25 BILLION processing order changes…plus the 50 thousand deletes.
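The same figure can be cross-checked without a loop. Assuming the model described above, where removing the first policy renumbers every policy that remains, the total is just the sum of the integers from 0 to n - 1. Here n is the number of QRPs, and the symbols n, i and m are introduced only for this illustration:

```latex
\[
\sum_{i=1}^{n} (n - i) \;=\; \sum_{m=0}^{n-1} m \;=\; \frac{n(n-1)}{2},
\qquad
n = 50{,}000 \;\Rightarrow\; \frac{50{,}000 \times 49{,}999}{2} = 1{,}249{,}975{,}000 .
\]
```

The total grows with the square of the policy count, so under this model doubling the number of QRPs roughly quadruples the renumbering work when removing in ascending order.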
Now you may have noticed that I did leave the -ThrottleLimit parameter off of Remove-DNSServerQueryResolutionPolicy, so it’s technically still removing more than one at a time, and not necessarily in the exact reverse order, as is evident from the Audit log. Option 2, however, was going exactly in ascending order, and it would take on average 5 seconds to delete each policy in the beginning (based on the audit log timestamps).
Just because it’s you guys though, and because I know you’re curious, I shimmed in an Option 3.5, just to test the time of doing it in reverse order, but this time setting the throttle limit to 1.
Option 3.5:
Get-DnsServerQueryResolutionPolicy | Sort-Object ProcessingOrder -Descending |
Remove-DnsServerQueryResolutionPolicy -Force -ThrottleLimit 1
Well, it’s not going to win any speed records at the racetrack, but it’s not that bad. I’ll take this over ‘Option 2’ all day long.
But WHAT IF….we could do one better?? WHAT IF, you really had that need for speed? Well team, let me introduce you to the high speed, low drag, ‘Option 4’.
Option 4:
Stop-Service DNS
Get-Item 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\DNS Server\Policies' |
Remove-Item -Recurse -Force
Start-Service DNS
From a speed perspective we have a winner…but you have to take the service down, so I’m not a fan of that, but if you have a lot of resolvers, it may not matter to you.
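Since Option 4 deletes the registry key outright, the backout plan from the Pro Tip arguably matters most here. One possible approach, offered as a rough sketch rather than a tested procedure, is to export the key with reg.exe before removing it. The backup file path is hypothetical, the key path is assumed to be the one Option 4 targets, and whether importing the export restores every policy cleanly is an assumption to verify in a lab.

```powershell
# Assumption: the same key that Option 4 removes.
# reg.exe expects HKLM\... rather than the HKLM:\ PowerShell drive syntax.
$policyKey  = 'HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\DNS Server\Policies'

# Hypothetical backup location; the folder must already exist.
$backupFile = 'C:\Temp\DnsPolicies-Backup.reg'

# Export the key to a .reg file before deleting anything (/y overwrites an existing file).
reg.exe export $policyKey $backupFile /y

# To back out later (assumed flow, verify in a lab first):
# Stop-Service DNS
# reg.exe import $backupFile
# Start-Service DNS
```

Running the export right before Stop-Service DNS keeps the backup as close as possible to the state being removed.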
Specs:
For those that are interested, the testing was done on a Windows Server 2019 (Ver 1809) VM that was running on a Windows Server 2019 (Ver 1809) Hyper-V box. The VM has the following hardware configuration:
The Hyper-V host is an old Dell R610.
Side note: The test scenarios above were more memory intensive than CPU intensive. I just used an existing DNS server that I had lying around, and it just happened to have 24 CPUs already on it. Regardless of the numbers that I posted for the test scenarios above, everyone’s mileage will vary, and the point of the article wasn’t to give exact numbers, but to show the difference in time that it takes depending on the approach that’s taken to remove the QRPs.
Until next time..
Disclaimer:
The content above contains a sample script. Sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.