by Contributed | Jan 20, 2023 | Technology
This article is contributed. See the original author and article here.
How to Use TSSv2 to Collect and Analyze Data to Solve High CPU Issues.
Hello everyone, this is Denzel Maxey with the Windows Performance Team. I found a tool that actively collects different data based on scenarios and streamlines the data collection process. Drumroll – introducing TSSv2 (Troubleshooting Support Script). In my job, I see a lot of High CPU cases and collecting an ETL trace using TSSv2 with Xperf aka WPR for high CPU has been fundamental in resolving issues.
I’d like to share some instructions, methods, and general insight on these tools that should empower IT professionals to resolve issues. This post will show how the TSSv2 tool can work with the Windows Performance Recorder. TSSv2 is a very powerful tool that works with several other utilities, but here I will focus on collecting a WPR trace using TSSv2 for a High CPU case. I can even give you a great clue as to how to collect data for intermittent high CPU cases as well! Once you have the data, I’ll then show you how to analyze it. Lastly, I’ll provide some additional resources on WPA analysis for high CPU.
Data Collection Tools:
TSSv2
TSSv2 (TroubleShootingScript Version 2) is a code signed, PowerShell based Tool and Framework for rapid flexible data collection with a goal to resolve customer support cases in the most efficient and secure way. TSSv2 offers an extensible framework for developers and engineers to incorporate their specific tracing scenarios.
WPR/Xperf
“Windows Performance Recorder (WPR) is a performance recording tool that is based on Event Tracing for Windows (ETW). The command line version is built into Windows 10 and later (Server 2016 and later). It records system and application events that you can then analyze by using Windows Performance Analyzer (WPA). You can use WPR together with Windows Performance Analyzer (WPA) to investigate particular areas of performance and to gain an overall understanding of resource consumption.”
*Xperf is strictly a command line tool, and it can be used interchangeably with the WPR tool.*
_________________________________________________________________________________________________________________________________________________
Let’s Dig in!
You notice your server or device is running at 90% CPU. Your users are complaining of latency and poor performance. You have checked Task Manager, Resource Monitor, or even downloaded and opened Process Explorer, but there is still no exact root cause glaring you in the face. No worries, a WPR trace will break down the high CPU processes a bit more. You could even skip straight to this step in the future once you get comfortable working with these tools.
Setup TSSv2
Running a TSSv2 troubleshooting script with the parameters for either WPR or Xperf gathers granular performance data on machines showing the issue. In the example below, I’m saving the TSSv2 script to D:\ (note the default data location is C:\MS_Data). In your web browser, download TSSv2.zip from http://aka.ms/getTSSv2, or open an administrative PowerShell prompt and paste the following commands.
The commands below will automatically prepare the machine to run TSSv2 by taking the following actions in the given order:
- Create the D:\TSSv2 folder
- Set the PowerShell script execution policy to RemoteSigned for the Process level (process level changes only affect the current PowerShell window)
- Set TLS type to 1.2 and download the TSSv2 zip file from Microsoft
- Expand the TSSv2.zip file into the D:\TSSv2 folder
- Change to the D:\TSSv2 folder
Ex: Commands used below
md D:\TSSv2
Set-ExecutionPolicy -Scope Process -ExecutionPolicy RemoteSigned -Force
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
Start-BitsTransfer https://aka.ms/getTSSv2 -Destination D:\TSSv2\TSSv2.zip
Expand-Archive -LiteralPath D:\TSSv2\TSSv2.zip -DestinationPath D:\TSSv2 -Force
cd D:\TSSv2
The result will be a folder named TSSv2 on drive D.

_________________________________________________________________________________________________________________________________________________
Gathering Data using TSSv2
Open an elevated PowerShell window (or start PowerShell with elevated privileges) and change the directory to this folder:
cd D:\TSSv2
*WARNING* Data collection grows rather large quickly. You should have free disk space equal to at least 30% of your overall RAM. (For example, if you have 8 GB of RAM, the file can grow to roughly 2.5 GB or larger in C:\MS_Data.)
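As a quick illustrative check of that rule of thumb (this snippet is plain arithmetic, not part of TSSv2):

```python
def required_free_space_gb(ram_gb: float) -> float:
    """Rule of thumb from the warning above: keep free disk space of at
    least 30% of installed RAM available for the trace output."""
    return 0.3 * ram_gb


# An 8 GB machine should keep roughly 2.4 GB (or more) free for C:\MS_Data.
print(required_free_space_gb(8))
```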
What are some of the scenarios you might have? Maybe you want to manually collect the trace. Or, once you start the trace, let it automatically stop.
How about limiting the file size? There are several parameters you can adjust for your needs.
Below you will find variations of using TSSv2 to collect WPR data in high CPU occurrences. You have an option of using either WPR or Xperf commands. Please review all of them before deciding which trace to take for your environment.
1. Scenario In State: The issue is currently occurring, and the following example needs user intervention to stop the trace. With the example commands listed below, the WPR trace can grow to 80% of memory.
.\TSSv2.ps1 -WPR General *** (run it for 60 seconds to no longer than 3 minutes)
.\TSSv2.ps1 -Xperf CPU *** (run it for 60 seconds to no longer than 3 minutes)
Default location of saved data will be C:\MS_Data.
The prompt will tell you when to reproduce the issue; simply entering “Y” will END the trace at that time, and the machine experiencing high CPU will then finish running the data collection.

2. Scenario In State, limiting the size and length of the trace: The issue is currently occurring; the following example does NOT need user intervention to stop the trace. Default location of saved data will be C:\MS_Data. The Xperf file can grow to 4 GB and the trace runs for 5 minutes with the settings below:
.\TSSv2.ps1 -Xperf CPU -XperfMaxFileMB 4096 -StopWaitTimeInSec 300
Note: you can modify the size and length of the trace by increasing or decreasing -XperfMaxFileMB and -StopWaitTimeInSec when it is initially run.
3. Scenario In State, limiting size and length, with data saved to Z:\Data instead of the default C: drive: The issue is currently occurring; the following example does NOT need user intervention to stop the trace. The Xperf file can grow to 4 GB and the trace runs for 5 minutes with the settings below, and this time the resulting data will be saved to Z:\Data. You simply need to add -LogFolderPath Z:\Data to the command.
.\TSSv2.ps1 -Xperf CPU -XperfMaxFileMB 4096 -StopWaitTimeInSec 300 -LogFolderPath Z:\Data
4. Scenario Intermittent High CPU, when you are having a tough time capturing data: This command waits for the CPU to reach 90%, then starts a trace that stops the file from growing larger than 4 GB while running for 5 minutes.
.\TSSv2.ps1 -Xperf CPU -WaitEvent HighCPU:90 -XperfMaxFileMB 4096 -StopWaitTimeInSec 300
5. Scenario Intermittent High CPU, with a shorter capture window: This command waits for the CPU to reach 90%, then starts a trace that stops after 100 seconds (a little over a minute and a half), using the default maximum file size.
.\TSSv2.ps1 -Xperf CPU -WaitEvent HighCPU:90 -StopWaitTimeInSec 100
Pro Tip: You can check for additional Xperf/WPR commands by searching the help files in TSSv2: type
.\TSSv2.ps1 -help at the prompt. When prompted to enter a number or keyword, type xperf or wpr, press Enter, and you will see the options.
Ex: Finding help with keyword ‘xperf’

Be sure to wait for the TSS script to finish; it can take some time (even an hour) to finish writing out the data. PowerShell will return to the prompt, and the folder in C:\MS_Data should zip itself when complete. The location of the script does not determine the location of the data collected. Wait for the trace to finish before exiting PowerShell.
Reminder: Just like in the first trace, data collection grows rather large quickly. You should have free disk space equal to at least 30% of your overall RAM. (For example, if you have 8 GB of RAM, the file can grow to roughly 2.5 GB or larger in C:\MS_Data.)
_________________________________________________________________________________________________________________________________________________
You have the Data – Now Let’s look at it!
Download the Windows ADK (Windows Assessment and Deployment Kit) from this location: Download and install the Windows ADK | Microsoft Learn. Once you download the Windows ADK, you want to install the Windows Performance Toolkit. Double click on the executable (.exe) to start the installation process.
Uncheck everything except Windows Performance Toolkit, then click Install.

Opening the data in the C:\MS_DATA folder
When complete, the WPR general TSSv2 command should have placed all collected data into a zipped file in this folder. You will know the trace ran all the way through without stopping prematurely when you see the zipped file in C:\MS_DATA. There will also be a message in the PowerShell window when the diagnostic completes, stating the name and location of the zipped file.

You will need to unzip the zipped file to analyze the WPR trace (.etl file). After unzipping, you will see several data collections that can be helpful with analysis. However, what you mainly want to look at is the .etl file which is usually the biggest file located in the folder.

If you double click the .ETL file it should open in WPA, but if not, you can manually open the newly installed application and navigate to your file.
Example:

You can open the .ETL file to view the WPR trace with WPA (Windows Performance Analyzer) by clicking File, Open and then browsing to the file that ends with the .ETL extension.
Step 1. Open WPR trace in WPA and load the Public Symbols. You may also see symbols listed from the NGEN folder (NGEN is part of the folder name) collected at the time the WPR trace was run.
Select Trace, select Configure Symbol Paths

Click the + sign (highlighted in yellow in the screenshot below), then enter the public symbol path: srv*C:\symbols*https://msdl.microsoft.com/download/symbols

More Information: (Symbol path for Windows debuggers – Windows drivers | Microsoft Learn)
Once symbols are configured, simply click Load Symbols.

Step 2. Once open, you should see a window similar to the screenshot below. Expand Computation on the left and drag CPU Usage (Precise) to the right side of the window to load it. You can also double-click CPU Usage (Precise) for it to appear on the right side.

You will then see a region on the top graph labeled “Trace Rundown”. That part is not needed, as it is the portion of the trace where the script was finishing up. To get rid of the trace rundown, highlight the area before it, right-click, then select “Zoom”.

You can now filter each of your processes down deeper and deeper to locate a potential root cause of what is spiking the CPU. Look at which processes have the highest weight in the right-hand columns to help pinpoint the highest consumers. It may be a specific kernel driver, application, process, etc., but this should point you in the right direction of which process is exhausting resources.
These are the columns you will want to focus on:
Left of Gold Bar:
New Process
New Thread ID
New Thread Stack
Right of Gold Bar:
Waits (us) Sum
Waits (us) Max
Count:Waits Sum
%CPU Usage Sum

You can see that CPU usage is highest due to CPUSTRESS.EXE in this example. As you filter down, you can see the threads that contribute to the CPU spike, which sum up to the top-level CPU usage number. This can be helpful for finding out which threads, functions, and modules are implicated in the root cause.
Conclusion:
Once again, this is not the only use for the TSSv2 tool. But as you can see, a very detailed WPR/Xperf trace can be gathered with a simple PowerShell command. This can be very efficient for troubleshooting. This article is not meant to cover all scenarios; however, I highly recommend taking some time to learn more about what TSSv2 can accomplish, as this tool will only continue to get better.
If at any point you get stuck don’t hesitate to open a support case with Microsoft.
Additional Information:
Information on TSSv2 and alternative download site:
https://docs.microsoft.com/en-us/troubleshoot/windows-client/windows-troubleshooters/introduction-to-troubleshootingscript-toolset-tssv2
Information about Windows Performance Toolkit
Windows Performance Toolkit | Microsoft Learn
For Reference:
Download Windows Assessment Toolkit which contains Windows Performance Analyzer
Download and install the Windows ADK | Microsoft Learn
How to setup public symbols
Symbol path for Windows debuggers – Windows drivers | Microsoft Learn
by Contributed | Jan 18, 2023 | Technology
Postgres is one of the most widely used databases and supports a number of operating systems. When you are writing code for PostgreSQL, it’s easy to test your changes locally, but it can be cumbersome to test it on all operating systems. A lot of times, you may encounter failures across platforms and it can get confusing to move forward while debugging. To make the dev/test process easier for you, you can use the Postgres CI.
When you test your changes on CI and see it fail, how do you proceed to debug from there? As a part of our work in the open source Postgres team at Microsoft, we often run into CI failures—and more often than not, the bug is not obvious, and requires further digging into.
In this blog post, you’ll learn about techniques you can use to debug PostgreSQL CI failures faster. We’ll be discussing these 4 tips in detail:
- Connect to the CI environment with a terminal
- Enable build-time debug options and use them on CI
- Gather Postgres logs and other files from CI runs
- Run specific commands on failure
Before diving into each of these tips, let’s discuss some basics about how Postgres CI works.

Introduction to the PostgreSQL CI
PostgreSQL uses Cirrus CI for its continuous integration testing. To use it for your changes, Cirrus CI should be enabled on your GitHub fork. The details on how to do this are in my colleague Melih Mutlu’s blog post about how to enable the Postgres CI. When a commit is pushed after enabling CI, you can track and see the results of the CI run on the Cirrus CI website. You can also track it in the “Checks” tab on GitHub.
Cirrus CI works by reading a .cirrus.yml file from the Postgres codebase to understand the configuration with which a test should be run. Before we discuss how to make changes to this file to debug further, let’s understand its basic structure:
# A sequence of instructions to execute and
# an execution environment to execute these instructions in
task:
  # Name of the CI task
  name: Postgres CI Blog Post
  # Container where CI will run
  container:
    # Container configuration
    image: debian:latest
    cpu: 4
    memory: 12G
  # Where environment variables are configured
  env:
    POST_TYPE: blog
    FILE_NAME: blog.txt
  # {script_name}_script: instruction to execute commands
  print_post_type_script:
    # commands to run at the script instruction
    - echo "Will print POST_TYPE to the file"
    - echo "This post's type is ${POST_TYPE}" > ${FILE_NAME}
  # {artifacts_name}_artifacts: instruction to store files and expose them in the UI for downloading later
  blog_artifacts:
    # Paths of files, relative to Cirrus CI's working directory
    paths:
      - "${FILE_NAME}"
    # Type of the files that will be stored
    type: text/plain
Figure 1: Screenshot of the Cirrus CI task run page. You can see that it ran the script and artifacts instructions correctly.
Figure 2: Screenshot of the log file on Cirrus CI. The gathered log file is uploaded to the Cirrus CI.
As you can see, the echo commands are run at the script instruction. Environment variables are configured and used in the same script instruction. Lastly, the blog.txt file is gathered and uploaded to Cirrus CI. Now that we understand the basic structure, let’s discuss some tips you can follow when you see CI failures.
Tip #1: Connect to the CI environment with a terminal
When Postgres is working on your local machine but you see failures on CI, it’s generally helpful to connect to the environment where it fails and check what is wrong.
You can achieve that easily using the RE-RUN with terminal button on the CI. Typically, a CI run can take time, as it needs to find available resources to start and rerun instructions. However, thanks to this option, that time is saved because the resources are already allocated.
After the CI’s task run is finished, there is a RE-RUN button on the task’s page.
Figure 3: There is an arrow on the right of the RE-RUN button, if you press it the “Re-Run with Terminal Access” button will appear.
You may not have noticed it before, but there is a small arrow on the right of the RE-RUN button. When you click this arrow, the “Re-Run with Terminal Access” button will appear. When this button is clicked, the task will start to re-run and shortly after you will see the Cirrus terminal. With the help of this terminal, you can run commands on the CI environment where your task is running. You can get information from the environment, change configurations and re-test your task.
Note that the re-run with terminal option is not available for Windows yet, but there is ongoing work to support it.
Tip #2: Enable build-time debug options and use them on CI
Postgres and meson provide additional build-time debug options to generate more information to find the root cause of certain types of errors. Some examples of build options which might be useful to set are:
-Dcassert=true [defaults to false]: Turns on various assertion checks. This is a debugging aid. If you are experiencing strange problems or crashes you might want to turn this on, as it might expose programming mistakes.
-Dbuildtype=debug [defaults to debug]: Turns on basic warnings and debug information and disables compiler optimizations.
-Dwerror=true [defaults to false]: Treat warnings as errors.
-Derrorlogs=true [defaults to true]: Whether to print the logs from failing tests.
While building Postgres with meson, these options can be set with the meson setup command at initial configuration time, or changed later with the meson configure command.
These options can either be enabled with the “re-running with terminal access” option or by editing the cirrus.yml config file. Cirrus CI has a script instruction in the .cirrus.yml file to execute a script. These debug options could be added to the script instructions in which meson is configured. For example:
configure_script: |
  su postgres <<-EOF
    meson setup \
      -Dbuildtype=debug \
      -Dwerror=true \
      -Derrorlogs=true \
      -Dcassert=true \
      ${LINUX_MESON_FEATURES} \
      -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
      build
  EOF
Once it’s written as such, the debug options will be activated next time CI runs. Then, you can check again if the build fails and investigate the logs in a more detailed manner. You may also want to store these logs to work on them later. To gather the logs and store them, you can follow the tip below.
Tip #3: Gathering Postgres logs and other files from CI runs
Cirrus CI has an artifact instruction to store files and expose them in the UI for downloading later. This can be useful for analyzing test or debug output offline. By default, Postgres’ CI configuration gathers log, diff, regress log, and meson’s build files—as can be seen below:
testrun_artifacts:
  paths:
    - "build*/testrun/**/*.log"
    - "build*/testrun/**/*.diffs"
    - "build*/testrun/**/regress_log_*"
  type: text/plain
meson_log_artifacts:
  path: "build*/meson-logs/*.txt"
  type: text/plain
If there are other files that need to be gathered, another artifact instruction could be written or the current artifact instruction could be updated at the .cirrus.yml file. For example, if you want to collect the docs to review or share with others offline, you can add the instructions below to the task in the .cirrus.yml file.
configure_script: su postgres -c 'meson setup build'
build_docs_script: |
  su postgres <<-EOF
    cd build
    ninja docs
  EOF
docs_artifacts:
  path: build/doc/src/sgml/html/*.html
  type: text/html
Then, the collected docs will be available on the Cirrus CI website in HTML format.
Figure 4: Screenshot of the uploaded logs on the Cirrus CI task run page. Logs are uploaded to the Cirrus CI and reachable from the task run page.
Tip #4: Running specific commands on failure
Apart from the tips mentioned above, here is another tip you might find helpful. At times, we want to run some commands only when we come across a failure. This might be to avoid unnecessary logging and make CI runs faster for successful builds. For example, you may want to gather the logs and stack traces only when there is a test failure. The on_failure instruction helps to run certain commands only in case of an error.
on_failure:
  testrun_artifacts:
    paths:
      - "build*/testrun/**/*.log"
      - "build*/testrun/**/*.diffs"
      - "build*/testrun/**/regress_log_*"
    type: text/plain
  meson_log_artifacts:
    path: "build*/meson-logs/*.txt"
    type: text/plain
In the example above, the logs are gathered only in case of a failure.
Making Postgres Debugging Easier with CI
While working on multi-platform databases like Postgres, debugging issues can often be difficult. Postgres CI makes it easier to catch and solve errors since you can work on and test your changes on various settings and platforms. In fact, Postgres automatically runs CI on every commitfest entry via Cfbot to catch errors and report them.
These 4 tips for debugging CI failures should help you speed up your dev/test workflows as you develop Postgres. Remember to connect to the CI environment with the terminal, enable build-time debug options on CI, gather logs and files from CI runs, and run specific commands on failure. I hope these tips will make Postgres development easier for you!
by Contributed | Jan 17, 2023 | Technology
Authentication is a key step in the user journey of any application, yet designing the authentication flow can be confusing and far from straightforward. When load testing an application, authentication is generally the first step in the user journey. Supplying client credentials through a UI is not possible during a load test, and evaluating how to implement the specific authentication flows available on Azure can be tedious and time consuming as well.
Within this series, we will cover the authentication flows and scenarios that are possible with Azure Active Directory (Azure AD) as the identity provider.
At the end of the blog, you will be able to:
- Use Azure AD to Authenticate a web application hosted on Azure App Service using the client credential grant flow.
- Parametrize the client credentials in JMeter to retrieve them at run-time in Azure Load Testing.
Prerequisites
- A webapp with authentication enabled with Azure AD.
- An Azure Load Testing resource.
- Azure Key Vault for storing secrets.
- Azure Load Testing resource configured to fetch the secrets during runtime. Visit here to learn how to do it.
- JMeter
Authenticating to your web app with a shared secret
When you use a shared secret to authenticate to an application, you essentially present yourself as a trusted principal with a valid token that authenticates you to the application registered with Azure Active Directory. The token helps establish trust that you can access and make modifications to the resource (the application).
- To get the access token from Azure AD, we need to pass 4 parameters:
- client_id
- client_secret
- grant_type
- and the tenant_id
For more information, see authentication using a shared secret.
- Retrieve the client_id and tenant_id for the app registered with Azure AD by going to Azure Active Directory >> App Registrations >> Overview in the Azure portal.
- Retrieve the client_secret for the app by clicking on Certificate & secrets >> Client Secrets
The best practice is to store the above parameters into Azure Key Vault and then fetch them directly at runtime instead of hard coding them into the script.
Fetching the client secret
Configuring the JMeter test plan
The JMeter test plan needs to be configured to make a request to the app’s authentication endpoint to acquire the token. The endpoint can be found on the Azure portal by navigating to Azure Active Directory > App registrations > (your app) > Endpoints
Getting the Authentication endpoint
It would look something like the below:
https://login.microsoftonline.com/{tenant}/oauth2/token
For the allowed values of {tenant}, you may refer to issuer values. In our case, it would be the tenant ID.
Once we have the token, we can pass it to the subsequent requests in the authorization header to authenticate to the application.
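To make the flow concrete outside of JMeter, here is a minimal Python sketch of the same client credentials exchange. This is illustrative only: the tenant, client ID, and secret below are placeholders, and depending on the endpoint version your app may also require a resource or scope parameter.

```python
import json
import urllib.parse


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Build the POST request for the Azure AD client credentials grant.

    Returns (url, body) that could be sent with any HTTP client, e.g.
    urllib.request.urlopen(urllib.request.Request(url, data=body)).
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode()
    return url, body


def extract_access_token(token_response: bytes) -> str:
    """Equivalent of the JMeter JSON Extractor expression $.access_token."""
    return json.loads(token_response)["access_token"]


# Placeholder values only -- no real call is made here.
url, body = build_token_request("contoso-tenant-id", "my-client-id", "my-secret")
# The token is then passed in the Authorization header of subsequent requests:
token = extract_access_token(b'{"access_token": "eyJ0eXAi-example"}')
auth_header = {"Authorization": f"Bearer {token}"}
```

This mirrors the JMeter plan built below: one request acquires the token, a JSON extractor pulls out access_token, and every later request carries it in the Authorization header.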
Now that we know what needs to be done, let’s start implementing it.
Creating the test plan in the JMeter GUI
- Start by adding two thread groups: one (Authentication) for fetching the bearer token, and the other (Application) for accessing the landing page of the application.
- Add a user defined variables element to the Authentication thread group. These variables will be used to fetch, at runtime, the values of fields like client_id, client_secret, and tenant_id which we stored earlier in the key vault, to help acquire the access token.
Defining user defined variables
- Add a child HTTP request sampler (Token Request) to the Authentication thread group. Within this HTTP request, we will set up a POST method that will help retrieve the access token.
Defining the POST method to get access token
- Add two child post processor elements to the Token Request sampler. The first is a JSON Extractor (Extract Auth Token) for extracting the token: the response from the Token Request HTTP sampler comes back as JSON, and we extract the token using the expression $.access_token.
Extracting Authentication token
- The next post processor element would be JSR223(Set AuthToken), which will be used to set the token extracted as a property named access_token. Setting it as a property will allow the variable to be accessible globally across samplers and hence can be accessed by the next thread group.
Setting property as an access token property
- Next, let’s configure the Application thread group with an HTTP request sampler (Homepage) to access the application homepage. Add a child header manager element to configure and maintain the headers passed with the request. In this case, we only pass the authorization header, which contains the bearer token obtained from the previous thread group (Authentication).
Configuring the header manager
Creating and Running the Load Test
Once we have set up our JMeter test plan, we can move ahead and run it with the Azure Load Testing service by creating a test, supplying the JMeter script created above as the test plan, and configuring the environment variables.
- Supply the JMeter test plan (JMX file) we created in the previous section.
Configuring the Test Plan
- Configure the Secrets section within the Parameters tab. We stored all the sensitive information in the key vault, so we need to configure our tests to fetch those values at runtime. Visit how to parameterize load tests to learn more.
Configuring the secrets
Try this out and let us know if it works for you. Please use the comments section to help us with any feedback around this scenario and anything you would like to see next time.
If you have any feedback on Azure Load Testing, let us know using our feedback forum.
Happy Load Testing!!!