
A common challenge for app developers, site reliability engineers (SREs), and DevOps engineers is that a synthetic availability test can fail while the application is still functioning normally. It can be frustrating to determine whether the failure was caused by your application or by a network issue.


 


Introducing the new Availability Troubleshooting Report


 




 

NOTE: The troubleshooting report is only available for URL ping tests.

 


The Troubleshooting Report is intended to help you understand why your customers may have problems accessing your application, and to alert you to potential issues even when all metrics indicate that it is healthy.


 


You can open the report in the portal by selecting a test result from the scatter plot or from the Drill Into section. Each dependency has its own troubleshooting report attached.


 




If a step fails, then it will appear at the top of the availability result to give you instant insight into where the problem might be. If no step fails, then the troubleshooting report will be closed by default.


 


Common Test Failures & Potential Root Causes:


 

 

DNS.png


 

A DNS lookup can fail if your DNS record is not publicly available. The URL ping test runs from outside your network, so it can only resolve public records.
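To reproduce a DNS failure yourself, a small resolution check can help. The following is a minimal sketch, not part of the availability service: `resolve_host` is a hypothetical helper name, and it uses whatever resolver the machine running it is configured with, so run it from outside your own network if you want to approximate what a public test location sees.

```python
import socket
from typing import Callable, List


def resolve_host(hostname: str,
                 resolver: Callable[..., list] = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated IP addresses for hostname, or [] on failure."""
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # Raised when the name cannot be resolved by the configured resolver.
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({entry[4][0] for entry in results})
```

Calling `resolve_host("www.example.com")` (a placeholder host name) returns a list of addresses when the record resolves and an empty list when it does not. The `resolver` parameter exists only so the function can be exercised without network access.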


 


If you need to test against a private DNS record, then use the TrackAvailability SDK. This enables you to run availability tests behind a firewall or in an isolated environment, expand your test region selection, and author more complex tests than are available in the portal UI.
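The general shape of such a custom test is: run your own check from inside the network, time it, and report the outcome. The sketch below illustrates that shape only. It is not the SDK's API: `run_availability_check`, the record's field names, and the `send` callable are all assumptions made for illustration, and you would replace `send` with the real reporting call from the SDK you use.

```python
import time
from datetime import datetime, timezone
from typing import Callable, Dict


def run_availability_check(name: str,
                           run_location: str,
                           check: Callable[[], None],
                           send: Callable[[Dict], None]) -> Dict:
    """Run check(), time it, and pass a result record to send()."""
    started = datetime.now(timezone.utc)
    timer = time.perf_counter()
    success, message = True, "Passed"
    try:
        check()  # any exception marks the run as failed
    except Exception as exc:
        success, message = False, f"{type(exc).__name__}: {exc}"
    result = {
        "name": name,
        "run_location": run_location,
        "timestamp": started.isoformat(),
        "duration_seconds": time.perf_counter() - timer,
        "success": success,
        "message": message,
    }
    send(result)  # hypothetical reporter; swap in the real SDK call here
    return result
```

Because the check is an ordinary function, it can do anything your environment allows, such as calling an internal endpoint or running a multi-step scenario.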


 

ConnectionFailed.png


 


Connection Failed indicates that there might be a firewall blocking our service from accessing your endpoints.


 


You can add the Application Insights Availability service tag to your Network Security Group (NSG) or Azure Firewall so that inbound traffic is allowed only from our testing engine. Service tags automatically update the list of allowed IP addresses for a given service, which reduces the need to maintain network security rules by hand. Alternatively, you can allow individual IP addresses.
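If you maintain the allowed addresses by hand, a quick way to reason about a Connection Failed result is to check whether the caller's address falls inside any prefix you have allowed. This is a minimal sketch under stated assumptions: `is_allowed` is a hypothetical helper, and the prefixes shown are documentation-only placeholder ranges, not the real addresses of the testing engine.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, allowed_prefixes: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any of the allowed CIDR prefixes."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(prefix) for prefix in allowed_prefixes)


# Placeholder prefixes from documentation-only ranges, not real service IPs.
EXAMPLE_PREFIXES = ["192.0.2.0/24", "198.51.100.0/25"]
```

An address that is missing from a hand-maintained list is one plausible reason for a blocked connection, which is the maintenance burden the service tag is meant to remove.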


 


If you need to run tests without allowing any traffic into your virtual network, then we recommend using the TrackAvailability SDK.


 


StatusCode.png


 


Status Code & Content Validation checks that your webpage returns the expected response code and contains the specific content you configured.
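The logic of this step can be illustrated with a small function. This is a sketch of the idea rather than the service's implementation: `validate_response` and its parameters are hypothetical names, and the default expected status of 200 is an assumption.

```python
from typing import List, Optional


def validate_response(status_code: int,
                      body: str,
                      expected_status: int = 200,
                      required_text: Optional[str] = None) -> List[str]:
    """Return a list of validation failures; an empty list means the check passed."""
    failures = []
    if status_code != expected_status:
        failures.append(f"expected status {expected_status}, got {status_code}")
    if required_text is not None and required_text not in body:
        failures.append(f"required text not found: {required_text!r}")
    return failures
```

Returning every failure, instead of stopping at the first, makes it easier to tell a wrong status code apart from missing content when both occur on the same response.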


 


If this step fails, contact the application owners so they can investigate why the page returns an incorrect code or is missing content.


 


See more:


Troubleshoot your Azure Application Insights availability tests – Azure Monitor | Microsoft Docs


 


 
