Troubleshooting Azure Stack HCI 23H2 Preview Deployments


This article is contributed. See the original author and article here.



With the Azure Stack HCI 23H2 preview release, there are significant changes to how clusters are deployed, enabling low-touch deployments in edge sites. Running these deployments in customer sites or lab environments may require some troubleshooting while kinks in the process are ironed out. This post gives guidance on that troubleshooting.


 


The following is written using a rapidly changing preview release, based on field and lab experience. We’re focused on how to start troubleshooting, rather than digging into specific issues you may encounter.


Understanding the deployment process


Deployment is completed in two steps: first, the target environment and configuration are validated; then the validated configuration is applied to the cluster nodes by a deployment. Ideally, any issues with the configuration are caught in validation, but this is not always the case, so you may work through validation issues only to hit further issues during deployment. We’ll start with tips on working through validation issues, then move to deployment issues.


When the validation step completes, a ‘deploymentSettings’ sub-resource is created on your HCI cluster Azure resource.


Logs Everywhere!


When you run into errors in validation or deployment, the error passed through to the Portal may not have enough information or context to understand exactly what is going on. To get to the details, we frequently need to dig into the log files on the HCI nodes. The validation and deployment processes pull in components used in Azure Stack Hub, resulting in log files in various locations, but most logs are on the seed node (the first node sorted by name).


Viewing Logs on Nodes


When connected to your HCI nodes with Remote Desktop, Notepad is available for opening log files and checking contents. Another useful trick is to use the PowerShell Get-Content command with the -wait parameter to follow a log and -last parameter to show only recent lines. This is especially helpful to watch the CloudDeployment log progress. For example:


Get-Content C:\CloudDeployment\Logs\CloudDeployment.2024-01-20.14-29-13.0.log -wait -last 150
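Because each deployment run writes a new timestamped file, it can be handy to pick the newest log automatically before following it. The following is a minimal sketch, not part of any Azure Stack HCI module: Get-LatestLog is a hypothetical helper name, and the 'CloudDeployment*.log' filter is an assumption based on the file name in the example above.

```powershell
# Sketch: return the most recently written log file in a directory.
# Get-LatestLog is a hypothetical helper; the default filter is an assumption.
function Get-LatestLog {
    param(
        [Parameter(Mandatory)] [string] $Directory,
        [string] $Filter = 'CloudDeployment*.log'
    )
    # Sort by last write time so the most recently updated log comes first.
    Get-ChildItem -Path $Directory -Filter $Filter -File |
        Sort-Object -Property LastWriteTime -Descending |
        Select-Object -First 1
}

# Usage on the seed node:
# $log = Get-LatestLog -Directory 'C:\CloudDeployment\Logs'
# if ($log) { Get-Content -Path $log.FullName -Wait -Tail 150 }
```

The -Tail parameter used here is the same parameter as -last (Last is its alias).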

Log File Locations


The list below describes important log locations and when to look in each:

Path: C:\CloudDeployment\Logs\CloudDeployment*
Content: Output of deployment operation
When to use: This is the primary log to monitor and troubleshoot deployment activity. Look here when a deployment fails or stalls.

Path: C:\CloudDeployment\Logs\EnvironmentValidatorFull*
Content: Output of validation run
When to use: When your configuration fails a validation step.

Path: C:\ECEStore\LCMECELiteLogs\InitializeDeploymentService*
Content: Logs related to the Life Cycle Manager (LCM) initial configuration
When to use: When you can’t start validation, the LCM service may not have been fully configured.

Path: C:\ECEStore\MASLogs
Content: PowerShell script transcript for ECE activity
When to use: Shows more detail on scripts executed by ECE; this is a good place to look if CloudDeployment shows an error but not enough detail.

Path: C:\CloudDeployment\Logs\cluster* and C:\Windows\Temp\StorageClusterValidationReport*
Content: Cluster validation report
When to use: Cluster validation runs when the cluster is created; when validation fails, these logs tell you why.
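When you are not sure which of these locations holds the relevant message, a quick text search across several of them can narrow things down. This is a rough sketch only: Find-LogError is a hypothetical helper, and the 'error|exception|failed' pattern is an assumption that will likely need tuning for your logs.

```powershell
# Sketch: scan one or more log locations for lines that look like errors.
# Find-LogError is a hypothetical helper; the default pattern is an assumption.
function Find-LogError {
    param(
        [Parameter(Mandatory)] [string[]] $Path,
        [string] $Pattern = 'error|exception|failed',
        [int] $Last = 20
    )
    # Expand wildcards and skip locations that do not exist on this node.
    $files = Get-ChildItem -Path $Path -File -ErrorAction SilentlyContinue
    if (-not $files) { return }
    # Select-String matches case-insensitively by default.
    $files |
        Select-String -Pattern $Pattern |
        Select-Object -Last $Last -Property Path, LineNumber, Line
}

# Usage on the seed node:
# Find-LogError -Path 'C:\CloudDeployment\Logs\CloudDeployment*', 'C:\ECEStore\MASLogs'
```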



 


Retrying Validations and Deployments


Retrying Validation


In the Portal, you can usually retry validation with the “Try Again…” button. If you are using an ARM template, you can redeploy the template.


In the Validation stage, your node is running a series of scripts and checks to ensure it is ready for deployment. Most of these scripts are part of the modules found here:

C:\Program Files\WindowsPowerShell\Modules\AzStackHci.EnvironmentChecker


 


Sometimes it can be insightful to run the modules individually, with verbose or debug output enabled.
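One way to find out what can be run individually is to ask PowerShell which commands the module exports. The sketch below assumes the validators follow an 'Invoke-*' naming pattern, which is an assumption to verify on your node; Get-ValidatorCommand is a hypothetical helper name.

```powershell
# Sketch: list the commands a module exports so one can be run on its own.
# Get-ValidatorCommand is a hypothetical helper; the 'Invoke-*' filter is an assumption.
function Get-ValidatorCommand {
    param(
        [string] $ModuleName = 'AzStackHci.EnvironmentChecker',
        [string] $NameFilter = 'Invoke-*'
    )
    Get-Command -Module $ModuleName -Name $NameFilter -ErrorAction SilentlyContinue |
        Sort-Object -Property Name
}

# Usage on a node:
# Get-ValidatorCommand | Select-Object -ExpandProperty Name
# Get-Help <one of the listed commands> -Full    # check its parameters first
# <one of the listed commands> -Verbose          # -Verbose is a common parameter of cmdlets and advanced functions
```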


Retrying Deployment


The ‘deploymentSettings’ resource under your cluster contains the configuration to deploy and is used to track the status of your deployment. Sometimes it can be helpful to view this resource; an easy way to do this is to navigate to your Azure Stack HCI cluster in the Portal and append ‘deploymentsettings/default’ after your cluster name in the browser address bar.


 




Image 1 – the deploymentSettings Resource in the Portal
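The same resource can also be addressed by its resource ID from a script. The sketch below only builds that ID; Get-DeploymentSettingsId is a hypothetical helper, 'Microsoft.AzureStackHCI' is assumed to be the provider namespace of your cluster resource, and 'default' is taken from the Portal path described above.

```powershell
# Sketch: build the resource ID of the deploymentSettings sub-resource.
# Get-DeploymentSettingsId is a hypothetical helper; the provider namespace is an assumption.
function Get-DeploymentSettingsId {
    param(
        [Parameter(Mandatory)] [string] $SubscriptionId,
        [Parameter(Mandatory)] [string] $ResourceGroup,
        [Parameter(Mandatory)] [string] $ClusterName
    )
    $template = '/subscriptions/{0}/resourceGroups/{1}/providers/Microsoft.AzureStackHCI/clusters/{2}/deploymentSettings/default'
    $template -f $SubscriptionId, $ResourceGroup, $ClusterName
}

# Usage (requires the Az.Resources module and a signed-in session; an explicit
# -ApiVersion may be needed for preview resource types):
# $id = Get-DeploymentSettingsId -SubscriptionId '<subscription id>' -ResourceGroup '<resource group>' -ClusterName '<cluster name>'
# Get-AzResource -ResourceId $id -ExpandProperties
```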


From the Portal


In the Portal, if your Deployment stage fails part-way through, you can usually restart the deployment by clicking the ‘Rerun deployment’ button under Deployments at the cluster resource.


 



Image 2 – access the deployment in the Portal so you can retry


Alternatively, you can navigate to the cluster resource group deployments. Find the deployment matching the name of your cluster and initiate a redeploy using the Redeploy option.


 



Image 3 – the ‘Redeploy’ button on the deployment view in the Portal


If Azure or the Portal shows your deployment as still in progress, you won’t be able to start it again until you cancel it or it fails.


From an ARM Template


To retry a deployment when you used the ARM template approach, just resubmit the deployment. With the ARM template deployment, you submit the same template twice: once with deploymentMode: “Validate” and again with deploymentMode: “Deploy”. To retry validation, use “Validate”; to retry deployment, use “Deploy”.




Image 4 – ARM template showing deploymentMode setting
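If you keep the setting in a parameters file, switching modes between submissions can be scripted. This is a sketch under assumptions: Set-DeploymentMode is a hypothetical helper, and it assumes deploymentMode is exposed as a parameter in a standard ARM parameters file (parameters.deploymentMode.value). Check your own template before relying on it.

```powershell
# Sketch: switch deploymentMode in an ARM parameters file before resubmitting.
# Set-DeploymentMode is a hypothetical helper; it assumes the parameters file
# contains parameters.deploymentMode.value.
function Set-DeploymentMode {
    param(
        [Parameter(Mandatory)] [string] $ParameterFile,
        [Parameter(Mandatory)] [ValidateSet('Validate', 'Deploy')] [string] $Mode
    )
    $doc = Get-Content -Path $ParameterFile -Raw | ConvertFrom-Json
    if (-not $doc.parameters.deploymentMode) {
        throw "No 'deploymentMode' parameter found in $ParameterFile"
    }
    $doc.parameters.deploymentMode.value = $Mode
    # -Depth keeps nested parameter values from being truncated on output.
    $doc | ConvertTo-Json -Depth 32 | Set-Content -Path $ParameterFile
}

# Usage (file names and resource group are placeholders; requires Az.Resources):
# Set-DeploymentMode -ParameterFile .\parameters.json -Mode Deploy
# New-AzResourceGroupDeployment -ResourceGroupName '<resource group>' -TemplateFile .\template.json -TemplateParameterFile .\parameters.json
```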


Locally on the Seed Node


In most cases, you’ll want to initiate deployment, validation, and retries from Azure. This ensures that your deploymentSettings resource is at the same stage as the local deployment.


 


However, in some instances, the deployment status as Azure understands it becomes out of sync with what is going on at the node level, leaving you unable to retry a stuck deployment. For example, Azure has your deploymentSettings status as “Provisioning” but the logs in CloudDeployment show the activity has stopped and/or the ‘LCMAzureStackDeploy’ scheduled task on the seed node is stopped. In this case, you may be able to rerun the deployment by restarting the ‘LCMAzureStackDeploy’ scheduled task on the seed node:

Start-ScheduledTask -TaskName LCMAzureStackDeploy

If this does not work, you may need to delete the deploymentSettings resource and start again. See The Big Hammer: Full Reset.


Advanced Troubleshooting


Invoking Deployment from PowerShell


Although deployment activity has lots of logging, sometimes you can’t find the right log file, or the logs don’t seem to show what is causing the failure. In this case, it is sometimes helpful to retry the deployment directly in PowerShell, executing the script which is normally called by the Scheduled Task mentioned above. For example:

C:\CloudDeployment\Setup\Invoke-CloudDeployment.ps1 -Rerun


Local Group Membership


In a few cases, we’ve found that the local Administrators group membership on the cluster nodes does not get populated with the necessary domain and virtual service account users. The resulting issues have been difficult to track down through logs, and likely have a root cause that will soon be addressed.

Check group membership with: Get-LocalGroupMember Administrators


Add group membership with: Add-LocalGroupMember Administrators -Member [,…]

Here’s what we expect on a fully deployed cluster:

Type: Domain Users
Accounts: DOMAIN
Comments: This is the domain account created during AD Prep and specified during deployment.

Type: Local Users
Accounts: AzBuiltInAdmin (renamed from Administrator), ECEAgentService, HCIOrchestrator
Comments: These accounts don’t exist initially but are created at various stages during deployment. Try adding them; if they are not provisioned, you’ll get a message that they don’t exist.

Type: Virtual Service Accounts
Accounts:
S-1-5-80-1219988713-3914384637-3737594822-3995804564-465921127
S-1-5-80-949177806-3234840615-1909846931-1246049756-1561060998
S-1-5-80-2317009167-4205082801-2802610810-1010696306-420449937
S-1-5-80-3388941609-3075472797-4147901968-645516609-2569184705
S-1-5-80-463755303-3006593990-2503049856-378038131-1830149429
S-1-5-80-649204155-2641226149-2469442942-1383527670-4182027938
S-1-5-80-1010727596-2478584333-3586378539-2366980476-4222230103
S-1-5-80-3588018000-3537420344-1342950521-2910154123-3958137386
Comments: These are the SIDs of the various virtual service accounts used to run services related to deployment and continued lifecycle management. The SIDs seem to be hard coded, so these can be added any time. When these accounts are missing, there are issues as early as the JEA deployment step.
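Comparing the expected entries against the current group membership by eye is error-prone with this many SIDs, so a small comparison can help. The following is a sketch: Get-MissingMember is a hypothetical helper that simply reports which expected names or SIDs are absent from a list you pass in.

```powershell
# Sketch: report which expected members are missing from a list of current members.
# Get-MissingMember is a hypothetical helper; comparison is case-insensitive.
function Get-MissingMember {
    param(
        [Parameter(Mandatory)] [string[]] $Expected,
        [string[]] $Actual = @()
    )
    $Expected | Where-Object { $Actual -notcontains $_ }
}

# Usage on a node. Get-LocalGroupMember returns objects with Name and SID properties;
# matching on SID avoids having to guess the computer or domain prefix of each name.
# $members  = Get-LocalGroupMember -Group Administrators
# $actual   = @($members | ForEach-Object { $_.SID.Value })
# $expected = 'S-1-5-80-1219988713-3914384637-3737594822-3995804564-465921127'   # add the other SIDs listed above
# Get-MissingMember -Expected $expected -Actual $actual
```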



 


ECEStore


The files in the ECEStore directory show state and status information of the ECE service, which handles some lifecycle and configuration management. The JSON files in this directory may be helpful to troubleshoot stuck states, but most events also seem to be reported in standard logs. The MASLogs directory in the ECEStore directory shows PowerShell transcripts, which can be helpful as well.


NuGet Packages


During initialization, several NuGet packages are downloaded and extracted on the seed node. We’ve seen issues where these packages are incomplete or corrupted, usually noted in the MASLogs directory. In this case, the full reset described in The Big Hammer: Full Reset seems to be required.


The Big Hammer: Full Reset


If you’ve pulled the last of your hair out, the following steps usually perform a full reset of the environment without having to reinstall the OS and reconfigure networking, etc. (the biggest hammer). This is not usually necessary and you don’t want to go through this only to run into the same problem, so spend some time with the other troubleshooting options first.



  1. Uninstall the Arc agents on all nodes with the Remove-AzStackHciArcInitialization command

  2. Delete the deploymentSettings resource in Azure

  3. Delete the cluster resource in Azure

  4. Reboot the seed node

  5. Delete the following directories on the seed node:




    1. C:CloudContent

    2. C:CloudDeployment

    3. C:Deployment

    4. C:DeploymentPackage

    5. C:EceStore

    6. C:NugetStore




  6. Remove the LCMAzureStackStampInformation registry key on the seed node (as written, the -whatif switch only previews the removal; drop it to actually delete the key):
    Get-Item -path HKLM:\SOFTWARE\Microsoft\LCMAzureStackStampInformation | Remove-Item -whatif

  7. Reinitialize Arc on each node with Invoke-AzStackHciArcInitialization and retry the complete deployment
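The directory cleanup step in the list above is destructive, so a preview-first approach may be worth the extra lines. This is a sketch only: Remove-DeploymentDirectory is a hypothetical helper, and its default paths are the directories named in the list, written with the backslashes restored.

```powershell
# Sketch: remove the seed-node directories named in the reset steps, with -WhatIf support.
# Remove-DeploymentDirectory is a hypothetical helper, not part of any HCI module.
function Remove-DeploymentDirectory {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [string[]] $Path = @(
            'C:\CloudContent', 'C:\CloudDeployment', 'C:\Deployment',
            'C:\DeploymentPackage', 'C:\EceStore', 'C:\NugetStore'
        )
    )
    foreach ($dir in $Path) {
        # Skip directories that are already gone.
        if (-not (Test-Path -LiteralPath $dir)) { continue }
        if ($PSCmdlet.ShouldProcess($dir, 'Remove directory and all contents')) {
            Remove-Item -LiteralPath $dir -Recurse -Force
        }
    }
}

# Remove-DeploymentDirectory -WhatIf    # preview only, nothing is deleted
# Remove-DeploymentDirectory            # actually delete
```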


Conclusion


Hopefully this guide has helped you troubleshoot issues with your deployment. Please feel free to comment with additional suggestions or questions and we’ll try to get those incorporated in this post.


 


If you’re still having issues, a Support Case is your next step!

Logic Apps Mission Critical Series: “We Speak: IBM i: COBOL and RPG Applications”


In this session, we continue the “We Speak” Mission Critical Series with an episode on how Azure Logic Apps can unlock scenarios that require integration with IBM i (i Series or former AS/400) applications.


 


The IBM i In-App Connector


 


The IBM i In-App connector enables connections between Logic App workflows to IBM i Applications running on IBM Power Systems. 


 




 


 


Background:


 


More than 50 years ago, IBM released the first midrange systems. IBM advertised them as “Small in size, small in price and Big in performance. It is a system for now and for the future”. Over the years, midranges evolved and became pervasive in medium-sized businesses, and in large enterprises as extensions of Mainframe environments. Midranges running IBM i (typically Power systems) support TCP/IP and SNA. Host Integration Server supports connecting with midranges using both.


 


IBM i includes the Distributed Program Calls (DPC) server feature that allows most IBM System i applications to interact with clients such as Azure Logic Apps in request-reply fashion (client-initiated only) with minimum modifications. DPC is a documented protocol that supports program to program integration on an IBM System i, which can be accessed easily from client applications using the TCP/IP networking protocol.


 


IBM i Applications were typically built using the Report Program Generator (RPG)  or the COBOL languages. The Azure Logic Apps connector for IBM i supports integrating with both types of programs. The following is a simple RPG program called CDRBANKRPG.


 


hcamposu_1-1705780996162.png


 


 


As with many of our other IBM Mainframe connectors, you must prepare an artifact with the metadata of the IBM i programs to call by using the HIS Designer for Logic Apps tool. The HIS Designer helps you create a Host Integration Design XML (HIDX) file for use with the IBM i connector. The following is a view of the HIDX file for the program above.


 


hcamposu_2-1705780996173.png


 


 


For instructions on how to create these metadata artifacts, you can watch this video:


 


 


Once you have the HIDX file ready for deployment, upload it to the Maps artifacts of your Azure Logic App, then create a workflow and add the IBM i connector.


To set up the IBM i Connector, you will require inputs from the midrange Specialist. You will require at least the midrange IP and Port.


 


hcamposu_3-1705780996187.png


 


In the Parameters section, enter the name of the HIDX file. If the HIDX was uploaded to Maps, then it should appear dynamically:


 


hcamposu_4-1705780996189.png


 


 


And then select the method name:


 


hcamposu_5-1705780996191.png


 


 


The following video includes a complete demonstration of the use of the IBM i In-App connector for Azure Logic Apps:


 


Master Microsoft Fabric: Your Ultimate Guide to Certification and Expertise


Below, you’ll find a treasure trove of resources to further your learning and engagement with Microsoft Fabric.


 




Dive Deeper into Microsoft Fabric


 


Microsoft Fabric Learn Together


Join us for expert-guided live sessions! These will cover all necessary modules to ace the DP-600 exam and achieve the Fabric Analytics Engineer Associate certification. 


Explore Learn Together Sessions


Overview: Microsoft Fabric Learn Together is an expert-led live series that provides in-depth walk-throughs covering all the Learn modules to prepare participants for the DP-600 Fabric Analytics Engineer Associate certification. The series consists of 9 episodes delivered in both India and Americas timezones, offering a comprehensive learning experience for those looking to enhance their skills in Fabric Analytics.


Agenda:



  1. Introduction to Microsoft Fabric: An overview of the Fabric platform and its capabilities.

  2. Setting up the Environment: Guidance on preparing the necessary tools and systems for working with Fabric.

  3. Data Ingestion and Management: Best practices for data ingestion and management within the Fabric ecosystem.

  4. Analytics and Insights: Techniques for deriving insights from data using Fabric’s analytics tools.

  5. Security and Compliance: Ensuring data security and compliance with industry standards when using Fabric.

  6. Performance Tuning: Tips for optimizing the performance of Fabric applications.

  7. Troubleshooting: Common issues and troubleshooting techniques for Fabric.

  8. Certification Preparation: Focused sessions on preparing for the DP-600 certification exam.

  9. Q&A and Wrap-up: An interactive session to address any remaining questions and summarize key takeaways.


This series is designed to be interactive, allowing participants to ask questions and engage with experts live. It’s a valuable opportunity for those looking to specialize in Fabric Analytics and gain a recognized certification in the field.


For more detailed information and to register for the series, visit the page on Microsoft Learn: https://aka.ms/learntogether. Enjoy your learning journey!



 


Hands-On Learning with Fabric


Enhance your skills with over 30 interactive, on-demand learning modules tailored for Microsoft Fabric.


Start your learning journey, then participate in our Hack Together: The Microsoft Fabric Global AI Hack.



Special Offer: Secure a 50% discount voucher for the Microsoft Fabric Exam by completing the Cloud Skills Challenge between January and June 2024.



 


Easy Learning with Fabric Notes


Unlock the power of Microsoft Fabric with engaging, easy-to-understand illustrations. Perfect for all levels of expertise!


Access Fabric Notes Here


 


 




Your Path to Microsoft Fabric Certification


Get ready for DP-600: Implementing Analytics Solutions Using Microsoft Fabric. Start preparing today to become a certified Microsoft Fabric practitioner.


 


Join the Microsoft Fabric Community


Connect with fellow Fabric enthusiasts and experts. Your one-stop community hub: https://community.fabric.microsoft.com/. Here’s what you’ll find:



 


Stay Ahead: The Future of Microsoft Fabric


Be in the know with the latest developments and upcoming features. Check out the public roadmap

Tell Us What You Think!



Hello Azure Communication Services users!



As we enter 2024, we’d like to take the opportunity to hear what you think of the Azure Communication Services platform. We’d love to hear your insights and feedback on what you think we’re doing well and where you think we have an opportunity to better meet your needs. We’d really appreciate it if you would take 5-7 minutes to complete our survey HERE and share your thoughts with us. We’ll use this information to help guide future development, and to help us focus on the areas that our customers tell us are most important to them.

 

Please note – This survey is specifically designed for developers who’ve built something (even a demo or sample) with Azure Communication Services. We will offer additional opportunities for other users to share their feedback as well.



That survey link, again, is HERE. Thanks for your feedback, and here’s to a productive and successful 2024!



Viva People Science Industry Trends: Retail


Welcome to the fourth edition of Microsoft Viva People Science industry trends, where the Viva People Science team share learnings from customers across a range of different industries. Drawing on data spanning over 150 countries, 10 million employees, and millions of survey comments, we uncover the unique employee experience challenges and best practices for each industry. 


 


In this blog, @Jamie_Cunningham and I share our insights on the state of employee engagement in the retail industry. You can also access the recording from our recent live webinar, where we discussed this topic in depth.  


 


Let’s first look at what’s impacting the retail industry today. In summary, we are hearing about market volatility, supply chain constraints, changing consumer behavior, technological advancements, labor pressures, and rising costs. According to the Deloitte Retail Trends 2023 report, the top-of-mind issues for retail leaders are: 


 



  • Growth versus sustainability: Retailers need to balance the short-term pressures of profitability and cash flow with the long-term goals of environmental and social responsibility. 

  • Consumer confidence and retail sales: Retailers need to cope with the uncertain and volatile consumer demand, which is influenced by factors such as inflation, health concerns, and government policies. 

  • Leadership quality and brand strength: Retailers need to demonstrate strong and visionary leadership, as well as to build and maintain a distinctive and trusted brand identity. 

  • Technological innovation: Retailers need to leverage technology and data to create personalized, seamless, and omnichannel customer experiences, as well as to optimize their operations and supply chains. 


 


These issues require retailers to be agile, resilient, and innovative in their employee experience strategies and execution. The retail industry also faces some specific challenges in attracting and retaining talent, such as: 


 



  • Rewards: Retail jobs often pay lower wages and offer fewer benefits than jobs in other industries, and can lack recognition and rewards for employees’ hard work.

  • Wellbeing: Retail employees often deal with high-stress, low-flexibility, and high-risk work environments, which can affect their physical and mental health. 

  • Growth: Retail employees often perceive limited opportunities for career advancement, skill development, and learning, which can lead to disengagement and attrition. 


 


According to Glint benchmark data (2023), employee engagement in retail has declined by two points between 2021 and 2022. It’s clear that retailers need to invest in improving the employee experience, especially for the frontline workers, who are the face of the brand and the key to customer loyalty. So, how do they do this? Here are three examples of how retailers we’ve worked with have addressed the needs of their employees with the support of Microsoft Viva: 


 


1. Create a compelling future 


 


We worked with the leadership team of a MENA (Middle East and North Africa) based retailer to recognize that there was a connection between their ability to communicate the future direction of the organization effectively and the degree to which employees saw a future for themselves in the organization. The team committed to clarifying how the business initiatives they were rolling out connected to future work opportunities for their teams.


 


2. Build bridges with frontline employees 


 


According to the Microsoft Work Trend Index special report (2022), sixty-three percent of all frontline workers say messages from leadership don’t make it to them. A global fashion brand recognised after several years of employee listening that the actions being taken by leadership were not being felt on the shop floor. We worked with them to adopt a simplified action-taking model with one clear commitment from leaders that was efficient and effective in terms of communication and adoption. They also increased their investment in manager enablement to support better conversations within teams when results from Viva Glint were released. This simplified approach led to improved perceptions of the listening process, and greater clarity at all levels on where to focus for a positive employee experience.


 


3. One internal team, one goal 


 


Through an Executive Consultation with leaders of a UK retailer, it was identified that wellbeing was a risk for the business that, unless addressed, would severely impact their priorities. With that in mind, the team created internal alignment to prioritise wellbeing through both training investment and policy changes, resulting in a thirteen-point improvement in the wellbeing score year over year.


 


Conclusions 


 


To succeed in this dynamic and competitive market, retailers need to focus on their most valuable asset: their employees. By investing in the employee experience, especially for the frontline workers, retailers can boost their employee engagement, customer satisfaction, and business performance. 


 


A downloadable one-page summary is also available with this blog for you to share with your colleagues and leaders. 


 


Leave a comment below to let us know if this resonates with what you are seeing with your employees in this industry. 


 




 


 


References: 


Deloitte retail trends report (2023) 


Microsoft Work Trend Index special report (2022)