How to apply a hierarchy of edge devices at scale to ISA-95 IIoT environments

This article is contributed. See the original author and article here.

Using a hierarchy of IoT Edge devices in manufacturing scenarios lets customers keep their existing infrastructure while still being able to fully address, manage and control the IoT Edge devices on the factory floor, even if those devices are completely disconnected from the Internet.

This helps to address critical challenges in the implementation of Industrial IoT, such as balancing security against business value and effectively implementing new technology within a legacy infrastructure.

One of the established concepts in operational technology is the ISA-95 standard and the Purdue model of control. These define the models, interactions and relationships between systems in well understood layers. These layers are in turn used to define the network infrastructure, with strict interfaces between layers controlled by firewalls.

Azure IoT Edge in a nested hierarchy is ideally placed to address many of these requirements and help you to build secure and manageable edge solutions within established structures.

https://www.youtube-nocookie.com/embed/KpoNALjeoOw?controls=0&autoplay=false&WT.mc_id=iot-c9-niner

Let us move beyond the Tutorials to create a hierarchy of IoT Edge devices and answer the following questions:

Which steps and configurations should you carry out to make your IIoT edge solution production ready?

How do you secure communications between edge devices and the IoT Hub?

What are the best practices to scale your deployments across a fleet of devices and keep your configurations clear and manageable?

What are the extra steps needed to deploy a hierarchy of IoT Edge devices with 3 or more layers?

How would you manage and control lower-level disconnected devices?

How can I control what happens when connectivity between layers or the cloud is lost?

To make this real we have set up a sample configuration of 3 physical IoT Edge devices in a nested configuration defined as:

“Level 3 – Lower Layer Device”, “Level 4 – Lower Layer Gateway” and “Level 5 – Top Layer Gateway”.

To give you a glimpse of the configuration we have built and how it maps to the ISA-95 levels, check out this diagram:

You can find a step-by-step tutorial on how to enable and use all of these best practices in the accompanying GitHub repository. Make sure to check it out.

Getting your IIoT edge solution production ready

Most of the best practices for implementing edge solutions for industrial IoT apply whether you are using a nested configuration or not. A good starting point is the checklist for getting your IoT Edge instance ready for production.

On top of that, we would advise you to use a set of best practices that the Azure IoT Services have to offer, namely:

Securing communications
- Use Root CA X.509 certificates to authenticate and secure communications between parent and child devices, as well as to the IoT Hub.
- If possible, use your existing company public key infrastructure to manage your certificates, with dedicated intermediate certificates for signing the edge certificates.
- Consider a separate intermediate certificate outside the enterprise chain of trust to sign the edge CA certificates.

Deployment management
- Tag your devices in the Device Twins to identify a target group of devices via queries
- Use automatic deployments and jobs to apply the same configuration to a set of devices
- Use layered deployments to extend the capabilities of some devices without having to change configurations for all devices in the fleet

Connectivity resilience
- Store and forward messages on the device while it is separated from the network, using the Time To Live (TTL) setting in the edgeHub module twin

Certificates for a hierarchy of edge devices

To establish communication and authentication between the devices and the cloud, Industrial IoT companies often prefer certificate-based authentication over symmetric keys. We also recommend using X.509-based authentication with a chain extending to a trusted root CA for added security in production environments.

On top of using a certificate for the authentication to the cloud, you would be required to use certificates to establish trust between the parent and child devices for transparent gateway operations.

Even in a nested scenario, every IoT Edge device needs to have its own unique device identity in the IoT Hub. Crucially, this means that you would usually generate two sets of certificates for each device:

An optional device client certificate and key for each device to identify and authenticate the device to the IoT Hub. Alternatively, this could be a symmetric key and the user is free to mix authentication modes within the nested hierarchy.

A required signing CA certificate, so that the edge can sign its own workload client certificates. These are used to secure connections between the different components on an IoT Edge device or between an IoT Edge device and any leaf devices in a nested hierarchy. This concept is further explained in our documentation.
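As a minimal sketch, the two certificate sets described above can be modelled as a small data structure that checks the rule "signing CA required, identity certificate optional". The class name, field names, device IDs and file paths below are illustrative assumptions and are not taken from the IoT Edge configuration file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EdgeCertificateSet:
    """Certificate material for one IoT Edge device (all paths are hypothetical)."""
    device_id: str
    edge_ca_cert: str                      # required: signing CA certificate
    edge_ca_key: str                       # required: private key of the signing CA
    identity_cert: Optional[str] = None    # optional: device client certificate
    identity_key: Optional[str] = None     # optional: device client private key
    uses_symmetric_key: bool = False       # alternative to the identity certificate

    def auth_mode(self) -> str:
        """Return how the device authenticates to IoT Hub, or raise if inconsistent."""
        if not (self.edge_ca_cert and self.edge_ca_key):
            raise ValueError("the signing CA certificate and key are required")
        has_x509 = bool(self.identity_cert and self.identity_key)
        if has_x509 == self.uses_symmetric_key:
            raise ValueError("choose exactly one: X.509 identity certificate or symmetric key")
        return "x509" if has_x509 else "symmetric_key"


# Hypothetical layer 4 gateway that authenticates with an X.509 identity certificate.
level4 = EdgeCertificateSet(
    device_id="level4-gateway",
    edge_ca_cert="/certs/level4-edge-ca.cert.pem",
    edge_ca_key="/certs/level4-edge-ca.key.pem",
    identity_cert="/certs/level4-identity.cert.pem",
    identity_key="/certs/level4-identity.key.pem",
)

# Hypothetical layer 3 device that mixes in symmetric key authentication.
level3 = EdgeCertificateSet(
    device_id="level3-device",
    edge_ca_cert="/certs/level3-edge-ca.cert.pem",
    edge_ca_key="/certs/level3-edge-ca.key.pem",
    uses_symmetric_key=True,
)

if __name__ == "__main__":
    print(level4.device_id, level4.auth_mode())
    print(level3.device_id, level3.auth_mode())
```

Keeping the identity material and the signing CA material in clearly separate fields mirrors the naming distinction the article asks you to look out for.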

If you are unsure about the terminology, check out this blogpost: The layman’s guide to X.509 certificate jargon

On the layer 4 device, the respective part of the files would look like the following. Look closely at the naming differences to understand the different certificate sets:

If you need more details on how to apply the certificates to the devices and to IoT Hub, check out the GitHub repository.

Using layered deployments to roll out configurations at scale

Instead of setting the modules and deployment manifests manually for each IoT Edge device, it is recommended to use automatic deployments for a fleet of devices and especially in production scenarios. In a nested edge scenario, we have seen particular value in the use of layered deployments to fine tune workloads.

Deployments are rolled out to device populations by matching user defined target conditions, for example device twin properties or tags, in a strict priority order.

Layered Deployments can be used to expand the deployment to include additional modules or override configurations on top of a base deployment manifest.

Tags are very useful to group a set of devices by using the Device Twin or Module Twin of IoT Edge Devices.
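The following minimal sketch shows how tags and target conditions relate. It builds the `tags` section of a device twin patch and a matching target condition string. The tag names are the ones used in this article; the device IDs and helper function names are hypothetical.

```python
import json

# Tag names come from the article; the device IDs are hypothetical.
DEVICE_TAGS = {
    "level5-gateway": ["topnestededge", "tempsensor"],
    "level4-gateway": ["nestededge", "nestedgateway", "tempsensor"],
    "level3-device": ["nestededge", "tempsensor"],
}


def twin_tags_patch(tag_names):
    """Build the 'tags' section of a device twin patch with every tag set to true."""
    return {"tags": {name: True for name in tag_names}}


def target_condition(*tag_names):
    """Build a target condition that requires every listed tag to be true."""
    return " AND ".join(f"tags.{name}=true" for name in tag_names)


if __name__ == "__main__":
    print(json.dumps(twin_tags_patch(DEVICE_TAGS["level5-gateway"]), indent=2))
    print(target_condition("tempsensor", "topnestededge"))
```

Generating both the patch and the condition from the same tag names helps keep the twins and the deployment targeting in sync across a fleet.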

Our devices have the following tags set to true in their device twins.

- Level 5 – Top Layer Gateway: topnestededge, tempsensor
- Level 4 – Lower Layer Gateway: nestededge, nestedgateway, tempsensor
- Level 3 – Lower Layer Device: nestededge, tempsensor

In detail we have the following configuration:

One base deployment manifest for the top layer Gateway (layer 5) for devices where topnestededge=true. This is a basic deployment manifest with the system modules and the API proxy module needed for nested operations. Since this device has an internet connection, we can obtain images directly from the Microsoft Container Registry (MCR).

One layered deployment manifest for the layer 5 edge devices to deploy the Simulated Temperature Sensor, also from the MCR. This targets devices with both tags.tempsensor=true AND tags.topnestededge=true.

One base deployment manifest for the lower layer devices (layer 3 + 4) with tag nestededge. This only contains the system modules with an image that points to $upstream. This tells the edge runtime to pull images from the parent edge device.

One layered deployment manifest for the lower layer 4 Edge Device to deploy the IoTEdgeAPIProxy from $upstream. This applies to all devices with tag nestedgateway.

One layered deployment manifest for the lower layer 3 + 4 Edge Devices to deploy the Simulated Temperature Sensor from $upstream.
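To make the interplay of these five deployments easier to follow, here is a small simulation of which deployments land on which device. It is a sketch under assumptions: the deployment names and priority values are hypothetical, the target condition of the last layered deployment is assumed to be nestededge AND tempsensor, and the matching logic (highest-priority base deployment wins, layered deployments apply on top when they match and have a higher priority) is a simplified model of the IoT Hub behaviour rather than its actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    name: str                 # hypothetical deployment name
    required_tags: frozenset  # all of these tags must be true on the device
    priority: int             # hypothetical priority value
    layered: bool = False


DEPLOYMENTS = [
    Deployment("base-top", frozenset({"topnestededge"}), 1),
    Deployment("layer-top-tempsensor", frozenset({"tempsensor", "topnestededge"}), 2, layered=True),
    Deployment("base-lower", frozenset({"nestededge"}), 1),
    Deployment("layer-apiproxy", frozenset({"nestedgateway"}), 2, layered=True),
    # Assumption: the article does not state this target condition explicitly.
    Deployment("layer-lower-tempsensor", frozenset({"nestededge", "tempsensor"}), 3, layered=True),
]

DEVICE_TAGS = {
    "level5": frozenset({"topnestededge", "tempsensor"}),
    "level4": frozenset({"nestededge", "nestedgateway", "tempsensor"}),
    "level3": frozenset({"nestededge", "tempsensor"}),
}


def resolve(deployments, tags):
    """Return (base deployment name, layered deployment names) for a device's tags."""
    bases = [d for d in deployments if not d.layered and d.required_tags <= tags]
    if not bases:
        return None, []
    base = max(bases, key=lambda d: d.priority)
    layers = sorted(
        (d for d in deployments
         if d.layered and d.required_tags <= tags and d.priority > base.priority),
        key=lambda d: d.priority,
    )
    return base.name, [d.name for d in layers]


if __name__ == "__main__":
    for device, tags in DEVICE_TAGS.items():
        print(device, resolve(DEPLOYMENTS, tags))
```

Running such a check before creating the deployments can help to catch a tag or priority mistake that would otherwise leave a device with an unexpected set of modules.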

In IoT Hub, all deployments together look like the following:

Some other examples of layered deployments are also available in the GitHub.

Use the Module Twin or Direct Methods to send commands and desired configurations to lower-layer devices

As we have created IoT Hub Device Identities for all devices, we are able to control these devices from the cloud directly.

If we would like to update the Module Twin of our Level 3 device, for example, we would change the desired property for the respective module to enable it.
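As an illustration, the sketch below models how a desired-property patch is merged into a module twin: new values overwrite old ones, nested objects are merged, and a value of null (None in Python) removes a property. The property names `enabled` and `sendInterval` are hypothetical, and the merge function is a local model for reasoning about patches, not a call into an Azure SDK.

```python
import json


def apply_desired_patch(current, patch):
    """Merge a twin patch into the current desired properties.

    None values delete a key, nested dicts are merged recursively,
    and any other value replaces the existing one.
    """
    result = dict(current)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        elif isinstance(value, dict):
            existing = result.get(key)
            base = existing if isinstance(existing, dict) else {}
            result[key] = apply_desired_patch(base, value)
        else:
            result[key] = value
    return result


# Hypothetical desired properties of the temperature sensor module on the Level 3 device.
current_desired = {"enabled": False, "sendInterval": 5}

# Patch that enables the module and removes the custom send interval.
patch = {"enabled": True, "sendInterval": None}

if __name__ == "__main__":
    print(json.dumps(apply_desired_patch(current_desired, patch)))
```

Because the twin stores the desired state rather than a one-off command, the lower-layer module can pick up the latest values whenever it reconnects.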

On top of that, we can also send direct methods from the cloud to a lower layer device. The call is received by the top layer device and routed through the edgeHub of each device in the hierarchy until it reaches its designated device.
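The hop-by-hop routing can be pictured with a short sketch that walks the parent links from the target device up to the top layer and then reverses the path. The device IDs and the `PARENT` mapping are hypothetical stand-ins for the parent and child relationships registered in IoT Hub.

```python
# Hypothetical parent links for the three-layer sample hierarchy.
PARENT = {
    "level3-device": "level4-gateway",
    "level4-gateway": "level5-gateway",
    "level5-gateway": None,  # the top layer connects directly to IoT Hub
}


def route_from_cloud(target, parent_of):
    """Return the hops a cloud-initiated call takes to reach the target device."""
    path = []
    seen = set()
    node = target
    while node is not None:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        path.append(node)
        node = parent_of[node]
    path.reverse()
    return ["IoT Hub"] + path


if __name__ == "__main__":
    print(" -> ".join(route_from_cloud("level3-device", PARENT)))
```

Every hop on that path has to be online for the call to arrive, which is why the connectivity of each parent matters for direct methods.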

While sending direct methods to a lower layer device is a good way to directly control even disconnected devices, it requires the lower layer device to be connected to and discoverable by a parent device.

But what happens if a lower-layer device goes offline due to a networking failure of its parent?

What happens to the upstream and downstream data flow on the lower levels?

How do the devices recover if the connection between the layers is separated?

In a nested hierarchy, each device has a dependency on its parent devices for connectivity. If the network connection is severed on a device in a given layer, then all devices that are residing in lower layers will also be unable to continue sending messages.

In our example, if the network connection of the layer 4 device is disconnected, both the layer 4 device and the layer 3 device recognize this, and the normal store and forward functionality of the Edge Hub starts buffering. Layer 5 remains unaware and continues sending its messages to the IoT Hub. Once the network connections are re-established, each device automatically sends its queued messages in the original order. How long a message is stored is defined as the TTL (“Time To Live”) in the module twin for the edgeHub of each device. The standard TTL setting is 7200 seconds, or two hours.

"$edgeHub":{
	"properties.desired": {
		"schemaVersion": "1.0",
		"routes": {},
		"storeAndForwardConfiguration": {
			"timeToLiveSecs": 7200
		}
	}
}
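The buffering behaviour can be illustrated with a simplified model: messages are queued while the connection is down, and on reconnect they are sent in their original order unless they are older than the TTL. This is a sketch for intuition only; the class and method names are hypothetical, and the real edgeHub persists messages on disk rather than in memory.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Simplified in-memory model of TTL-based store and forward."""

    def __init__(self, ttl_secs=7200):
        self.ttl_secs = ttl_secs
        self._queue = deque()  # entries are (enqueued_at_secs, message)

    def enqueue(self, message, now):
        """Buffer a message while the upstream connection is unavailable."""
        self._queue.append((now, message))

    def flush(self, now):
        """Return unexpired messages in their original order and empty the buffer."""
        delivered = []
        while self._queue:
            enqueued_at, message = self._queue.popleft()
            if now - enqueued_at <= self.ttl_secs:
                delivered.append(message)
        return delivered


if __name__ == "__main__":
    buffer = StoreAndForwardBuffer(ttl_secs=7200)
    buffer.enqueue("reading-1", now=0)
    buffer.enqueue("reading-2", now=3600)
    buffer.enqueue("reading-3", now=7000)
    # The connection returns after 9000 seconds: reading-1 is older than the TTL.
    print(buffer.flush(now=9000))
```

In this model an outage longer than the TTL loses the oldest readings, so the TTL should be chosen with the longest expected disconnection in mind.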

If fine tuning is required, both the TTL and the priority of messages can also be defined in the routing rules for the edgeHub. The actual number of messages that can be stored of course depends on the disk size of the device.
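A minimal sketch of such per-route settings is shown below, generated from Python so that the structure is easy to inspect. We assume here that per-route `priority` and `timeToLiveSecs` require edgeHub schema version 1.1 and that lower priority numbers are delivered first; please verify both against the IoT Edge documentation for your runtime version. The route names, the `alerts` output and the chosen values are hypothetical.

```python
import json


def edge_hub_desired(default_ttl_secs, routes):
    """Build the $edgeHub section of a deployment manifest with per-route settings."""
    return {
        "$edgeHub": {
            "properties.desired": {
                "schemaVersion": "1.1",  # assumption: needed for per-route priority and TTL
                "routes": routes,
                "storeAndForwardConfiguration": {"timeToLiveSecs": default_ttl_secs},
            }
        }
    }


# Hypothetical routes: alerts are kept longer and sent before regular telemetry.
ROUTES = {
    "alerts": {
        "route": "FROM /messages/modules/SimulatedTemperatureSensor/outputs/alerts INTO $upstream",
        "priority": 0,
        "timeToLiveSecs": 86400,
    },
    "telemetry": {
        "route": "FROM /messages/modules/SimulatedTemperatureSensor/outputs/temperatureOutput INTO $upstream",
        "priority": 5,
        "timeToLiveSecs": 1800,
    },
}


def delivery_order(routes):
    """Return route names ordered by priority, assuming lower numbers go first."""
    return sorted(routes, key=lambda name: routes[name]["priority"])


if __name__ == "__main__":
    print(json.dumps(edge_hub_desired(7200, ROUTES), indent=2))
    print(delivery_order(ROUTES))
```

Routes without their own `timeToLiveSecs` would fall back to the value in `storeAndForwardConfiguration`.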

Device twin updates, cloud-to-device (C2D) messages and direct method calls that target a temporarily offline device will also be queued in the IoT Hub whilst the edge is offline. They will be sent on as long as the device reappears within the defined TTL period for C2D messages or the timeout for a direct method. This needs to be considered in the solution design to avoid potential message loss or operations being executed after they are no longer needed. The most robust way of communicating configuration, operational parameters and status remains the device or module twins.

See Operate devices offline – Azure IoT Edge | Microsoft Docs if you would like to learn more about offline behavior of IoT Edge devices.

Conclusion

We have walked through the practical application of nesting IoT Edge devices in industrial IoT and shown some best practices for managing the complexity of many heterogeneous workloads on the edge. For more technical detail, please visit the linked GitHub repository.

Want to try out and deploy a simulated Purdue network model using IoT Edge? Look at this GitHub repository!

Stay informed. For the latest updates on new releases, tools, and resources, stay tuned to this blog and follow us @MSIoTDevs on Twitter.

