This article is contributed. See the original author and article here.
What is the issue?
The customer asked an SI (system integration) partner to migrate their system to the cloud as a “lift-and-shift”, but after the migration, session affinity did not work properly.
Environment and deployment topology
Their deployment topology is listed below. The issue occurred after the migration completed. Note that the customer did not configure the system across multiple regions for availability.
- Azure Load Balancer (ALB) : Traffic routing is based on protocol and client IP.
- Network virtual appliance (NVA)
- L7 Load Balancer (L7 LB) : Active-Active configuration.
- Reverse Proxy (Apache HTTP server)
- Virtual Machine (VM)
- Packaged application
- Database (Oracle)
The customer also had the following requests:
- We’d like to configure cookie-based session affinity.
- We’d like to achieve it as inexpensively as possible.
If the packaged application were hosted on a Java EE application server, session affinity would typically be achieved through application-server clustering or session sharing with an in-memory data grid or cache. However, they could not configure an application server cluster, since clustering is restricted in the edition they use. Knowing that ALB has no session affinity feature, the SI partner instead deployed L7 LB NVAs behind ALB to achieve session affinity.
Let’s imagine the causes of this issue
Many readers can probably guess the root cause just by looking at the deployment topology above. The following points should be checked.
- Would the source IP of inbound traffic to ALB (public) change? Specifically, would the global IP change when the local IP is translated to a global IP via SNAT on the customer site?
- ALB has no session affinity feature. Therefore, if the source IP of inbound traffic changes, the destination VM hosting the packaged application may also change.
- Would the reverse proxy cause side effects?
- Would the L7 LB NVAs deployed behind ALB work as expected? Is session information shared between the two NVAs?
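The first two checkpoints hinge on how hash-based load distribution behaves when the source IP changes. As a rough illustration only (the backend names are hypothetical, and ALB’s actual hash is an internal implementation detail), here is a minimal Python sketch of routing by a hash of the connection tuple:

```python
import hashlib

BACKENDS = ["nva-1", "nva-2"]  # hypothetical L7 LB NVA pool

def pick_backend(src_ip, src_port, dst_ip, dst_port, proto):
    """Pick a backend from a hash of the connection tuple,
    as a stand-in for a tuple-hash-based load balancer."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return BACKENDS[digest % len(BACKENDS)]

# Same client, but SNAT presents two different global IPs:
a = pick_backend("203.0.113.10", 443, "20.0.0.1", 443, "tcp")
b = pick_backend("198.51.100.7", 443, "20.0.0.1", 443, "tcp")
print(a, b)  # the two requests may land on different NVAs
```

Because the source IP is part of the hash key, a client whose global IP changes mid-session can be hashed to a different backend, which is exactly the symptom the customer observed.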
This issue occurred due to the following behavior.
- The source IP of inbound traffic sometimes changed.
- When the source IP changed, ALB (public) treated the traffic as coming from a different client and routed it to another L7 LB NVA.
- The L7 LB NVAs were deployed behind ALB for session affinity, but they did not work as expected because session information was not shared between them. When inbound traffic was routed to one L7 LB NVA, that NVA had no way to identify session continuity, so it treated the traffic as coming from a different client.
The following article describes the traffic distribution rules.
Configure the distribution mode for Azure Load Balancer
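The distribution modes differ in which fields of the connection tuple feed the hash. As a simplified sketch (not ALB’s actual algorithm; backend names are hypothetical), the default mode hashes the full 5-tuple, while source IP affinity mode ignores the ports:

```python
import hashlib

BACKENDS = ["backend-a", "backend-b", "backend-c"]

def route(mode, src_ip, src_port, dst_ip, dst_port, proto):
    # Default mode hashes the full 5-tuple; source IP affinity
    # mode ignores the ports, so every connection from one IP
    # sticks to one backend -- until SNAT changes that IP.
    if mode == "5-tuple":
        key = (src_ip, src_port, dst_ip, dst_port, proto)
    else:  # source IP affinity (2-tuple)
        key = (src_ip, dst_ip)
    digest = int(hashlib.sha256(repr(key).encode()).hexdigest(), 16)
    return BACKENDS[digest % len(BACKENDS)]

# Source IP affinity keeps one client IP on one backend across
# connections, regardless of the ephemeral source port:
picks = {route("SourceIP", "203.0.113.10", p, "20.0.0.1", 443, "tcp")
         for p in (49152, 49153, 49154)}
print(len(picks))  # 1
```

Note that even source IP affinity cannot help in this case: when SNAT rotates the client’s global IP, the hash key itself changes, so the load balancer has no way to recognize the same client.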
The following table describes what happened in each component.
|Component|What happens?|
|---|---|
|ALB (Public)|The traffic actually comes from the same client, but it is sometimes NATed to a different global IP. ALB (public) then treats it as traffic from a different client and routes it to any of the L7 LB NVAs, which may differ from the NVA that processed the previous traffic from that client.|
|L7 LB NVA|The L7 LB NVAs run Active-Active without sharing session information, so neither NVA can tell whether traffic comes from the same client. Each NVA may therefore route traffic to any reverse proxy NVA, which may differ from the one that processed the previous traffic.|
|ALB (Internal)|If the reverse proxy NVA that the current traffic passed through differs from the one that processed the previous traffic, ALB (Internal) sees a different source IP, treats the traffic as coming from a different client, and routes it to any internal L7 LB NVA, which may differ from the one that processed the previous traffic.|
|Internal L7 LB NVA|Same as above: session information is not shared between the internal L7 LB NVAs, so neither can tell whether traffic comes from the same client. Traffic may therefore be routed to any VM hosting the packaged application, which may differ from the VM that processed the previous traffic.|
|Packaged Application|Because of the inconsistent routing described above, traffic was sometimes routed to the VM that handled the previous traffic and sometimes to a different VM.|
I pointed out the items to fix, and the SI partner reconfigured the component topology. After that, traffic was routed to the expected packaged application node.
- ALB, the L7 LB NVAs, and the reverse proxy NVAs were replaced with Azure Application Gateway (App GW).
- Cookie-based affinity was enabled following the document below.
Enable Cookie based affinity with an Application Gateway
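Cookie-based affinity sidesteps the source IP problem entirely: the backend choice is recorded in a cookie on the first response, and later requests are routed by that cookie rather than by the client’s address. Here is a minimal Python sketch of the idea (a simplification; App GW’s actual affinity cookie handling is more involved, and the names below are hypothetical):

```python
import random

BACKENDS = ["app-vm-1", "app-vm-2"]  # hypothetical application VMs

def handle(request_cookies, src_ip):
    """Route by affinity cookie if present; otherwise pick any
    backend and return the cookie the proxy would set on the
    response. Note the source IP plays no part in routing."""
    if "affinity" in request_cookies:
        return request_cookies["affinity"], request_cookies
    backend = random.choice(BACKENDS)  # first request: any backend
    return backend, {**request_cookies, "affinity": backend}

# First request carries no cookie; the proxy picks a backend and sets one.
backend1, cookies = handle({}, "203.0.113.10")
# A follow-up request arrives with a *different* source IP (SNAT rotated),
# but the cookie still pins it to the same backend.
backend2, _ = handle(cookies, "198.51.100.7")
print(backend1 == backend2)  # True
```

This is why replacing the IP-hash-based chain with a single cookie-aware L7 load balancer resolved the customer’s issue.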
Here is the reconfigured component topology. It helped the customer reduce both NVA-related cost and operational cost.
I did not recommend Azure Front Door as the public L7 LB, because the customer’s system did not span multiple regions, so a global service brought no benefit.
What is Azure Front Door Service?
In this case, App GW’s features covered the customer’s requirements for a reverse proxy. If App GW does not meet such requirements (for example, if a reverse proxy acting as an authentication gateway is required), the following topology would be better.
The following points are important when migrating existing systems to the cloud:
- A good understanding of the services you use.
- A simple deployment topology; in other words, reduce the number of components you use.
Hope this helps.
Brought to you by Dr. Ware, Microsoft Office 365 Silver Partner, Charleston SC.