One of the biggest challenges businesses face is how to integrate data from many disparate sources and turn that data into actionable insights. Big Data Clusters (BDC) is the right choice for Big Data analytics solutions.
As a cloud-native, platform-agnostic, open data platform for analytics at any scale, orchestrated by Kubernetes, BDC runs on Azure Kubernetes Service (AKS), a fully managed Kubernetes service on the Microsoft Azure cloud platform.
For security-critical customers who need a private environment, deploying BDC on an AKS private cluster is a good way to restrict the use of public IP addresses. Furthermore, you can use user-defined routes (UDR) to restrict egress traffic. You can do both with the automation scripts available in the SQL Server samples GitHub repo, under private-aks.
Deploy AKS private cluster with automation scripts
Go to the GitHub repo to deploy an AKS private cluster from a Linux client or from WSL/WSL2. There are two bash scripts you can use to deploy an AKS private cluster:
Use deploy-private-aks.sh to provision a private AKS cluster with a private endpoint. To limit the use of public addresses as well as egress traffic, use deploy-private-aks-udr.sh to deploy BDC with an AKS private cluster and limit egress traffic with user-defined routes (UDR).
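To make the difference between the two variants concrete, here is a minimal sketch of how the underlying `az aks create` call might differ. It is not taken from the repo scripts: the function name and the resource names are hypothetical, and the function only prints a command instead of running it. The flags themselves (`--enable-private-cluster`, `--outbound-type userDefinedRouting`) are standard Azure CLI options.

```shell
#!/usr/bin/env bash
# Illustrative only: prints an az command line, does not execute it.
# Resource names below are hypothetical placeholders.

# build_aks_create_cmd <resource-group> <cluster-name> <mode>
#   mode "private": the API server is reachable through a private endpoint only
#   mode "udr":     same, plus egress follows user-defined routes
build_aks_create_cmd() {
  rg="$1"; name="$2"; mode="$3"
  cmd="az aks create --resource-group $rg --name $name"
  cmd="$cmd --enable-private-cluster --network-plugin azure --load-balancer-sku standard"
  if [ "$mode" = "udr" ]; then
    # A real deployment also needs --vnet-subnet-id pointing at a subnet
    # whose route table already exists before the cluster is created.
    cmd="$cmd --outbound-type userDefinedRouting"
  fi
  printf '%s\n' "$cmd"
}

build_aks_create_cmd my-bdc-rg my-private-aks private
build_aks_create_cmd my-bdc-rg my-private-aks udr
```

Printing the command first makes it easy to review the flags before anything is provisioned.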
Here we take the more common case, where you deploy BDC with an AKS private cluster. After downloading the script to your client environment, run it with the following commands:
chmod +x deploy-private-aks.sh
sudo ./deploy-private-aks.sh
Enter your Azure subscription ID, the resource group name, and the Azure region where you want to deploy your resources.
The deployment takes a few minutes. Once it completes, you can find the deployed resources in the Azure portal.
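A mistyped subscription ID can be caught before the script is started. The following is a small sketch of such a pre-check, assuming only that an Azure subscription ID is a GUID in the usual 8-4-4-4-12 hexadecimal layout. The function name `is_guid` is hypothetical and is not part of the repo scripts.

```shell
#!/usr/bin/env bash
# is_guid <value>: succeeds when the value looks like an Azure subscription ID
# (hexadecimal groups of 8-4-4-4-12 characters separated by hyphens).
is_guid() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$'
}

# Typical usage before launching the deployment script:
#   is_guid "$SUBSCRIPTION_ID" || echo "Subscription ID does not look like a GUID" >&2
```

The check only validates the format. Whether the subscription exists and is accessible is still decided by Azure at login time.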
Access to AKS private cluster
After you deploy a private AKS cluster, you need a VM from which to connect to it. There are multiple ways to manage an AKS private cluster, and the AKS documentation describes the options. Here we use the easiest one: provision a management VM that has all the required SQL Server 2019 big data tools installed and resides in the same VNet as your AKS private cluster, then connect to that VM to get access to the private AKS cluster.
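Once you are logged in to the management VM, a quick check can confirm that the cluster really is private before you go further. The sketch below assumes the Azure CLI and kubectl are installed on that VM. The function name and the resource names in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# is_private_cluster <resource-group> <cluster-name>
# Succeeds when AKS reports the API server as private.
is_private_cluster() {
  value=$(az aks show --resource-group "$1" --name "$2" \
    --query apiServerAccessProfile.enablePrivateCluster --output tsv)
  case "$value" in
    true|True) return 0 ;;
    *) return 1 ;;
  esac
}

# Typical usage on the management VM (names are placeholders):
#   az login
#   az aks get-credentials --resource-group my-bdc-rg --name my-private-aks
#   is_private_cluster my-bdc-rg my-private-aks && kubectl get nodes
```

If `kubectl get nodes` lists the nodes from the management VM but times out from a machine outside the VNet, the private endpoint is working as intended.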
Deploy BDC with AKS private cluster with automation script
You can download the script deploy-bdc.sh to deploy BDC without a public endpoint:
chmod +x deploy-bdc.sh
sudo ./deploy-bdc.sh
The script prompts you to set the BDC admin username and password, and then kicks off the BDC cluster deployment.
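A weak password is a common reason for a deployment to fail late, so it can be worth checking it up front. The sketch below assumes the default SQL Server password policy (at least eight characters, drawn from at least three of four character classes). The function `password_ok` is hypothetical. The `AZDATA_USERNAME` and `AZDATA_PASSWORD` environment variables are the ones azdata reads, but whether deploy-bdc.sh sets them in this way is an assumption.

```shell
#!/usr/bin/env bash
# password_ok <password>
# Succeeds when the password has at least 8 characters and uses at least
# 3 of these 4 classes: uppercase, lowercase, digits, other symbols.
password_ok() {
  pw="$1"
  [ "${#pw}" -ge 8 ] || return 1
  classes=0
  for pattern in '[[:upper:]]' '[[:lower:]]' '[[:digit:]]' '[^[:alnum:]]'; do
    if printf '%s\n' "$pw" | grep -q "$pattern"; then
      classes=$((classes + 1))
    fi
  done
  [ "$classes" -ge 3 ]
}

# Typical usage for a manual deployment (values are placeholders):
#   export AZDATA_USERNAME=bdcadmin
#   read -r -s -p "BDC password: " AZDATA_PASSWORD && export AZDATA_PASSWORD
#   password_ok "$AZDATA_PASSWORD" && \
#     azdata bdc create --config-profile <config-dir> --accept-eula yes
```

Reading the password with `read -s` keeps it out of the shell history and off the screen.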
At the end of the deployment, the script lists all the BDC endpoints.
Connect to BDC in AKS private cluster
Make sure all components of your BDC cluster show a healthy status:
azdata bdc status show
If all goes well, every component in the output reports a healthy state.
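As a complementary check at the Kubernetes level, you can also count the pods that are not yet running. This is a minimal sketch, not part of the repo scripts: the function name is hypothetical, and the namespace `mssql-cluster` in the usage comment is an assumption based on the default cluster name, so adjust it if you named your cluster differently.

```shell
#!/usr/bin/env bash
# count_unhealthy_pods
# Reads `kubectl get pods --no-headers` output on stdin and prints how many
# pods have a STATUS (third column) other than Running or Completed.
# This looks at the status column only, not at the READY container count.
count_unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Typical usage on the management VM (namespace is an assumption):
#   kubectl get pods --namespace mssql-cluster --no-headers | count_unhealthy_pods
```

A result of 0 means no pod is pending or failing, which should agree with a healthy `azdata bdc status show` report.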
You can use the cluster's SQL Server master instance endpoint to connect to the BDC cluster with SQL Server Management Studio or Azure Data Studio.
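One detail that often trips people up is that SQL Server clients expect the port after a comma rather than a colon. The sketch below formats the server name accordingly. The helper name and the IP address are hypothetical, and the default port 31433 is an assumption that should be confirmed against the endpoint list printed by the deployment.

```shell
#!/usr/bin/env bash
# sql_server_address <host> [port]
# Prints "host,port", the form expected by sqlcmd -S and by the
# "Server name" box in SQL Server Management Studio.
# The default port 31433 is an assumption; check your endpoint list.
sql_server_address() {
  printf '%s,%s\n' "$1" "${2:-31433}"
}

# Typical usage from the management VM (IP address is a placeholder):
#   azdata bdc endpoint list --output table
#   sqlcmd -S "$(sql_server_address 10.0.0.10)" \
#     -U "$AZDATA_USERNAME" -P "$AZDATA_PASSWORD" -Q "SELECT @@VERSION"
```

Because the cluster has no public endpoint, the client must run inside the VNet, for example on the management VM, or reach it through a connected network.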
As we saw in the first part of this article, businesses are looking for a secure, portable way to create value from multiple sources of data. Using SQL Server Big Data Clusters (BDC) in an Azure Kubernetes Service (AKS) private cluster, they get exactly that. You've seen how to use the two script variations available in our repository to fit your network environment and security requirements. Before deploying in your environment, you can also customize the scripts to your specific requirements, such as IP address ranges or flags that add or remove an AKS feature when the cluster is created.