
As continuous integration and continuous delivery (CI/CD) culture has grown in popularity, it has brought the challenge of automating everything, with the aim of making processes easier to run and maintain for everyone.


 


One of the most valuable aspects of CI/CD is the integration of the Infrastructure as Code (IaC) concept. With IaC we can version our infrastructure, save money, and create new environments in minutes, among many other benefits. I won’t go deeper into IaC here, but if you want to learn more, see: The benefits of Infrastructure as Code


 


IaC can also bring some challenges when creating the resources a project needs. Writing all the infrastructure scripts is a task usually assigned to infrastructure engineers, and sometimes, for whatever reason, they are not available to help.


 


As a Data Engineer, I would like to help you understand the CI/CD process with a hands-on example. You’ll learn how to create an Azure Databricks workspace through Terraform and Azure DevOps, whether you are creating projects by yourself or supporting your Infrastructure Team.


 


In this article, you’ll learn how to integrate Azure Databricks with Terraform and Azure DevOps. The main reason for writing it is that I had some difficulty finding information that covers these 3 technologies together.


 


First of all, you’ll need some prerequisites:


 



  • Azure Subscription

  • Azure Resource Group (you can use an existing one)

  • Azure DevOps account

  • Azure Storage Account with a container named “tfstate”

  • Visual Studio Code (it’s up to you)


So, let’s start and have some fun.


 


Go ahead and download or clone the GitHub repository databrick-tf-ado and check out the demo-start branch.


In the root folder you’ll see a file named main.tf, and 2 more files in the folder modules/databricks-workspace.


 


Vanessa_Segovia_0-1651505246300.png


 


Note that this is a basic example. You can find more information about all the Databricks features the Terraform provider supports at this link: https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs


 


Now, go to the main.tf file in the root folder and find line 8, where the declaration of the azurerm backend starts:


 


 

  backend "azurerm" {
    resource_group_name  = "demodb-rg"
    storage_account_name = "demodbtfstate"
    container_name       = "tfstate"
    key                  = "dev.terraform.tfstate"
  }

 


 


There you need to change the values of resource_group_name and storage_account_name to match your subscription. You can find those values in the Azure Portal; both resources must already exist.
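For context, a backend declaration like this one sits inside the top-level terraform block, next to the provider requirements. The following is a minimal sketch of that layout, not the repository’s actual file: the required_providers entries and the provider block are assumptions, and the resource group and storage account names are the article’s sample values, which you replace with your own.

```hcl
terraform {
  # Assumed provider requirements; pin versions in a real project.
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
    databricks = {
      # Registry namespace used by the documentation link in this article.
      source = "databrickslabs/databricks"
    }
  }

  # Remote state stored as a blob in an existing storage account.
  backend "azurerm" {
    resource_group_name  = "demodb-rg"             # replace with your resource group
    storage_account_name = "demodbtfstate"         # replace with your storage account
    container_name       = "tfstate"               # container from the prerequisites
    key                  = "dev.terraform.tfstate" # blob name of the state file
  }
}

# The azurerm provider requires a features block, even when it is empty.
provider "azurerm" {
  features {}
}
```

Terraform does not create the storage account or the container for you, which is why they appear in the prerequisites.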


 


storageaccount.png


 


 


The main.tf file in the root folder references a module called “databricks-workspace”. In that module’s folder you can see 2 more files: main.tf and variables.tf.
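A module reference of this kind could look roughly like the sketch below. The source path follows the folder layout described earlier; the input names resource_group_name and workspace_name are hypothetical and may not match the variables the repository actually declares.

```hcl
# Sketch of a module call in the root main.tf.
module "databricks-workspace" {
  # Relative path to the module folder inside the repository.
  source = "./modules/databricks-workspace"

  # Hypothetical inputs; the module's variables.tf defines the real ones.
  resource_group_name = "demodb-rg"
  workspace_name      = "demodb-workspace"
}
```

Keeping the resources in a module lets you reuse the same definitions for another environment by calling the module again with different input values.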


 


main.tf contains the definitions, in the format Terraform requires, to create a Databricks workspace, a cluster, a secret scope, a secret and a notebook. variables.tf contains the values that could change depending on the environment.
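To give an idea of what such definitions look like, here is a condensed sketch written as a single file. It is not the content of the repository: every resource label, variable name, default value, path and the notebook body are assumptions chosen for illustration, and it assumes you authenticate through the Azure CLI or a service principal.

```hcl
terraform {
  required_providers {
    azurerm    = { source = "hashicorp/azurerm" }
    databricks = { source = "databrickslabs/databricks" }
  }
}

provider "azurerm" {
  features {}
}

# Values that change per environment (the role of variables.tf).
variable "resource_group_name" {
  type = string
}

variable "location" {
  type    = string
  default = "eastus" # assumed region
}

variable "workspace_name" {
  type    = string
  default = "demodb-workspace"
}

variable "secret_value" {
  type      = string
  sensitive = true
}

# The Azure Databricks workspace itself.
resource "azurerm_databricks_workspace" "this" {
  name                = var.workspace_name
  resource_group_name = var.resource_group_name
  location            = var.location
  sku                 = "standard"
}

# The databricks provider talks to the workspace created above.
provider "databricks" {
  host = azurerm_databricks_workspace.this.workspace_url
}

# Look up a runtime and node type instead of hard-coding them.
data "databricks_spark_version" "latest_lts" {
  long_term_support = true
  depends_on        = [azurerm_databricks_workspace.this]
}

data "databricks_node_type" "smallest" {
  local_disk = true
  depends_on = [azurerm_databricks_workspace.this]
}

resource "databricks_cluster" "demo" {
  cluster_name            = "demo-cluster"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 20
  num_workers             = 1
}

resource "databricks_secret_scope" "demo" {
  name                     = "demo-scope"
  initial_manage_principal = "users" # needed on the standard pricing tier
}

resource "databricks_secret" "demo" {
  key          = "demo-key"
  string_value = var.secret_value
  scope        = databricks_secret_scope.demo.id
}

resource "databricks_notebook" "demo" {
  path           = "/Shared/demo-notebook"
  language       = "PYTHON"
  content_base64 = base64encode("print('hello from Terraform')")
}
```

In a real project the provider blocks usually live in the root module rather than in the reusable module; they are shown together here only to keep the sketch readable in one piece.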


 


Now that you have changed the values mentioned above, upload the code to a GitHub or Azure DevOps repository. If you need assistance with that, visit these pages: GitHub or DevOps.


 


At this point we have our GitHub or Azure DevOps repository configured with the names we require, so let’s create our pipeline to deploy our Databricks environment into our Azure subscription.


 


First, go to your Azure subscription and check that you don’t already have a Databricks workspace called demodb-workspace.


 


portalazurebefore.png


 


 


You’ll need to install an extension so that Azure DevOps can run Terraform commands, so go to Terraform Extension.


 


Once it is installed, go to your project in Azure DevOps, click on Pipelines > Releases and create a “New pipeline”. You get the option of creating the pipeline with YAML or with the Editor; I’ll choose the Editor so we can see the steps more clearly.


 


Vanessa_Segovia_3-1651505246308.png


 


 


In the Artifacts section of the pipeline, click “Add an artifact”, select your source type (the provider where you uploaded your repository), fill in all the required information as in the image below, and click “Add”.


 


addartifact.png


 


 


Then click on Add stage in the Stages section, choose Empty job, and name the stage “DEV”.


 


addstage.png


 


After that, click on Jobs below the name of the stage.


Vanessa_Segovia_6-1651505246314.png


 


In the Agent job, press the “+” button, search for “terraform” and select “Terraform tool installer”.


 


addinstallterraform.png


Leave the default values.


 


Then add 3 more tasks of type “Terraform”.


 


addterraformtask.png


 


Name the second task, the one right after the installer, “Init”, and fill in the required information as in the image:


 


init.png


 


 


For all 3 of these tasks, set your subscription, resource group, storage account and container. There is also a field labeled key; set it to “dev.terraform.tfstate”, which is the name of the state file Terraform uses to keep track of your infrastructure changes.


 


suscription.png


 


Name the next task “Plan”.


 


plan.png


 


Name the last task “Apply”.


 


apply.png


 


Now change the name of your pipeline and save it.


 


namepipeline.png


 


Now we only need to create a Release to test it.


 


You can monitor the progress while the release runs.


 


progress.png


 


 


When it finishes, if everything went well, you’ll see your pipeline marked as successful.


 


success.png


 


Lastly, let’s confirm in the Azure portal that everything was created correctly.


 


finalportal.png


 


Then log in to your workspace and run the notebook, so you can test that the cluster, the scope, the secret and the notebook are working correctly.


 


workspace.png


 


 


With that, you can easily keep your environments safe from ad hoc changes by contributors, because the pipeline becomes the only way to accept modifications to your infrastructure.


 


Let us know if you have any comments or questions.


 


 


 


 


 


 


 


 
