Azure Databricks supports Azure Active Directory (AAD) tokens (GA) for authenticating to REST API 2.0. AAD token support provides a more secure authentication mechanism: Azure Data Factory can use its system-assigned managed identity when integrating with Azure Databricks.


 


Benefits of using Managed identity authentication:



  • Managed identities eliminate the need for data engineers to manage credentials: the Azure resource gets an identity in Azure AD and uses it to obtain Azure Active Directory (Azure AD) tokens. In our case, Data Factory obtains the tokens using its managed identity and accesses the Databricks REST APIs.

  • It lets you provide fine-grained access control to particular Data Factory instances using Azure AD. 

  • It helps avoid Databricks Personal Access Tokens, which act as passwords and must be treated with care, putting the additional responsibility of securing them on data engineers.


Earlier, you could access the Databricks Personal Access Token through Azure Key Vault using Managed Identity. Now, you can use Managed Identity directly in the Databricks linked service, completely removing the need for Personal Access Tokens.
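To make the token flow more concrete, here is a minimal Python sketch of what an Azure resource with a managed identity could do by hand: request an Azure AD token for Azure Databricks and attach it as a bearer token to a REST API 2.0 call. Data Factory performs the equivalent internally, so none of this is needed when you use the linked service. The sketch assumes it runs on an Azure resource where the Instance Metadata Service endpoint is reachable (for example a VM); the function names and the choice of the `clusters/list` endpoint are illustrative, not taken from the original article.

```python
import json
import urllib.parse
import urllib.request

# Well-known Azure AD application ID of the Azure Databricks service.
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

# Instance Metadata Service token endpoint. Assumption: the code runs on an
# Azure resource (such as a VM) that has a managed identity assigned.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request() -> urllib.request.Request:
    """Build the request that asks the managed identity for a Databricks token."""
    query = urllib.parse.urlencode(
        {"api-version": "2018-02-01", "resource": DATABRICKS_RESOURCE_ID}
    )
    return urllib.request.Request(
        f"{IMDS_TOKEN_URL}?{query}", headers={"Metadata": "true"}
    )


def build_clusters_list_request(workspace_url: str, aad_token: str) -> urllib.request.Request:
    """Build a REST API 2.0 call that authenticates with the Azure AD token."""
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {aad_token}"})


def list_clusters(workspace_url: str) -> dict:
    """Fetch a token, then call the workspace. Requires network access from Azure."""
    with urllib.request.urlopen(build_token_request(), timeout=10) as resp:
        token = json.load(resp)["access_token"]
    request = build_clusters_list_request(workspace_url, token)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Note that no secret appears anywhere in the code: the identity of the calling resource is what authorizes the token request, which is the same property the linked service relies on.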


 


High-level steps on getting started:



  1. Grant the Data Factory instance ‘Contributor’ permissions in Azure Databricks Access Control.
    databricks-grant-access-to-adf-msi-1.jpg databricks-grant-access-to-adf-msi-2.jpg

  2. Create a new ‘Azure Databricks’ linked service in the Data Factory UI, select the Databricks workspace from step 1, and select ‘Managed service identity’ under authentication type.
    databricks-grant-access-to-adf-msi-3.jpg

     



Note: If the dropdowns under ‘workspace id’ are not populated even after you have successfully granted the permissions (step 1), toggle between the cluster types.

Sample Linked Service payload:


 


 


 


 


 

{
    "name": "AzureDatabricks_ls",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "annotations": [],
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://adb-***.*.azuredatabricks.net",
            "authentication": "MSI",
            "workspaceResourceId": "/subscriptions/******-3ab0-48f2-b171-0f50ec******/resourceGroups/work-rg/providers/Microsoft.Databricks/workspaces/databricks-****",
            "existingClusterId": "****-030259-dent495"
        }
    }
}
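If you generate linked service definitions from scripts, the payload above can be assembled and sanity-checked in a few lines. The following is a sketch rather than an official tool: the function names are hypothetical, the argument values in the usage example are placeholders, and the check assumes that `accessToken` is the property a token-based Databricks linked service would carry.

```python
import json


def build_msi_linked_service(name: str, domain: str,
                             workspace_resource_id: str,
                             existing_cluster_id: str) -> dict:
    """Assemble a linked service payload shaped like the sample above."""
    return {
        "name": name,
        "type": "Microsoft.DataFactory/factories/linkedservices",
        "properties": {
            "annotations": [],
            "type": "AzureDatabricks",
            "typeProperties": {
                "domain": domain,
                "authentication": "MSI",
                "workspaceResourceId": workspace_resource_id,
                "existingClusterId": existing_cluster_id,
            },
        },
    }


def uses_msi_without_token(payload: dict) -> bool:
    """Return True if the payload authenticates via MSI and carries no token.

    Assumption: a personal access token would appear under "accessToken".
    """
    type_properties = payload["properties"]["typeProperties"]
    return (type_properties.get("authentication") == "MSI"
            and "accessToken" not in type_properties)


if __name__ == "__main__":
    # Placeholder values; substitute your own workspace details.
    payload = build_msi_linked_service(
        name="AzureDatabricks_ls",
        domain="https://<workspace-url>",
        workspace_resource_id="<workspace-resource-id>",
        existing_cluster_id="<cluster-id>",
    )
    print(json.dumps(payload, indent=4))
    print("MSI only:", uses_msi_without_token(payload))
```

A check like `uses_msi_without_token` could run in a deployment pipeline to flag linked services that still embed a token.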

 


 


 


 

Note: There are no secrets or personal access tokens in the linked service definitions!

 
