Using MSI to authenticate on a Synapse Spark Notebook while querying the Storage

This article is contributed. See the original author and article here.

Scenario: The customer wants to configure the notebook to run without the AAD passthrough configuration, using only the workspace managed identity (MSI).


Synapse uses Azure Active Directory (AAD) passthrough by default for authentication between resources.

 


As documented here: https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-secure-credentials-with-tokenlibrary?pivots=programming-language-scala


_When the linked service authentication method is set to Managed Identity or Service Principal, the linked service will use the Managed Identity or Service Principal token with the LinkedServiceBasedTokenProvider provider._


 


The purpose of this post is to walk through this configuration step by step:


 



Prerequisites:


  • The Synapse workspace MSI must have the Storage Blob Data Contributor RBAC role on the storage account. That is also the documented prerequisite.

  • However, I worked with a customer who instead set up ACLs (Read and Execute permissions) on the storage account, and I tested that this works as well.

  • It works whether or not the firewall is enabled on the storage account; enabling the firewall is not mandatory.

  • However, if you have enabled the firewall on the storage account for security reasons, make sure of the following:



grant_storage.png

 

ACL

ACLs.png

 

Step 1:

 


Open Synapse Studio and configure the linked service to this storage account using MSI:

linkedserver.png

 



Step 2:


Using spark.conf.set, point the notebook to the linked service as documented:

val linked_service_name = "LinkedServiceName" // replace with your linked service name

// Configure the Spark session to authenticate through the linked service (MSI)
spark.conf.set("spark.storage.synapse.linkedServiceName", linked_service_name)
spark.conf.set("fs.azure.account.oauth.provider.type", "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedTokenProvider")

// Replace the container and storage account names
val remoteBlobPath = "abfss://Container@StorageAccount.dfs.core.windows.net/"

print("Remote blob path: " + remoteBlobPath)

mssparkutils.fs.ls(remoteBlobPath)

 


 


In my example, I am using mssparkutils to list the container.
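The abfss URI passed to mssparkutils follows a fixed pattern: container, then storage account, then the Data Lake (dfs) endpoint. A minimal, illustrative sketch of building that URI (abfssRoot is my own helper name, not a Synapse API):

```scala
// Illustrative helper only (not part of Synapse's APIs): builds the ABFSS
// root URI that mssparkutils.fs.ls and Spark readers expect, from a
// container name and a storage account name.
def abfssRoot(container: String, account: String): String = {
  require(container.nonEmpty && account.nonEmpty, "container and account must be non-empty")
  s"abfss://$container@$account.dfs.core.windows.net/"
}
```

For example, abfssRoot("mycontainer", "mystorage") yields "abfss://mycontainer@mystorage.dfs.core.windows.net/", which is the same shape as the remote blob path used above.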


 


 


You can read more about mssparkutils here: Introduction to Microsoft Spark utilities – Azure Synapse Analytics | Microsoft Docs


 


 


Additionally:


 


This link covers details about ADF, which is not the focus of this post, but it documents the relevant MSI permissions:


Copy and transform data in Azure Blob storage – Azure Data Factory | Microsoft Docs


Grant the managed identity permission in Azure Blob storage. For more information on the roles, see Use the Azure portal to assign an Azure role for access to blob and queue data.



  • As source, in Access control (IAM), grant at least the Storage Blob Data Reader role.

  • As sink, in Access control (IAM), grant at least the Storage Blob Data Contributor role.
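The two bullets above can be sketched as a small lookup. This is purely illustrative: the role names are the real Azure built-in role names, but minimalRole is a hypothetical helper, not an Azure API:

```scala
// Illustrative only: the minimal built-in Azure RBAC role per copy direction,
// per the bullets above. minimalRole is a hypothetical helper, not an Azure API.
sealed trait Direction
case object Source extends Direction
case object Sink extends Direction

def minimalRole(direction: Direction): String = direction match {
  case Source => "Storage Blob Data Reader"      // reading existing blobs
  case Sink   => "Storage Blob Data Contributor" // writing requires contributor
}
```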


 


That is it!


Liliam UK Engineer


 



Reconnect Series: Steve Banks



Welcome back to Reconnect, the biweekly series that catches up with former MVPs and their current activities.


 


This week we are thrilled to be joined by 14-time titleholder Steve Banks! Hailing from the Seattle area of Washington, Steve is President of Banks Consulting Northwest.


 


The business focuses on servicing the information technology needs of small to medium businesses in the greater Puget Sound region of Washington State. Moreover, Banks Consulting Northwest has participated extensively in Microsoft’s Technology Adoption Program, helping to gather feedback and real-world user experiences of Microsoft solutions in the small business space.


 


Steve has collaborated with Microsoft, Forbes, Hewlett-Packard, Trend Micro, and others on white papers and case studies. Further, he has been awarded the MVP title 14 times since 2004 (Windows Server and Cloud & Datacenter Management) and holds Microsoft Certifications in Windows Client and Server products, including Small Business Server.


 


When he’s not hard at work, Steve plays a vital role in his community. He founded the Puget Sound Small Business Server User Group and likes to keep up to date with all things happening in the Microsoft space.


 


For example, he has contributed to multiple exams and coursework with Microsoft Learning, co-authored books on Small Business Server, and participated in numerous conferences and workshops related to Microsoft Server products and IT consulting.


 


For more on Steve, check out his Twitter @stevenabanks.


 


steve.jpg

Azure Unblogged – Event Hub on Azure Stack Hub


In this latest episode of Azure Unblogged, I am chatting with Manoj Prasad from the Azure Event Hubs team to cover how you can leverage Event Hubs on your Azure Stack Hub in your hybrid cloud environment.


 


Event Hubs on Azure Stack Hub allows you to realize cloud and on-premises scenarios that use streaming architectures. You can use the same features as Event Hubs on Azure, such as Kafka protocol support, a rich set of client SDKs, and the same Azure operational model. Whether you use Event Hubs in a hybrid (connected) scenario or a disconnected scenario, you can build large-scale stream-processing solutions bound only by the size of the Event Hubs cluster you provision.


 


You can watch the video here or on Channel 9


 



 


 



Experiencing Data Access Issue in Azure portal for Log Analytics – 04/21 – Investigating


Initial Update: Wednesday, 21 April 2021 08:14 UTC

We are aware of issues within Log Analytics and are actively investigating. Some customers may experience data access issues and delayed or missed Log Search Alerts in the West Europe region.
  • Work Around: None
  • Next Update: Before 04/21 11:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Vyom

Experiencing Data Access issue in Azure Portal for Many Data Types – 04/21 – Resolved


Final Update: Wednesday, 21 April 2021 02:14 UTC

We've confirmed that all systems are back to normal with no customer impact as of 04/21, 01:31 UTC. Our logs show the incident started on 04/21, 01:15 UTC, and that during the 16 minutes it took to resolve the issue, some customers might have experienced issues accessing data and missed or delayed alerts in the West US 2 region.
  • Root Cause: The failure was due to an issue in one of our backend services.
  • Incident Timeline: 16 minutes – 04/21, 01:15 UTC through 04/21, 01:31 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Saika