This article is contributed. See the original author and article here.

Azure Synapse Analytics Workspace enables you to read the files from Azure Data Lake storage using OPENROWSET(BULK <file url>). In this article you will see how to grant minimal permission to the users who need to analyze files with OPENROWSET(BULK) function.

 

When you try Azure Synapse workspace, you usually start with full permissions (for example as database owner). Once you need to let other users access the files on Data Lake, you probably think that is the minimal permission that you need to assign to user to let them just do ad-hoc analysis without any other permissions.

In this post you will learn how to configure minimal permission to needed to analyze the files on Azure storage.

Creating server principals

SQL endpoint in Synapse workspace uses standard T-SQL syntax to create principals who can access your data. The following statement creates a new login with username testprincipal and password VeryStrongAndSecurePassword1234!!!:

 

create login testprincipal
with password = 'VeryStrongAndSecurePassword1234!!!';

 

Using this T-SQL statement you can create new principals that are assigned to the public role.

Let’s see what we can do with this principal. If we login using this username/password we can try to run the following queries:

 

 

 

select name from  sys.databases
-- success 

use SampleDB
--Msg 916, Level 14, State 2, Line 1
--The server principal "testprincipal" is not able to access the database "SampleDB" under the current security context.

select top 10  *
from openrowset(bulk 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet',
                format='parquet') as a
-- Success

 

By default, a new user can see all databases, execute OPENROWSET to query files on Azure Data Lake storage, but cannot access other databases or create objects. This can be verified using the following function:

JovanPop_0-1593173983885.png

New login  is in the public role and has the permissions to view any database and to run ad-hoc query using OPENROWSET, but no other permissions.

This version of OPENROWSET enables principals to public access storage or in case of Azure AD principal to access files on the storage where storage admin granted Storage Blob Data Reader RBAC role to Azure AD user who access. If you need more fine grained impersonation mechanism, you should create data sources and protect them with credentials in databases. the following section will describe how to enable principal access objects in databases.

Configuring database permissions

Let’s enable user to access database. In the context of some database such as SampleDB execute the following statement:

 

create user testprincipal for login testprincipal;

 

 

 

This statement created a database user that will access current database using the login defined in the previous script. This principal can now connect to SampleDB database, can still use OPENROWSET with absolute URL to read files from storage, and it can also use OPENROWSET that references some public data source:

 

SELECT TOP 10 *
FROM
    OPENROWSET(
        BULK 'puYear=*/puMonth=*/*.parquet',
-- this is a data source that referneces a public location (without credential)
        DATA_SOURCE = 'YellowTaxi', 
        FORMAT='PARQUET'
    ) AS nyc

 

In this case, data source doesn’t have credential and references public location:

 

CREATE EXTERNAL DATA SOURCE YellowTaxi
WITH ( LOCATION = 'https://azureopendatastorage.blob.core.windows.net/nyctlc/yellow/')

 

However, this principal cannot read some files using a pre-defined data source that is protected with credential:

JovanPop_1-1593173983896.png

The issue here is that data source uses credential to access storage, and the current user cannot reference this credential:

 

CREATE EXTERNAL DATA SOURCE SqlOnDemandDemo WITH (
    LOCATION = 'https://sqlondemandstorage.blob.core.windows.net',
    CREDENTIAL = sqlondemand
);

 

In order to access the files via crednetial-protected data source, the principal needs to have references permission on the underlying database scoped credential (in this case sqlondemand credential) that is used in data source:

 

grant references 
    on database scoped credential::sqlondemand
    to testprincipal;

 

Note that the user who uses OPENROWSET with data source still need to have ADMINISTER BULK OPERATIONS permission. If you deny this permission, the principal cannot use OPENROWSET anymore:

 

-- execute from master database context:
deny ADMINISTER BULK OPERATIONS to testprincipal;

 

Note that you need to explicitly DENY permissions and you cannot just REVOKE this permission like in this example:

 

revoke ADMINISTER BULK OPERATIONS to testprincipal;

 

If you try to revoke this permission testprincipal will still be able to execute OPENROWSET (you can confirm this using fn_my_permissions(NULL, ‘SERVER’) function)

This might confuse you, but the reason is that ADMINISTER BULK OPERATIONS is not initially granted to testprincipal. testprincipal belongs to the public role, and in synapse SQL endpoint, public role has ADMINISTER BULK OPERATIONS permission by default. If you want to revoke this permission, you would need to revoke it from the public role using this statement:

 

revoke ADMINISTER BULK OPERATIONS to public;

 

If you will have just the users who will analyze files in ad-hoc manner, you can leave this permission to public role. Otherwise it might be good to revoke it and create separate data analyst role that will have this permission.

Brought to you by Dr. Ware, Microsoft Office 365 Silver Partner, Charleston SC.