This article is contributed. See the original author and article here.

Error: File https://storageaccount.dfs.core.windows.net/parquettest/filename.snappy.parquet cannot be opened because it does not exist or it is used by another process.

 

Here it goes:

 

  1. First check if your account is Data Lake Storage Gen2. This is defined during the creation of the storage account. Fig. 1:

datalake.png

Fig. 1 Data Lake

 

  1. I am using for this test Selected NetworkAllow Trusted Microsoft services. But It should work with All Networks as well. Fig. 2.firewall.pngFig. 2  Firewall
  2. Add the RBAC permission to your workspace. Fig. 3.workspace_permission.pngFig. 3  RBAC
  3. Open Synapse Studio
    1. Open Data option for SQL on Demand
    2. Create a database
    3. Point to the database created.
    4. Create the MSI which requires create a master key
    5. Run OpenRowset using that credential as data source.

 

Follow the script:

 

CREATE DATABASE [YourDatabase]

GO

——-Change to YourDatabase

CREATE MASTER KEY ENCRYPTION BY PASSWORD‘XXXXX!0000’;

— Create a database scoped credential.

CREATE DATABASE SCOPED CREDENTIAL AppCred WITH IDENTITY = ‘Managed Identity’

 

script.png

 

 

Note my Storage Account is called Criativity, you should use for the MSI .dfs instead of .blob.

 

 

CREATE EXTERNAL DATA SOURCE YourDataSourceName_criativity

WITH (

LOCATION = https://criativity.dfs.core.windows.net,

cREDENTIAL = [AppCred]

)

SELECT  TOP 100 *

FROM OPENROWSET(

        BULK ‘parquet/*.parquet’,

        DATA_SOURCE = ‘YourDataSourceName_criativity’,

        FORMAT = ‘PARQUET’

as X

 

That is it!

 

Thanks to the collegues Charl Roux and Veljko Vasic. :)

 

Liliam Leme

UK Engineer

 

 

 

 

 

Brought to you by Dr. Ware, Microsoft Office 365 Silver Partner, Charleston SC.