Error: File 'https://storageaccount.dfs.core.windows.net/parquettest/filename.snappy.parquet' cannot be opened because it does not exist or it is used by another process.
Here are the steps I followed:
- First, check that your storage account is Data Lake Storage Gen2. This is defined when the storage account is created (Fig. 1).
Fig. 1 Data Lake
- For this test I am using Selected networks with "Allow trusted Microsoft services" enabled, but it should work with All networks as well (Fig. 2).
Fig. 2 Firewall
- Add the RBAC permission on the storage account for your workspace, so that the workspace managed identity can read the files (Fig. 3).
Fig. 3 RBAC
- Open Synapse Studio
- Open the Data option for SQL on-demand.
- Create a database.
- Switch to the database you created.
- Create the managed identity (MSI) credential, which requires creating a master key first.
- Run OPENROWSET through a data source that uses that credential.
The script is as follows:
CREATE DATABASE [YourDatabase]
GO
-- Change to YourDatabase
USE [YourDatabase]
GO
-- Create the master key, which is required before creating the credential.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'XXXXX!0000';
-- Create a database scoped credential.
CREATE DATABASE SCOPED CREDENTIAL AppCred WITH IDENTITY = 'Managed Identity';
-- Note: my storage account is called Criativity. For the MSI, use the .dfs endpoint instead of .blob.
CREATE EXTERNAL DATA SOURCE YourDataSourceName_criativity
WITH (
    LOCATION = 'https://criativity.dfs.core.windows.net',
    CREDENTIAL = [AppCred]
);
-- Query the Parquet files through the data source.
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'parquet/*.parquet',
    DATA_SOURCE = 'YourDataSourceName_criativity',
    FORMAT = 'PARQUET'
) AS X;
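If the query still fails, it may help to confirm that the data source really points at the credential you expect, and to see which files the wildcard matches. The queries below are a minimal sketch of such a check, not part of the original script. They assume the catalog views sys.external_data_sources and sys.database_scoped_credentials and the filepath() function are available in your SQL on-demand pool, and they reuse the object names from the script above.

```sql
-- Sketch: show the data source together with the credential it uses.
-- A NULL credential_name would mean the data source has no credential attached.
SELECT ds.name AS data_source_name,
       ds.location,
       c.name AS credential_name,
       c.credential_identity
FROM sys.external_data_sources AS ds
LEFT JOIN sys.database_scoped_credentials AS c
    ON c.credential_id = ds.credential_id
WHERE ds.name = 'YourDataSourceName_criativity';

-- Sketch: list the files matched by the wildcard and count the rows in each.
SELECT X.filepath() AS file_path,
       COUNT_BIG(*) AS row_count
FROM OPENROWSET(
    BULK 'parquet/*.parquet',
    DATA_SOURCE = 'YourDataSourceName_criativity',
    FORMAT = 'PARQUET'
) AS X
GROUP BY X.filepath();
```

For the setup described here, the first query would be expected to show the .dfs endpoint as the location and 'Managed Identity' as the credential identity.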
That is it!
Thanks to my colleagues Charl Roux and Veljko Vasic. :)
Liliam Leme
UK Engineer