
The purpose of this post is to discuss managed and external tables when querying from SQL on-demand (serverless).

Thanks to my colleague Dibakar Dharchoudhury for the really nice discussion related to this subject.


According to the docs (Shared metadata tables – Azure Synapse Analytics | Microsoft Docs):


  • Managed tables

Spark provides many options for how to store data in managed tables, such as TEXT, CSV, JSON, JDBC, PARQUET, ORC, HIVE, DELTA, and LIBSVM. These files are normally stored in the warehouse directory where managed table data is stored.

  • External tables

Spark also provides ways to create external tables over existing data, either by providing the LOCATION option or using the Hive format. Such external tables can be over a variety of data formats, including Parquet.

Azure Synapse currently only shares managed and external Spark tables that store their data in Parquet format with the SQL engines.


Note: The Spark created, managed, and external tables are also made available as external tables with the same name in the corresponding synchronized database in serverless SQL pool.
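As a quick illustration of the rule quoted from the docs, the sketch below checks whether a Spark table would be shared with the SQL engines based on its storage format. The function name `is_shared_with_sql` and the `SHARED_FORMATS` set are hypothetical, and the set only mirrors the statement as quoted here (Parquet only), so check the current documentation before relying on it.

```python
# Formats that, per the docs quoted in this post, are shared with the SQL engines.
# Assumption: this set mirrors the quoted statement and may be out of date.
SHARED_FORMATS = {"parquet"}


def is_shared_with_sql(table_format: str) -> bool:
    """Return True if a Spark table stored in `table_format` would be shared."""
    return table_format.strip().lower() in SHARED_FORMATS


for fmt in ("PARQUET", "CSV", "ORC"):
    print(fmt, is_shared_with_sql(fmt))
```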


The following is an example of an external table created in Spark on top of a Parquet file:


1) Authentication:

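from pyspark.sql import SparkSession

blob_account_name = "StorageAccount"
blob_container_name = "ContainerName"

sc = SparkSession.builder.getOrCreate()
# Get the SAS token through the Synapse token library, using the linked service name
token_library = sc._jvm.com.microsoft.azure.synapse.tokenlibrary.TokenLibrary
blob_sas_token = token_library.getConnectionString("LInkedServerName")

# Register the SAS token for this container and storage account
sc.conf.set(
    'fs.azure.sas.%s.%s.blob.core.windows.net' % (blob_container_name, blob_account_name),
    blob_sas_token)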


Note my linked service configuration:



2) External table:



filepath ='wasbs://'
# Load the Parquet file into a DataFrame
df = sc.read.load(filepath, format='parquet')
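One common way to register a DataFrame as an external table in PySpark is to write it with an explicit `path`. The following is a minimal sketch of that approach, not necessarily the exact code used for this post. It assumes `df` is the DataFrame loaded from the Parquet file and that the target database already exists; the helper names `wasbs_url` and `save_external_table`, the table name, and the target folder are all hypothetical.

```python
def wasbs_url(container: str, account: str, path: str = "") -> str:
    """Build a wasbs:// URL for a blob container (standard Azure Blob URL layout)."""
    return "wasbs://%s@%s.blob.core.windows.net/%s" % (container, account, path.lstrip("/"))


def save_external_table(df, table_name: str, location: str) -> None:
    """Write `df` as Parquet to `location` and register it as an external table.

    Passing an explicit path makes the table external (unmanaged): dropping
    the table later leaves the files at `location` in place.
    """
    (df.write
       .mode("overwrite")
       .format("parquet")
       .option("path", location)
       .saveAsTable(table_name))


# Hypothetical usage inside a Synapse notebook, where `df` already exists:
# location = wasbs_url("ContainerName", "StorageAccount", "external/mytable")
# save_external_table(df, "SeverlessDB.externaltable", location)
```

The `path` option is the design choice that matters here: without it, the same `saveAsTable` call would create a managed table instead.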

The table can now be queried from SQL serverless.
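As a sketch of what such a query can look like: Spark tables show up in the synchronized serverless database under the `dbo` schema, so a three-part name works. The helper `serverless_select` below is hypothetical and only builds the T-SQL text; the database and table names are placeholders taken from elsewhere in this post.

```python
def serverless_select(database: str, table: str, top: int = 10) -> str:
    """Build a T-SQL query for a Spark table exposed in serverless SQL pool.

    Assumption: the table is synchronized into the `dbo` schema of a database
    with the same name as the Spark database.
    """
    if top <= 0:
        raise ValueError("top must be a positive integer")
    return "SELECT TOP %d * FROM [%s].[dbo].[%s];" % (top, database, table)


print(serverless_select("SeverlessDB", "myparquettable"))
```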



If you check the path where the external table was created, you will see it in the data lake as follows. For example, my workspace name is synapseworkspace12:




3) I can also create a managed table in Parquet format from the same dataset that I used for the external one, as follows:


#Managed - table
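A minimal sketch of how a managed Parquet table can be written from the same DataFrame. The helper `save_managed_table` and the table name in the usage comment are hypothetical. The only difference from an external table is that no `path` is supplied, so Spark stores the files in its warehouse directory.

```python
def save_managed_table(df, table_name: str) -> None:
    """Write `df` as a managed Parquet table.

    No path is given, so Spark owns the data location (the warehouse
    directory) and removes the files when the table is dropped.
    """
    df.write.mode("overwrite").format("parquet").saveAsTable(table_name)


# Hypothetical usage in a notebook where `df` is already loaded:
# save_managed_table(df, "SeverlessDB.managedtable")
```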

This table is also persisted in the storage account under the same path, but in the managed table's folder.


Following the documentation, this is another way to create a managed table; in this case, however, the table will be empty:


CREATE TABLE SeverlessDB.myparquettable(id int, name string, birthdate date) USING Parquet
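For comparison, the same DDL shape can produce either kind of table: without a `LOCATION` clause the table is managed, and with one it is external. The sketch below only assembles the Spark SQL text; `create_table_sql` is a hypothetical helper, and the table name `myexternalparquettable` and the `abfss://` location in the usage are placeholders.

```python
from typing import Optional


def create_table_sql(name: str, columns: str, location: Optional[str] = None) -> str:
    """Build a Spark SQL CREATE TABLE statement for a Parquet table.

    With `location` set the table is external; without it the table is managed.
    """
    sql = "CREATE TABLE %s(%s) USING Parquet" % (name, columns)
    if location:
        sql += ' LOCATION "%s"' % location
    return sql


cols = "id int, name string, birthdate date"
print(create_table_sql("SeverlessDB.myparquettable", cols))
print(create_table_sql("SeverlessDB.myexternalparquettable", cols,
                       "abfss://<container>@<account>.dfs.core.windows.net/<path>"))
```

In a notebook, either string could be passed to `spark.sql(...)`.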



Per the docs, these are the supported commands for creating managed and external tables in Spark that can then be queried from SQL serverless.


If you want to clean up this lab, run the following in Spark SQL:


-- Drop the database and its tables
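A minimal sketch of the cleanup, assuming the lab database is the `SeverlessDB` used earlier in this post. `drop_database_sql` is a hypothetical helper; `CASCADE` is what removes the tables along with the database.

```python
def drop_database_sql(database: str) -> str:
    """Build the Spark SQL statement that drops a database and all its tables."""
    return "DROP DATABASE IF EXISTS %s CASCADE" % database


# In a Synapse notebook this could be run as:
# spark.sql(drop_database_sql("SeverlessDB"))
print(drop_database_sql("SeverlessDB"))
```

Keep in mind that dropping a managed table also deletes its files, while dropping an external table only removes the metadata and leaves the underlying data in place.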



That is it!



UK Engineer
