
Azure Data Factory (ADF) has connectors for the Parquet, Avro, and ORC data lake file formats. However, datasets used by Copy Activity do not currently support the complex data types (such as arrays, maps, and structs) that these formats can contain. Here is how to read and write those complex columns in ADF by using data flows.


 


This technique is described on the documentation page for each file format in the ADF online docs:

https://docs.microsoft.com/en-us/azure/data-factory/format-orc#dataset-properties

https://docs.microsoft.com/en-us/azure/data-factory/format-parquet#data-type-support

https://docs.microsoft.com/en-us/azure/data-factory/format-avro#data-flows


 


Step 1: Make a new dataset and choose the file format type. In this example, I am using Parquet. Set NONE for schema:


complex1.png
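
For readers who manage datasets as JSON rather than through the UI, here is a minimal Python sketch of what a Parquet dataset definition without a schema might look like. The dataset name, linked service name, container, and folder path are hypothetical placeholders. The sketch also assumes that choosing NONE for the schema corresponds to an empty schema array, so compare the output with the JSON that ADF generates for your own dataset before relying on it.

```python
import json


def build_parquet_dataset(name, linked_service, container, folder_path):
    """Return a dict shaped like an ADF Parquet dataset with no fixed schema.

    All argument values are supplied by the caller; nothing here talks to Azure.
    """
    return {
        "name": name,
        "properties": {
            "type": "Parquet",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            # Assumption: an empty list mirrors choosing NONE for the schema,
            # leaving the data flow to import the projection later.
            "schema": [],
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": container,
                    "folderPath": folder_path,
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    dataset = build_parquet_dataset(
        name="ComplexParquet",
        linked_service="MyBlobStorage",
        container="data",
        folder_path="complex",
    )
    print(json.dumps(dataset, indent=2))
```

Leaving the schema empty in the dataset keeps the column definitions out of the dataset itself, so the data flow source can pick up the complex columns when the projection is imported in a later step.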


Step 2: Make a data flow with this new dataset as the source:


complex2.png


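Step 3: In the source transformation, go to Projection -> Import Projection. This needs an active debug session and reads the column definitions, including the complex types, from the file: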


complex4.png


Step 4: You’ll see your data under Data Preview


complex3.png
