September 2021 - Page 5 of 18 - Dr. Ware Technology Services

Optimize ROI by migrating to Azure Database for MySQL

by Contributed | Sep 23, 2021 | Technology

This article is contributed. See the original author and article here.

Microsoft commissioned the Enterprise Strategy Group (ESG) to conduct an economic value study to examine the potential cost savings and business benefits that enterprises can achieve from migrating on-premises workloads to Azure Database for MySQL. ESG is an independent strategy firm that provides market intelligence and actionable insights to the global IT community. Aviv Kaufmann, Senior Economic Validation Analyst and author of the report, completed a quantitative economic analysis of Azure Database for MySQL and determined that customers reported savings and benefits in the three categories: 1) elimination of costs, 2) operational savings, and 3) improved business agility.

Note: The primary source for this blog is ESG’s report, The Economic Value of Migrating and Modernizing On-premises Instances to Azure (September 2, 2021).

Streamline application development on Azure

ESG conducted a survey to evaluate businesses cloud infrastructure services spending trends and found that as of January 2021, 68% of businesses planned to spend more on cloud services as we continue into the second straight year of a pandemic remote – work environment. More specifically, ESG’s survey found that 85% of cloud users have accelerated efforts to move applications to the cloud based on recent events over the last six months. Two of the top motivating factors for organizations to digitally transform their businesses are to become more operationally efficient and to embrace the ever-evolving technology trends that are allowing users to collaborate in new ways. (ESG Research Report: 2021 Technology Spending Intentions Survey, January 19, 2021)

Managed services are one of the many options that businesses are choosing to become operationally efficient. Migrating and modernizing to the cloud is a cost-effective way to streamline application development and improve business agility. MySQL is one of the most popular open source relational databases that organizations choose to power their web applications. ESG’s analysis noted that:

“While many developers have built the expertise required to deploy, manage, and maintain MySQL databases to power their applications, the burden of managing, updating, tuning, protecting, and ensuring high availability for MySQL instances across multiple applications takes valuable cycles away from development of core applications.”

ESG’s economic validation study focuses on the quantitative and qualitative benefits that organizations can expect when migrating on-premises MySQL instances to the fully managed Azure Database for MySQL service.

Azure Database for MySQL eliminates cost, provides operational savings, and improves business agility

ESG interviewed three Azure Database for MySQL customers and found that they were able to significantly eliminate costs related to hardware, infrastructure, and hypervisors, and reduce their MySQL commercial subscriptions. These customers reported that the biggest benefits they’ve seen are from the operational savings achieved by DevOps teams leveraging Azure Database for MySQL. One customer noted that, “Now that we migrated to Microsoft Azure, I don’t need to worry about all the IT operational aspects, like backups and updates, and that’s a huge benefit for my team”.

ESG found that customers were excited to report the different ways that Azure Database for MySQL helped them transform their businesses by allowing them to achieve greater business agility in everyday practice. Some of the ways that DevOps and operations teams have been able to work more efficiently include providing faster time to value, accelerating migration and modernization, improving application performance, increasing flexibility and business agility, integrating Azure Services more quickly, and increasing development velocity.

ESG’s findings

As part of their analysis, ESG leveraged customer interviews, product pricing models, and public and industry knowledge of economics and technologies to create a three-year TCO/ROI model that compares the costs and benefits of migrating on-premises MySQL instances to Azure Database for MySQL. ESG’s model found that Azure Database for MySQL provided up to a 48% lower total cost over a three-year period, including a 93% reduction in the operational cost to deploy, manage, and maintain MySQL instances in an on-premises infrastructure. ESG reports that Azure Database for MySQL offloads a bulk of the work currently performed by developers, which further reduces the cost to administer, tune, secure, protect, and maintain the MySQL databases by up to 86%. This savings equates roughly to the productivity and development capacity of four full-time developers, creating an additional $15.5 million in revenue for the business.

ESG finds that as organizations continue to migrate and modernize to the cloud, they can expect to streamline operations while reducing operational costs and improving business agility with Azure Database for MySQL. Businesses are looking to Microsoft Azure to help digitally transform their applications with more efficient microservice architectures that can be dynamically hosted, scaled, protected, secured, and delivered by the many services provided by the Azure ecosystem. ESG recommends the fully managed Azure Database for MySQL service as a cost-effective solution for successfully increasing business agility, optimizing performance, and maximizing the productivity of developers and IT resources.

For additional detail, refer to ESG’s full report, The Economic Value of Migrating and Modernizing On-premises Instances to Azure Database for MySQL.

If you’re ready to learn more, explore these resources:

Learn more about Azure Database for MySQL.

Read the documentation.

Watch the on-demand webinar (available 10/1/21).

Cisco Releases Security Updates for Multiple Products

by Scott Muniz | Sep 23, 2021 | Security, Technology

This article is contributed. See the original author and article here.

Cisco has released security updates to address vulnerabilities in multiple Cisco products. An attacker could exploit some of these vulnerabilities to take control of an affected system.

CISA encourages users and administrators to review the Cisco Security Advisories page and apply the necessary updates.

CISA Releases Guidance: IPv6 Considerations for TIC 3.0

by Scott Muniz | Sep 23, 2021 | Security, Technology

This article is contributed. See the original author and article here.

The federal government has prioritized the transition of federal networks to Internet Protocol version 6 (IPv6) since the release of Office of Management and Budget (OMB) Memorandum 05-22 in 2005. In 2020, OMB renewed its focus on IPv6 through the publication of OMB Memorandum 21-07. That memorandum specifically entrusts CISA with enhancing the Trusted Internet Connections (TIC) program to fully support the implementation of IPv6 in federal IT systems.

In accordance with this OMB mandate, CISA has issued IPv6 Considerations for TIC 3.0 to provide federal agencies with guidance to help them use IPv6 to secure their networks by:

Providing IPv6 protocol information to enable a general understanding,
Informing agencies of their responsibilities concerning OMB M-21-07,
Aligning TIC 3.0 security objectives and security capabilities with IPv6, and
Offering awareness and guidance regarding IPv6 security considerations.

CISA encourages IT decision-makers and administrators in all federal government agencies and organizations to review IPv6 Considerations for TIC 3.0 to facilitate advancing IPv6 networks and ensuring future growth and innovation in internet services and technology.

CISA, FBI, and NSA Release Joint Cybersecurity Advisory on Conti Ransomware 

by Scott Muniz | Sep 22, 2021 | Security, Technology

This article is contributed. See the original author and article here.

CISA, the Federal Bureau of Investigation (FBI), and the National Security Agency (NSA) have released a joint Cybersecurity Advisory (CSA) alerting organizations of increased Conti ransomware attacks. Malicious cyber actors use Conti ransomware to steal sensitive files from domestic and international organizations, encrypt the targeted organizations’ servers and workstations, and demand a ransom payment from the victims.

CISA, FBI, and NSA encourage network defenders to examine their current cybersecurity posture and apply the recommended mitigations in the joint CSA, which include:

Updating your operating system and software,
Requiring multi-factor authentication, and
Implementing network segmentation.

Additionally, review the U.S. government resource StopRansomware.gov for more guidance on ransomware protection, detection, and response.

Azure Synapse (Pipelines) for Social Media – YouTube example

by Contributed | Sep 22, 2021 | Technology

This article is contributed. See the original author and article here.

Introduction

This article will show you how to use Azure Synapse (also works with Azure Data Factory) to build a pipeline that gets public data from YouTube using REST API’s and store it in parquet files in ADLS Gen2 and Azure Synapse Analytics. At the end, you should be able to connect to other social media platforms that support REST API’s.

In addition, we will have a look at different ways to analyze the data in Azure Synapse Analytics: using SQL Serverless, SQL Provisioned (SQL Pools) and Spark Clusters.

Considerations
The YouTube REST API calls used in this exercise use API Keys as credentials. Using OAuth 2.0 is not possible because we’re using an automated pipeline that runs without a UI, required for user interaction and sign-in. Managing these credentials is not covered here – please refer to each social media platform API official documentation.
The pipeline stores data in both ADLS Gen2 parquet files and Azure Synapse Analytics SQL Pool tables – this may or may not be a good implementation option. This decision is taken only to be able to demonstrate the two possibilities, between many – obviously, you should consider your own requirements.

What’s needed
Azure Key Vault – to store the API Key; optional but strongly advised.
Azure Synapse Analytics workspace – to build the pipeline (optionally Azure Data Factory can be used)
SQL Serverless – no need to be provision.
SQL Pool – as a provisioned SQL engine; one table and one procedure will be created here.
Spark Cluster – as a provisioned Spark engine to run spark notebooks.
YouTube API Key – to access the YouTube Rest API

1. Preparation

Azure key vault secret
Create a new Azure key vault, or use an existing one, and create a new secret using:
Name: YouTubeMicrosoftDemosProjectAPIKey
Value: <YOUR-YOUTUBE-API-KEY>

Screenshots of getting the API Key value and creating the new secret:

Note: you need to grant the Azure Synapse Analytics workspace Managed Identity access to your Azure Key vault. Learn more here.

In this example, only “Get” and “List” are selected on the “Secret Permissions” list.
Next, open the current version of the secret and store the “Secret Identifier” as it will be used later:

Linked service to the Storage Account
Because we will be using the “Primary” Azure Data Lake Storage Gen2 storage account for the Azure Synapse Analytics workspace, specified when creating the workspace, the linked service is already created, and its name should be in this format: “<WORKSPACE-NAME>-WorkspaceDefaultStorage”.
Note: You need to create a new linked service if you want to use a storage account other than the “Primary” or if you’re using Azure Data Factory.

Linked service to the SQL Pool
Again, there’s no need to create a linked service to a SQL Pool in the same workspace because we will be using an activity that is unique to Azure Synapse Integrate pipelines: the “SQL pool stored procedure”.
Note: If you are using Azure Data Factory, then you need to create a linked service to the SQL Pool where you want to store your data. Refer to the “ADF linked service to Azure Synapse Analytics” section in the Azure Synapse SQL Pools Auto DR article.

Create YouTube video statistics table and related stored procedure
Create a new table in the SQL Pool you want to store the data. It will be used to store some fields from the REST API responses and well as the entire payload.

CREATE TABLE [dbo].[youtube_video_likes] (
  ITEM_ID            VARCHAR(100) NOT NULL,
  VIDEO_ID           VARCHAR(100) NOT NULL,
  LIKES              INT          NOT NULL,
  FULL_JSON_RESPONSE VARCHAR(MAX) NOT NULL,
  TS_INSERT          DATETIME     NOT NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP)
GO

In addition, create a new stored procedure. This will be called to insert the data in the table.

CREATE PROC [dbo].[p_youtube_video_likes_insert] @p_item_id [VARCHAR](100), @p_video_id [varchar](100), @p_likes [INT], @p_full_json_response [VARCHAR](MAX) AS
BEGIN
    INSERT INTO [dbo].[youtube_video_likes](item_id, video_id, likes, full_json_response, ts_insert)
    SELECT @p_item_id, @p_video_id, @p_likes, @p_full_json_response, SYSDATETIME();
END
GO

2. Create Datasets

We will use the Copy activity to store the REST API responses in parquet files, so we need to create two datasets for the source and sink.

Dataset for source
Create a file and upload it into your storage account. This file is a supporting file as the data we want to store is coming from the REST API calls and not from another file. The content is the simpler JSON snippet we can have, an empty JSON.

Filename	oneLinerEmpty.JSON
Content	{}

Then, create a new dataset for this file:

Data store	Azure Data Lake Storage Gen2
Format	JSON
Name	oneLinerEmptyJSON
Linked service	the linked service for your “Primary” storage account or another one
File path	use the browse button to locate and select your uploaded JSON file. In this example, “filesystem001” is the container name, “socialmedia/supportingfiles” is the folder, and “oneLinerEmpty.JSON” is the file name

Dataset for sink
Create a new dataset that will be used to store the data in parquet format. This dataset will be parameterized because we will use it to create different files. Follow the same steps as above but select the Parquet format.

Name	parameterOutputParquet
Linked service	the linked service for your “Primary” storage account or another one
File path	leave empty as this will be parameterized
Import schema	None

Now it’s time to configure this dataset. In the Parameters tab, create 2 parameters as in this image:

P_FILE_PATH	String, no default value
P_FILE_NAME	String, no default value

In the Connection tab, we will use these parameters:

“filesystem001” is the name of the container: Replace accordingly.
Now we are ready to start developing the pipeline.

3. Pipelines

Create an auxiliary pipeline
The first pipeline is the one that uses the 2 datasets we created above and will only have a Copy activity. The objective of this pipeline is to receive a parameter and write that value in a parquet file. It will be called several times from the main pipeline, to store all the REST API responses.

Create a new pipeline, name it “Store Parameter in Parquet File” and create the following parameters, all with type String and no default value:

P_FILE_PATH

P_FILE_NAME

P_DATA

Add a Copy activity, name it “Store parameter P_DATA in Parquet” and configure the Source, Sink and Mapping tabs as shown in the following images.

Source dataset	the JSON file create in the preparation step
File path type	File path in dataset
Recursively	unchecked
Additional columns	create a new column “VALUE_COLUMN” and use the “Add dynamic content” of the VALUE field to add a reference to the parameter P_DATA (@pipeline().parameters.P_DATA). The goal of this additional column is to create a new column with the content of the received parameter, as if it were read from the input file. This is a way to inject whatever values we want in columns of the Copy activity.

Sink dataset	parameterOutputParquet
P_FILE_PATH	@pipeline().parameters.P_FILE_PATH
P_FILE_NAME	@pipeline().parameters.P_FILE_NAME

Map complex values to string

checked

Click on the “+ New mapping” to add a new mapping row. Rename the default column name “Column_1” to “VALUE_COLUMN” and map it to the same name in the right side of the mapping.

This pipeline is finished and should look like this:

Create the main pipeline

The main pipeline has these steps:

Get YouTube API Key from the secret in Azure Key Vault

Get the playlist ID for a given channel ID

Get the list of videos for a given playlist ID

For each video, get video statistics

Store results in the SQL Pool

In addition, the responses of all 3 YouTube REST API calls will also be stored in parquet files. This will allow us later to look at the data using different analytical/processing engines.

The complete pipeline and parameters will look like this:

Start by creating a new pipeline, name it “YouTube REST API example with API Key” or whatever you want and add the following parameters and default values:

P_SOCIAL_MEDIA_PLATFORM	YouTube
P_YOUTUBE_CHANNEL_ID	<YOUR-YOUTUBE-API-KEY>
P_FILE_OUTPUT_PATH	socialmedia/runs/

Add 3 Web activities, link them as shown in the previous picture and configure with the following properties. Only the non-default and important to mention properties are listed.

Web activity 1

Name	Get YouTube API Key from Key Vault
Secure output	checked
URL	<YOUR-YOUTUBE-API-KEY-SECRET-URL> e.g. https://mykeyvault.vault.azure.net/secrets/myKeyName/999cf304c27db?api-version=7.0
Method	GET
Authentication	Managed Identity
Resource	https://vault.azure.net

The secure output option will hide the results from the API call, in this case it will hide the value we’re getting from the secret, as per the best practices. When we try to look at the result, it looks like this:

To access the value of the secret, we can now use this expression: activity(‘Get YouTube API Key from Key Vault’).output.value

Note: any developer can turn off the secure output option and see the value of the secret. This can happen in a development environment, where the sensitivity is not the same as in a production environment. Make sure that no sensitive information can be seen in a production environment, by including revision rules that validate that no pipelines can be promoted if proper settings for security are not met.

Web activity 2

Name	Get playlistID
Secure input	checked
URL	@concat( ‘https://youtube.googleapis.com/youtube/v3/channels?part=contentDetails&id=‘, pipeline().parameters.P_YOUTUBE_CHANNEL_ID, ‘&key=’, activity(‘Get YouTube API Key from Key Vault’).output.value )
Method	GET

The secure input option will hide the values sent to this activity, in this case it will hide the API Key used in the URL. If we try to see the runtime input, all we’ll see is this:

We now have the playlist ID for a channel, so we can request a list of videos for that playlist.

Web activity 3

Name	Get Playlist Videos
Secure input	checked
URL	@concat( ‘https://youtube.googleapis.com/youtube/v3/playlistItems?part=snippet,contentDetails,status&maxResults=50&playlistId=‘, activity(‘Get playlistID’).output.items[0].contentDetails.relatedPlaylists.uploads, ‘&key=’, activity(‘Get YouTube API Key from Key Vault’).output.value )
Method	GET

The secure input option is also checked. Notice what we would see if this was not the case:

We want to store the responses of the 2 YouTube REST API calls in parquet format. We already have a pipeline to do that so we will now invoke it.

Add 2 Execute Pipeline activities to the canvas, link them as show in the complete pipeline design and configure as follows.

Execute Pipeline 1

Name	Store playlistID result in Parquet
Invoked pipeline	Store Parameter in Parquet File, or any other name you gave to the auxiliary pipeline
Wait on completion	checked
P_FILE_PATH	@concat( pipeline().parameters.P_FILE_OUTPUT_PATH, pipeline().parameters.P_SOCIAL_MEDIA_PLATFORM )
P_FILE_NAME	@concat( ‘getPlaylistID-‘, utcnow(), ‘.parquet’ )
P_DATA	@string(activity(‘Get playlistID’).output)

This activity will store the REST API response in a parquet file in the storage account. The file name will be similar to “getPlaylistID-2021-04-23T13:13:19.4426957Z.parquet”.

Execute Pipeline 2

Name	Store Playlist Videos result in Parquet
Invoked pipeline	Store Parameter in Parquet File, or any other name you gave to the auxiliary pipeline
Wait on completion	checked
P_FILE_PATH	@concat( pipeline().parameters.P_FILE_OUTPUT_PATH, pipeline().parameters.P_SOCIAL_MEDIA_PLATFORM )
P_FILE_NAME	@concat( ‘getPlaylistVideos-‘, utcnow(), ‘.parquet’ )
P_DATA	@string(activity(‘Get Playlist Videos’).output)

This activity will store the REST API response in a parquet file in the storage account. The file name will be similar to “getPlaylistVideos-2021-04-23T13:13:22.3769063Z.parquet”.

Now that we have a list of videos, we need to get the statistics for each one of them. To accomplish that, we add a ForEach activity to the pipeline, connect it to the output of the “Get Playlist Videos” activity and configure as:

Name	For Each Video
Sequential	checked, but can also run in parallel
Items	@activity(‘Get Playlist Videos’).output.items

Next, we add the remaining activities inside the loop, that will run for each video in the playlist. At the end it will look like this:

Web activity

Name	Get Video Stats
Secure input	checked
URL	@concat( ‘https://youtube.googleapis.com/youtube/v3/videos?part=snippet,contentDetails,statistics&id=‘, item().snippet.resourceId.videoId, ‘&key=’, activity(‘Get YouTube API Key from Key Vault’).output.value )
Method	GET

This activity will request some statistics for a given video. Statistics include the number of views, likes, comments, and so on.

Execute pipeline activity

Name	Store Video Stats in Parquet
Invoked pipeline	Store Parameter in Parquet File, or any other name you gave to the auxiliary pipeline
Wait on completion	checked
P_FILE_PATH	@concat( pipeline().parameters.P_FILE_OUTPUT_PATH, pipeline().parameters.P_SOCIAL_MEDIA_PLATFORM )
P_FILE_NAME	@concat( ‘getVideoStats-‘, utcnow(), ‘.parquet’ )
P_DATA	@string(activity(‘Get Video Stats’).output)

This activity will store the REST API response in a parquet file in the storage account. The file name will be similar to “getVideoStats-2021-04-23T13:13:25.6924926Z.parquet”.

SQL Pool Stored Procedure

Name	SQL pool SP – Insert data
Azure Synapse dedicated SQL pool	select the SQL pool where you created the table and stored procedure
Stored procedure name	[dbo].[p_youtube_video_likes_insert], or whatever you called it

Stored procedure parameters (use the import button to list them):

p_item_id	@item().id
p_video_id	@item().snippet.resourceId.videoId
p_likes	@activity(‘Get Video Stats’).output.items[0].statistics.likeCount
p_full_json_response	@string(activity(‘Get Video Stats’).output)

The call to the stored procedure will make sure that some individual fields (item id, video id, count of likes) as well as the full response are stored in a table in a SQL pool. In a real scenario, we would need to create a model to hold the data, as per the requirements.

Note: if you use Azure Data Factory, the same goal can be achieved using the “Stored procedure” activity, but you would need to use a linked service to the target SQL pool.

We can now run the full pipeline and validate that several files were created in the storage account and some data inserted in the SQL pool table.

4. Look at the data

Because we stored our data in parquet files and tables in a SQL pool, we have several options to use to manipulate the data.
This table summarizes some of the options using Azure Synapse Analytics:

	Data format and store
Compute options	Parquet files in ADLS Gen2	Tables in SQL Pools
SQL Serverless	Y (*)	N
SQL Provisioned	Y	Y (*)
Spark Cluster	Y (*)	Y

Below we will look at short examples of the options marked with (*). The goal is not to go deep into technical details of JSON handling but rather have an idea of the possibilities to use.

a) SQL Serverless on Parquet files in ADLS Gen2

We can use the OPENROWSET function to read a parquet file and show its contents.

-- check file as-is
SELECT * FROM OPENROWSET(
    BULK 'https://mydatalake.dfs.core.windows.net/filesystem001/socialmedia/runs/YouTube/getVideoStats-2021-04-23T12:37:12.2853807Z.parquet',
    FORMAT='PARQUET'
) AS [result]

Since we have a JSON file, we can take advantage of the OPENJSON function to parse the elements in the first level of the file.

-- OPENJSON in action
SELECT [key], [value], [type]
FROM OPENROWSET(
    BULK 'https://mydatalake.dfs.core.windows.net/filesystem001/socialmedia/runs/YouTube/getVideoStats-2021-04-23T12:37:12.2853807Z.parquet',
    FORMAT='PARQUET'
) as [result2]
CROSS APPLY OPENJSON(VALUE_COLUMN)

You can read more about OPENROWSET and OPENJSON here:

How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics

Query JSON files using OPENJSON

b) SQL Provisioned on Tables in SQL Pools

OPENJSON is also available on the SQL Provisioned engine.

-- check data as-is
SELECT *, isjson(full_json_response) as 'IS_JSON'
FROM [dbo].[youtube_video_likes]
GO

SELECT *
FROM [dbo].[youtube_video_likes]
cross apply openjson(full_json_response)
where video_id = 'A212x5XXXXX'
GO

You can read more about OPENJSON for SQL Provisioned and JSON functions here:

OPENJSON (Transact-SQL)

JSON Functions (Transact-SQL)

c) Spark Cluster on Parquet files in ADLS Gen2

We can also use a Spark Cluster as a Spark processing engine to look at the data.

For this task, we create a new notebook. By default, the selected language is Python (Spark) and we can leave it like that, although we can use different languages by using Cell Magic Commands like %%pyspark or %%sql. We also need to attach the notebook to an existing Apache Spark Pool.

First, we load the data into a dataframe:

%%pyspark
df = spark.read.load('abfss://filesystem001@mydatalake.dfs.core.windows.net/socialmedia/runs/YouTube/getVideoStats-2021-04-23T12:37:12.2853807Z.parquet', format='parquet')
display(df)

Since we have the data in a dataframe, we can use pyspark to manipulate the data:

%%pyspark
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, ArrayType

statistics_schema = StructType([
    StructField('likeCount', StringType(), True)
])

items_schema = StructType([
    StructField('id',      StringType(), True),
    StructField('statistics', statistics_schema, True)
])

schema = StructType([
    StructField('kind',  StringType(),            True),
    StructField('etag',  StringType(),            True),
    StructField('items', ArrayType(items_schema), True)

])

display(
    df.withColumn('kind',     from_json(col('VALUE_COLUMN'), schema).getItem('kind'))
      .withColumn('etag',     from_json(col('VALUE_COLUMN'), schema).getItem('etag'))
      .withColumn('video_id', from_json(col('VALUE_COLUMN'), schema).getItem('items')[0].getItem('id'))
      .withColumn('likes',    from_json(col('VALUE_COLUMN'), schema).getItem('items')[0].getItem('statistics').getItem('likeCount'))
      .select('kind', 'etag', 'video_id', 'likes')
)

If SQL is the preferred way, we can create a table from the dataframe and use the get_json_object function to parse the json strings:

%%pyspark
df.write.mode("overwrite").saveAsTable("default.youtube_video_likes")

%%sql
SELECT get_json_object(VALUE_COLUMN, '$.kind') as `kind`,
       get_json_object(VALUE_COLUMN, '$.etag') as `etag`,
       get_json_object(VALUE_COLUMN, '$.items[0].id') as `videoId`,
       get_json_object(VALUE_COLUMN, '$.items[0].statistics.likeCount') as `likes`
FROM default.youtube_video_likes

Summary
It is easy to build an Azure Synapse Pipeline that gets information from social media.
In the example shown in this article, on how to get some data from YouTube, we covered the ingestion, storing and consumption using different techniques, and more can be used. There is not “the correct” way to do it but rather a way that makes sense in your environment.

You can find all the code in the attachment file below, including the pipelines and datasets, SQL scripts, notebooks and supporting files, as well as on GitHub under the repository name SynapseIntegrateSocialMedia.

« Older Entries

Next Entries »

Optimize ROI by migrating to Azure Database for MySQL

Streamline application development on Azure

Azure Database for MySQL eliminates cost, provides operational savings, and improves business agility

ESG’s findings

Cisco Releases Security Updates for Multiple Products

CISA Releases Guidance: IPv6 Considerations for TIC 3.0

CISA, FBI, and NSA Release Joint Cybersecurity Advisory on Conti Ransomware

Azure Synapse (Pipelines) for Social Media – YouTube example

Recent Posts

Recent Comments

Archives

Categories

Meta

We look forward to meeting you

Optimize ROI by migrating to Azure Database for MySQL

Streamline application development on Azure

Azure Database for MySQL eliminates cost, provides operational savings, and improves business agility

ESG’s findings

Cisco Releases Security Updates for Multiple Products

CISA Releases Guidance: IPv6 Considerations for TIC 3.0

CISA, FBI, and NSA Release Joint Cybersecurity Advisory on Conti Ransomware

Azure Synapse (Pipelines) for Social Media – YouTube example

Recent Posts

Recent Comments

Archives

Categories

Meta

We look forward to meeting you

CISA, FBI, and NSA Release Joint Cybersecurity Advisory on Conti Ransomware