DevOps for Data Science – Part 2 – Defining DevOps

This article is contributed. See the original author and article here.

I’m wading into treacherous waters here in my series on DevOps for Data Science. Computing terms often defy explanation, especially newer ones. While “DevOps” (Development and Operations) has been around for a while, it’s still not as mature a term as, say, “Relational Database Management System (RDBMS)”. That term is well known, understood, and accepted. (It wasn’t when it came out.) Whatever definition I give you for DevOps will be contested, and I’m OK with that. Nothing brings out a good flame-war like defining a new technical term.


Regardless of the danger, we have to define the terms we’re using. Andrew Shafer and Patrick Debois used the term first, from what I can tell, in 2008 at a conference on Agile (Agile being a newer term as well). In their talk they posited breaking down the barriers between developers, operations, and other departments. Since then, the term DevOps has come to mean much more.


Think about getting software in a user’s hands (or another system’s…er, hands). Working sequentially, the process looks something like this:


Design -> Environment Setup -> Code -> Build -> Test -> Package -> Release -> Monitor


With a few exceptions, that’s how software is done. Data Science is usually somewhere in there during the Code phase.


In most cases, there are clearly defined boundaries for what gets done by whom. For instance, developers write the code after the business sends over requirements. The deployment team handles packaging and releasing. And the operations team (Ops) handles monitoring and updating. Maybe it’s a little different in your organization, but in general, each team has an area they are responsible for. And that’s mostly all they focus on.


We’re all busy. I barely have enough time in my day to write code and the commensurate documentation, much less think about other parts of the process. But we have to think about the phases of software development that follow our own. 


Imagine if Equifax, as the business owners were requesting the software to be written, had said “And remember, we need to build right into the software things that require the right security to be in place. And let’s make sure we have a plan for when things go wrong.” Imagine if the developers had included a patch-check for the frameworks they use to ensure everything was up to date. Imagine if the Ops team cared that proper security testing is done way back in the development stage. Your identity might still be safe. 


And that’s my definition of DevOps: at its simplest, DevOps means getting everyone involved in deploying and maintaining an application to think about all the phases that precede and follow their part of the solution. That means the developer needs to care about monitoring. Business owners need to care about security. Deployment teams need to care about testing. And everyone needs to talk, build the process into their tools, and follow processes that span all phases of the release and maintenance of software solutions.


That also means DevOps isn’t a tool or even a team – it’s a thought process. Sure, there are tools and teams that help implement it, but if only a few people are part of DevOps, then you don’t have DevOps.


In this series, I’ll cover more about the intersection of DevOps and Data Science, and in particular, the things you need to be careful about in implementing DevOps for Data Science. Use the references below to inform yourself, as a Data Scientist, what DevOps is. I’ll show you how to integrate it into your projects as we go.


For Data Science, I find this progression works best, taking these one step at a time and building on the previous step. The entire series is listed here, and I’ll be working on these articles throughout the series:

  1. Infrastructure as Code (IaC)

  2. Continuous Integration (CI) and Automated Testing

  3. Continuous Delivery (CD)

  4. Release Management (RM)

  5. Application Performance Monitoring

  6. Load Testing and Auto-Scale

In the articles that follow in this series, I’ll help you implement each of these in turn.

(If you’d like to implement DevOps, Microsoft has a site to assist. You can even get a free offering for Open-Source and other projects.)

Enabling Bot Telemetry for Application Insights

Hello bot developers,


I couldn’t wait to write another blog post after my post on bots, “How bots work“. Today’s subject is an important one that we should always aim to use if we want to understand the insights of our bots.


You may already know how to connect your bot to Application Insights. This helps “Azure Bot Services” produce analytics for your bot. To do it, you go to your Application Insights resource, get the keys from there, and copy them over to your bot resource’s analytics settings. That simple. What if you want to go beyond that and have your bot application produce telemetry too? Hold tight: with version 4.2 of the Bot Framework SDK, we now have “TelemetryLoggerMiddleware” built into the “Bot.Builder” namespace.


This middleware simply uses the “Microsoft.ApplicationInsights.TelemetryClient” libraries to add telemetry to the Application Insights project that you have configured in your “appsettings.json” file. See here how to wire this middleware up to your bot. You will also notice a switch on “TelemetryLoggerMiddleware” to enable/disable activity logging, called “logActivityTelemetry”.
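Conceptually, such a middleware is just a wrapper around the bot's turn handler that tracks an event for each activity passing through, gated by the logActivityTelemetry switch. Here is a language-neutral sketch in Python; the function and variable names are illustrative only, not the actual SDK API:

```python
events = []  # stand-in for the Application Insights telemetry client

def telemetry_logger_middleware(handler, log_activity_telemetry=True):
    """Wrap a turn handler so every received activity is tracked as an
    event before the bot's own logic runs. The flag mirrors the
    logActivityTelemetry switch; names here are illustrative."""
    def wrapped(activity):
        if log_activity_telemetry:
            events.append(("BotMessageReceived", activity["type"]))
        return handler(activity)
    return wrapped

def echo_handler(activity):
    # The bot's turn: build a reply for the incoming activity.
    return {"type": "message", "text": "You said: " + activity["text"]}

bot = telemetry_logger_middleware(echo_handler)
reply = bot({"type": "message", "text": "hi"})
print(reply["text"])  # You said: hi
print(events)         # [('BotMessageReceived', 'message')]
```

Passing log_activity_telemetry=False would skip the tracking call while leaving the bot's behavior unchanged, which is exactly the point of keeping telemetry in middleware rather than in the handler itself.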


Well, it seems easy to use. Let’s check a sample stack below to see how this middleware calls up other libraries. Below is a sample stack trace captured when we receive an activity, showing how it calls into the “Microsoft.ApplicationInsights.TelemetryClient” classes. I am using this stack for one important reason. First, let’s check the stack:





Microsoft.Bot.Builder.ApplicationInsights.dll!Microsoft.Bot.Builder.ApplicationInsights.BotTelemetryClient.TrackEvent(string eventName, System.Collections.Generic.IDictionary<string, string> properties, System.Collections.Generic.IDictionary<string, double> metrics)
So let’s come to that important reason:

The top function on the above call stack, “Microsoft.Bot.Builder.Integration.ApplicationInsights.Core.TelemetryBotIdInitializer.Initialize”, has a very important task. It initializes your telemetry fields, especially “User” and “Session”, which are quite important when you analyze your Application Insights data. Note that these values are calculated as below, at least for now:


sessionId = StringUtils.Hash(conversationId);

channelId = (string)body["channelId"];

userId = (string)from["id"];


telemetry.Context.User.Id = channelId + userId; // a combination of the "channelId" and the "from" id of the "Activity" object

telemetry.Context.Session.Id = sessionId; // a hash of the "conversationId" of the "Activity" object
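To make the derivation concrete, here is the same logic sketched in Python. This is an illustration, not the SDK's code: SHA-256 stands in for the SDK's StringUtils.Hash, and the field names follow the Activity payload shown above.

```python
import hashlib

def derive_telemetry_ids(activity: dict) -> dict:
    """Mimic what TelemetryBotIdInitializer derives from an incoming
    Activity payload (illustrative only, not the SDK's code)."""
    channel_id = activity["channelId"]
    user_id = activity["from"]["id"]
    conversation_id = activity["conversation"]["id"]
    return {
        # User.Id: the channel id concatenated with the "from" id
        "user_id": channel_id + user_id,
        # Session.Id: a hash of the conversation id (SHA-256 stands in
        # for the SDK's StringUtils.Hash)
        "session_id": hashlib.sha256(conversation_id.encode()).hexdigest(),
    }

activity = {
    "channelId": "emulator",
    "from": {"id": "user1"},
    "conversation": {"id": "conv42"},
}
ids = derive_telemetry_ids(activity)
print(ids["user_id"])  # emulatoruser1
```

Note how two users with the same "from" id on different channels still get distinct User.Id values, because the channel id is part of the concatenation.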


What does all this mean?


It means that if you enable the bot telemetry logger and check your telemetry in “Azure Application Insights”, you will see actual users and actual sessions, where a user represents a bot user on a channel and a session represents a conversation.


In the picture below, we are getting the contents of the “dependencies” table in the “Azure Portal” –> “Application Insights Project” –> “Logs” blade. A couple of lines at the bottom don’t include any “Session” / “User ID” for the dependency. They were generated when we didn’t use the “Bot Telemetry Logger Middleware”, which is why we cannot associate them with any conversation. But the upper lines are dependencies tracked by the Bot Telemetry Logger middleware, which can now be associated with actual conversations and users. I think this is fantastic!




This way, we can go to the “Azure Portal” –> “Application Insights Project” –> “Sessions” blade to see active sessions, each of which represents a conversation, and drill into these sessions to see what the conversation has done:






Isn’t it great? Or am I overreacting :)

Likewise, in the “Users” blade, you can see the actual users and their timelines:




This is a great change for bot telemetry, achieved with only five lines of code. I hope you were not aware of it earlier, so that you can feel the same excitement.


Joking aside, this is an important change in our approach to telemetry.


Hope you like my blogpost,

Stay tuned for the next one,


How bots work

Hello bot developers,


I have recently decided to write blog posts about the Bot Framework technology to reflect my perspective on the subject, provide ideas on how it works, and troubleshoot common scenarios. The best start is a good definition of the “Bot Framework SDK” and, in general, the “Azure Bot Service” that Microsoft offers.


Apart from the fancy description here, a bot can be defined as an application that uses the “Bot Builder“ and “Bot Connector“ libraries to communicate through the Connector services and channels. Meanwhile, the dependent services used by that bot application, like “LUIS” (Language Understanding Intelligent Service) or the “QnA Service“, make the bot behave like an intelligent entity, since it can understand your intentions and reply to them. All the communication between the human and the bot is packaged into serializable JSON-based objects called “Activities“, and it flows through different channels. The “Bot Builder” libraries create a reply “Activity” for each incoming “Activity” object, considering the state of conversations and dialogs, to provide meaningful responses based on your previous interactions. The most popular type of activity is a “Message”, but activities can also carry meta information, like adding or removing users in a conversation. When an activity reaches the bot, it is put inside a “TurnContext” object, together with the state information of the user and the dialog. The TurnContext is processed by the bot code and is valid until the bot’s turn is completed. You can find the activity processing stack with a nice diagram here if you want to see more about activity processing. Note that these diagrams focus on the bot side of the communication and abstract away the channel/connector side.
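As a rough mental model of the reply mechanism described above, here is a sketch in Python using plain dictionaries, not the actual SDK types: the “from” and “recipient” fields swap roles, and the conversation is preserved so the channel can route the reply.

```python
def make_reply(incoming: dict, text: str) -> dict:
    """Build a reply Activity for an incoming Activity: "from" and
    "recipient" swap roles, and the conversation is preserved so the
    channel can route the reply. Plain dicts, illustrative only."""
    return {
        "type": "message",                 # the most popular activity type
        "text": text,
        "from": incoming["recipient"],     # the bot now speaks...
        "recipient": incoming["from"],     # ...back to the user
        "conversation": incoming["conversation"],
        "replyToId": incoming.get("id"),
    }

incoming = {
    "type": "message",
    "id": "act-1",
    "text": "hello",
    "from": {"id": "user1"},
    "recipient": {"id": "bot1"},
    "conversation": {"id": "conv42"},
}
reply = make_reply(incoming, "You said: hello")
print(reply["recipient"]["id"])  # user1
```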


To get more acquainted with conversation-based bot communication, I recommend starting by reviewing the REST APIs documented here. You can also browse the Bot Builder 4.0 namespace to understand what an Activity object is, how a TurnContext object relates to an Activity, and so on.


That finishes the definition part. If you want to play around with bots, you can start creating them and try to understand how they work. Assuming you have an active Azure subscription, you can deploy your bot to Azure, or alternatively you can use our “Bot Framework Emulator” tool to debug bots locally. For the latter, here is a recipe to start with:


#1 – Clone the Bot Builder samples repo to get all available samples. In my posts, I will mostly use the “dotnet core” based samples, since I am more used to working with that language. But we have other SDKs available, with associated samples for each.


#2 – Once you clone the samples repo, you can go to the easiest sample, “02.echo-bot”, which just echoes back whatever you write to the bot. In the GitHub article, the “To try this sample” section explains how to build and run it.

Inside all the bot samples, you will find a folder called “Bots” which contains the bot classes, derived from the ActivityHandler class. These classes are injected as a dependency in the “Startup.cs” file. Once you find your bot class, you can try putting a breakpoint in the “OnMessageActivityAsync()” method, which represents the bot’s turn.


#3 – After you run the sample, you can go to the “Bot Framework Emulator” and start communicating with your bot. You can visit this article here if you want to understand how to define your bot endpoint and connect to your bot using the emulator. Once you connect, you should see your breakpoint being hit. You can check the contents of the “TurnContext” object with your debugger and try to understand the structure of an activity.




If you check more complex samples, you will see that the bot classes also implement other methods of the activity handler class; e.g., one of my favorites, the “21.corebot-app-insights” sample, also implements the “OnTurnAsync” method of the ActivityHandler class. As you see, my ultimate recommendation for a bot SDK developer in this blog post is: get familiar with the samples, since they represent many different use cases for bot development.


What do you think the next step can be? That’s right: you can now consider deploying your bot code to Azure and start discovering the endless opportunities of cloud technologies :) Here is our standard documentation on deploying your bot to Azure. I think that is all for today. Hope you enjoyed the read.


Stay tuned for the next blog post,

See you soon,


Orphaned transactions and distributed deadlocks

Orphaned transactions and distributed deadlocks happen when a session is established to the database with no currently running requests, but a previous request (one query or more) still holds locks on database objects.


Orphaned transactions can cause a lot of locking and blocking on the database, and usually the cause lies in the application and how its code is written: badly, or in a way meant to preserve the atomicity of transactions (“commit all or roll back all”).


I will give an example here, trying by it to simplify the idea:

I created a very simple and small table with two rows:



CREATE TABLE [dbo].[testtable](
       [id] [int] IDENTITY(1,1) NOT NULL,
       [name] [varchar](100) NULL,
       PRIMARY KEY CLUSTERED ([id] ASC)
)

SET IDENTITY_INSERT testtable ON
INSERT INTO testtable (id, name) VALUES (1, 'row #1'), (2, 'row #2')
SET IDENTITY_INSERT testtable OFF




Also, I created a simple C# desktop application that contains two forms: one to search for IDs and one to delete IDs:








The C# code of the delete button:


private void btnDelete_Click(object sender, EventArgs e)
{
    System.Data.SqlClient.SqlDataAdapter myadapter = new System.Data.SqlClient.SqlDataAdapter();
    SqlConnection myconn = new SqlConnection("Data Source=<server>;Initial Catalog=testdb;Persist Security Info=True;User ID=xx;Password=xx"); // "<server>" stands in for the server name
    myadapter.DeleteCommand = new System.Data.SqlClient.SqlCommand("DELETE FROM [dbo].[testtable] WHERE [id] = '" + txtid.Text + "'", myconn);
    myadapter.DeleteCommand.Connection.Open();                       // open the connection to the database
    myadapter.DeleteCommand.Transaction = myconn.BeginTransaction(); // begin the transaction

    myadapter.DeleteCommand.ExecuteNonQuery(); // the request: the query executes on the database in milliseconds
    DialogResult delbox;
    delbox = MessageBox.Show("are you sure you want to delete?", "delete ID", MessageBoxButtons.OKCancel); // the user can wait before deciding what to choose
    if (delbox == DialogResult.OK)
    {
        myadapter.DeleteCommand.Transaction.Commit();   // commit the transaction; locks are released
    }
    else
    {
        myadapter.DeleteCommand.Transaction.Rollback(); // roll back; locks are released
    }
}





The C# code of the search button:


private void btnsearch_Click(object sender, EventArgs e)
{
    System.Data.SqlClient.SqlDataAdapter myadapter = new System.Data.SqlClient.SqlDataAdapter();
    SqlConnection myconn = new SqlConnection("Data Source=<server>;Initial Catalog=testdb;Persist Security Info=True;User ID=xx;Password=xxx"); // "<server>" stands in for the server name
    myadapter.SelectCommand = new System.Data.SqlClient.SqlCommand("select name FROM [dbo].[testtable] WHERE [id] = '" + txtid.Text + "'", myconn);
    System.Data.DataSet mydataset = new DataSet();
    myadapter.SelectCommand.Connection.Open(); // open the connection

    myadapter.Fill(mydataset); // the request executes on the database; it should take no time for a two-row table

    txtName.Text = mydataset.Tables[0].Rows[0][0].ToString();

    myadapter.SelectCommand.Connection.Close(); // close the connection
}





Now when I run two instances of the application:


Here I want to delete the row with ID 1. The application code shows a message box, as in the snapshot below:



while the message box was still there:


I executed DBCC OPENTRAN, and the oldest active transaction was:

Transaction information for database ‘testdb’.

Oldest active transaction:

    SPID (server process ID): 104

    UID (user ID) : -1

    Name          : user_transaction

    LSN           : (1108:112408:2)

    Start time    : Sep 15 2020  6:44:15:733PM

    SID           : 0x28bb43e4bdc050459a623ea82e054fa2

DBCC execution completed. If DBCC printed error messages, contact your system administrator.


The oldest active transaction on the database (SPID 104) did not appear in the sys.dm_exec_requests DMV result:




Meanwhile, from the other instance of the application, I searched for the same row with ID 1:



While I was trying to search for ID 1, the application threw a timeout exception, as in the snapshot below:



The timeout error was caused by blocking from the orphaned SPID 104:



The issue persisted until the user chose between OK and Cancel; in other words, between committing and rolling back the delete transaction.


And the result of DBCC OPENTRAN became different; there were no open transactions:

No active open transactions.

DBCC execution completed. If DBCC printed error messages, contact your system administrator.


This is a simple example; in other scenarios the issue may grow into a complete blocking chain with a major impact.




How can you solve this issue?


One of the ways is to kill the SPID of the blocking transaction while the issue is occurring:


By running the command :

KILL 104


But be aware that you still do not know what the transaction itself was, what it was doing, or what the impact of rolling it back will be.


When I killed the process, the deletion failed with an error:



The error is : “An existing connection was forcibly closed by the remote host”:

Applications usually use friendly custom error messages, error-page redirection, and try/catch blocks. But in all cases, killing the process will at least waste the user’s effort and the time spent filling in or updating the data.


And killing the SPID is sometimes not the proper solution, especially if the issue occurs frequently.


Please note that in my example the delete query will not fail with a query timeout, because it already executed in milliseconds. Finding it in Query Store is also hard, because it may not appear among the top resource-consuming queries or the queries with high wait time.


After identifying the issue and where the blocking transactions come from, using the READ UNCOMMITTED isolation level or query hints like WITH (NOLOCK) may decrease the impact.

But one of the real solutions is changing the application’s C# code, as below:



private void btnDelete_Click(object sender, EventArgs e)
{
    System.Data.SqlClient.SqlDataAdapter myadapter = new System.Data.SqlClient.SqlDataAdapter();
    SqlConnection myconn = new SqlConnection("Data Source=<server>,3342;Initial Catalog=testdb;Persist Security Info=True;User ID=myuser;Password=xxxx;"); // "<server>" stands in for the server name
    myadapter.DeleteCommand = new System.Data.SqlClient.SqlCommand("DELETE FROM [dbo].[testtable] WHERE [id] = '" + txtid.Text + "'", myconn);
    DialogResult delbox;
    delbox = MessageBox.Show("are you sure you want to delete?", "delete ID", MessageBoxButtons.OKCancel);
    if (delbox == DialogResult.OK)
    {
        myadapter.DeleteCommand.Connection.Open();                       // open the connection only after the user confirms
        myadapter.DeleteCommand.Transaction = myconn.BeginTransaction(); // begin the transaction inside the IF block
        myadapter.DeleteCommand.ExecuteNonQuery();
        myadapter.DeleteCommand.Transaction.Commit();                    // commit immediately; locks are held only briefly
        myadapter.DeleteCommand.Connection.Close();
    }
}





Here the connection is opened and closed, and the transaction starts and ends, inside the IF block, only when the user selects the “Okay” button of the dialog box.
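The same fix can be sketched in Python with sqlite3 (illustrative only; the corrected C# code above is the real solution): ask the user first, then open the connection and transaction, commit, and close, so locks are held for milliseconds instead of for the lifetime of a dialog box.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "testdb.sqlite")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE testtable (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO testtable VALUES (1, 'row #1'), (2, 'row #2')")
conn.commit()
conn.close()

def confirm_delete() -> bool:
    """Stand-in for the MessageBox; the user may take minutes to answer."""
    return True

def delete_row(row_id: int) -> None:
    # Ask the user FIRST; no connection or transaction is open yet.
    if not confirm_delete():
        return
    # Open, delete, commit, close: the lock lives only inside this block.
    db = sqlite3.connect(path)
    try:
        db.execute("DELETE FROM testtable WHERE id = ?", (row_id,))
        db.commit()    # commit immediately; the lock is released here
    except Exception:
        db.rollback()  # roll back on failure; the lock is released too
        raise
    finally:
        db.close()

delete_row(1)
check = sqlite3.connect(path)
print(check.execute("SELECT name FROM testtable").fetchall())  # [('row #2',)]
```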


Mount Blob storage on Linux VM using Managed Identities or Service Principal with Blobfuse


You want to mount an Azure Blob storage container on a Linux VM and access the data using either Managed Identities or a Service Principal.



Prerequisites:

Azure storage account

Linux VM



To mount the Azure Blob storage container as a filesystem on a Linux VM, you can use Blobfuse, which allows you to access the existing data in your storage account through the Linux filesystem.

Mounting the storage account using the storage account key is explained in our article:

Below are the steps to mount the storage account using either a Managed Service Identity or a Service Principal.


Step 1:

Configure the Linux software repository for Microsoft products using the below command:

For Ubuntu:

wget https://packages.microsoft.com/config/ubuntu/<version>/packages-microsoft-prod.deb

sudo dpkg -i packages-microsoft-prod.deb

sudo apt-get update

For RHEL:

sudo rpm -Uvh https://packages.microsoft.com/config/rhel/<version>/packages-microsoft-prod.rpm


Note: Change the URL accordingly based on the Ubuntu version and RHEL Distribution that you’re using.


Step 2:

Install blobfuse on your Linux VM.

For Ubuntu:

sudo apt-get install blobfuse



For RHEL:

sudo yum install blobfuse


Step 3:

Blobfuse requires a temporary path in the file system to buffer and cache any open files.

You can make use of SSD disks available on your VMs for blobfuse. You can also make use of ramdisk and create a directory for blobfuse.

To use SSD as a temporary path, below is the command:

sudo mkdir /mnt/resource/blobfusetmp -p

sudo chown <youruser> /mnt/resource/blobfusetmp


Or to use ramdisk for the temporary path, below is the command:

sudo mkdir /mnt/ramdisk

sudo mount -t tmpfs -o size=16g tmpfs /mnt/ramdisk

sudo mkdir /mnt/ramdisk/blobfusetmp

sudo chown <youruser> /mnt/ramdisk/blobfusetmp



Step 4:

Blobfuse requires the authentication method and credentials to be configured either in a configuration file or as environment variables.

To create a configuration file and restrict access to it so that no other users can read it, use the below commands:

touch ~/fuse_connection.cfg

chmod 600 ~/fuse_connection.cfg


To mount the storage onto the VM, you can use a system-assigned managed identity, a user-assigned managed identity, or a Service Principal.

1. Using System-Assigned Managed Identity

i. To use this configuration, enable the ‘System-assigned’ managed identity on the Linux VM that you’re using as shown below:

ii. Ensure that the Object ID of the system-assigned managed identity is given a sufficient RBAC role at the storage account level.

Note: Please make sure that you give a minimum of the ‘Reader’ and ‘Storage Blob Data Reader’ roles to the managed identity at the storage account level.

You can assign these roles here: Storage account -> Access Control (IAM) -> Add role assignment, then select ‘Virtual Machine’ in the ‘Assign access to’ option as shown below:



2. Using User-Assigned Managed Identity

i. If you’re using a user-assigned managed identity, add the identity in the ‘User assigned’ configuration of your Linux VM as shown below:


ii. Ensure that the managed identity is given the necessary RBAC roles at the storage account level as shown below:


For both managed-identity scenarios, update the configuration file that was created earlier with the storage account details and set authType to ‘MSI’ as shown below:

accountName <storage account name>

authType MSI

containerName <container name>



3. Using Service Principal

i. Ensure that the SPN is given sufficient RBAC roles at the storage account level.


ii. Update the configuration file with the storage account details and the Service Principal details. The authType for Service Principal authentication is SPN, as shown below:

accountName <storage account name>

authType SPN

servicePrincipalClientId <Client ID or Application ID of the Service Principal>

servicePrincipalTenantId <Tenant ID of the Service Principal>

containerName <container name>


iii. The client secret of your application (the Service Principal) must be saved as an environment variable and should not be mentioned in the configuration file. It is read from AZURE_STORAGE_SPN_CLIENT_SECRET. Please save it in /etc/environment in the below format:
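Assuming the standard key=value syntax of /etc/environment (the variable name is the one given above; the value shown is a placeholder for your own secret):

```
AZURE_STORAGE_SPN_CLIENT_SECRET=<client secret of your Service Principal>
```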




Step 5:

Create an empty directory for mounting using the below command:

mkdir ~/mycontainer


Step 6:

To mount the blob storage using blobfuse, run the below command, which will mount the container specified in the configuration file onto the empty directory we created:

sudo blobfuse ~/mycontainer --tmp-path=/mnt/resource/blobfusetmp --config-file=/path/to/fuse_connection.cfg -o attr_timeout=240 -o entry_timeout=240 -o negative_timeout=120


Note: To allow access to all users, please use the switch -o allow_other while mounting.


Once the container is mounted, you can access the blobs using the regular file system APIs in your Linux VM.



Hope that helps!