
This post was authored by Bruce Nelson, Senior Solutions Architect at Databricks, and Clinton Ford, Staff Partner Marketing Manager at Databricks.


 


Overview


Healthcare organizations are improving the patient experience and delivering better health outcomes with analytic dashboards and machine learning models on top of existing electronic health records (EHR), digital medical images and streaming data from medical devices and wearables. Azure Databricks and Delta Lake make it easier to work with large clinical datasets to identify top patient conditions.


 


Using Delta Lake to build a comorbidity dashboard


The simulated EHR data, generated with the Synthea simulator, represent roughly 10,000 patients in Massachusetts. Our ETL notebook ingests and de-identifies the data, then prepares it for our visualization notebook. There we create visualizations and a simple dashboard that show the top conditions (comorbidities) in the data and analyze the correlation between any two conditions specified by the user.


 


[Figure: Modern Analytics for Healthcare and Life Sciences (HLS)]


 


Extract, transform and load (ETL)


To begin, we use PySpark to read EHR data from comma-separated values (CSV) files, de-identify patients' personally identifiable information (PII) and write the result to Delta Lake for analysis. Delta Lake is a best-practice choice for ingestion, ETL and stream processing: it is an open source format that supports ACID transactions, offers faster processing with Delta Engine and integrates easily with other Azure services for additional use cases.
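
The read, de-identify and write steps described above might look roughly like the following. This is a minimal sketch, not the notebook's actual code. The function names, the salt parameter, the choice of salted SHA-256 hashing, and the column names (`Id`, `FIRST`, `LAST`, `SSN`) are all assumptions made for illustration. The PySpark import is deferred so that the pure-Python helper can be tried without a Spark cluster.

```python
import hashlib


def pseudonymize(value: str, salt: str) -> str:
    """Return a salted SHA-256 hex digest that stands in for a PII value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def deidentify_to_delta(spark, csv_path, delta_path, salt,
                        id_col="Id", pii_cols=("FIRST", "LAST", "SSN")):
    """Read a patient CSV, hash the ID column, drop PII columns, write Delta.

    All column names here are hypothetical; adjust them to the real schema.
    """
    # Imported lazily so pseudonymize() above works without Spark installed.
    from pyspark.sql import functions as F

    patients = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Same idea as pseudonymize(): SHA-256 over salt + value, done per row.
    hashed_id = F.sha2(
        F.concat(F.lit(salt), F.col(id_col).cast("string")), 256
    )
    deidentified = patients.withColumn(id_col, hashed_id).drop(*pii_cols)

    # Overwrite mode is an assumption; an append may suit incremental loads.
    deidentified.write.format("delta").mode("overwrite").save(delta_path)
    return deidentified
```

Hashing the patient ID with a secret salt keeps the ID consistent across tables, so joins still work, while the direct identifiers are dropped outright. Whether this level of de-identification is sufficient depends on the applicable privacy requirements.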


 


[Figure: Data Ingestion, Streaming and ETL for HLS]


 


 


EHR data analysis and comorbidity dashboard


In this notebook we visualize top conditions in the database and create a simple dashboard to analyze the correlation between any two conditions specified by the user. You can share this notebook as a dashboard following these instructions.
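
To make the two analyses concrete, here is a small, self-contained sketch in plain Python. It assumes the de-identified data has been reduced to a list of `(patient_id, condition)` pairs. The function names and the choice of the phi coefficient as the correlation measure are illustrative assumptions; the notebook may use a different statistic.

```python
from collections import Counter
from math import sqrt


def top_conditions(records, n=10):
    """Rank conditions by the number of distinct patients who have them."""
    unique_pairs = set(records)  # count each (patient, condition) pair once
    counts = Counter(condition for _, condition in unique_pairs)
    return counts.most_common(n)


def phi_coefficient(records, cond_a, cond_b):
    """Phi correlation between two binary 'has condition' indicators.

    The population is every patient that appears in `records`, which must be
    a re-iterable collection such as a list.
    """
    patients = {p for p, _ in records}
    has_a = {p for p, c in records if c == cond_a}
    has_b = {p for p, c in records if c == cond_b}

    # 2x2 contingency table over the patient population.
    n11 = len(has_a & has_b)   # both conditions
    n10 = len(has_a - has_b)   # only cond_a
    n01 = len(has_b - has_a)   # only cond_b
    n00 = len(patients) - n11 - n10 - n01  # neither

    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # undefined when a condition is universal or absent
    return (n11 * n00 - n10 * n01) / denom
```

A value near 1 suggests the two conditions tend to occur in the same patients, a value near -1 suggests they rarely co-occur, and a value near 0 suggests little association. Patients with no recorded conditions do not appear in the pairs, so they are not counted in the population here.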


 


[Figure: Comorbid Condition Browser]


Next steps


For additional background on this use case, see this blog post. See live demos or get hands-on at an Azure Databricks event. Go even deeper with this 3-part webinar training series to operationalize machine learning models for your own organization.
