This article is contributed. See the original author and article here.
We are excited to announce the public preview support for Cosmos DB endpoint from IoT Hub in November 2022. You can now stream your data directly from IoT Hub to Cosmos DB.
Many IoT solutions require extensive downstream data analysis and pushing data into hyperscale databases. For example, IoT implementations in manufacturing and intelligent transport systems require hyperscale databases with extremely high throughput to process the continuous stream of data. Traditional SQL based relational databases cannot scale optimally and also become expensive once data scale increases. Cosmos DB is best suited for such cases where the data needs to be analyzed while it is being written!
Until today, there was no direct way to get IoT device data to Cosmos DB. You had to either setup a custom application or use other tools like Azure Functions or Azure Stream Analytics to collect the streaming data and then write to Cosmos DB. This was not only an additional overhead, but also increased the overall cost of the solution. You don’t need to do this any longer – with the new release, you can:
- Configure a Cosmos DB account as a custom endpoint within the IoT Hub
- Use Synthetic Partition Keys to auto-generate partition keys based on your business logic
- Create routing rules to send data directly to the defined Cosmos DB endpoint
How to configure a Cosmos DB endpoint?
Setting up a Cosmos DB endpoint is pretty straightforward. In the Azure portal, open the IoT Hub blade that you have setup and navigate to the Hub settings. Select “Message Routing” on the left pane and click on “Custom endpoints” and choose “Cosmos DB” from the dropdown list. Once you select Cosmos DB as your preferred endpoint, the below screen appears – configure the endpoint and click on Create to complete the process.
Once you have setup the Cosmos DB endpoint, you can use it to setup routes, create routing rules and use all the other functionalities available in message routing within IoT Hub.
When should I use Synthetic Partition Keys?
IoT Hub supports writing to Cosmos DB in JSON (if specified in the message content-type) or as Base64 encoded binary. You can also enable Synthetic Partition Keys to create logical partitions of your data automatically based on your business logic and data growth rate.
As Cosmos DB is a hyperscale datastore, all data/documents written to it must contain a field that represents a logical partition. The partition key property name is defined at the Container level and cannot be changed once it has been set. Each logical partition has a maximum size of 20GB. To effectively support high-scale scenarios, you can enable Synthetic Partition Keys for the Cosmos DB endpoint and configure them based on your estimated data volume. For example, in manufacturing scenarios, your logical partition might be expected to approach its max limit of 20 GB within a month. In that case, you can define a Synthetic Partition Key which is a combination of the device id and the month. This key will be automatically added to the partition key field for each new Cosmos DB record, ensuring logical partitions are created each month for each device.
Setting up of Cosmos DB endpoint from the Azure portal is available for public preview in November 2022. You can learn more about the functionality from our documentation. We are working towards adding support for setup via CLI and general availability of Cosmos DB custom endpoints – this is coming soon! You can also reach out to us via various support tools if you have any questions or require technical support.
We are looking forward to your feedback!
Brought to you by Dr. Ware, Microsoft Office 365 Silver Partner, Charleston SC.