Today, we are announcing the public preview of the ability to use custom Docker containers in Azure Machine Learning online endpoints. In combination with our new 2.0 CLI, this feature enables you to deploy a custom Docker container while getting Azure Machine Learning online endpoints’ built-in monitoring, scaling, and alerting capabilities.
Sample deployment with TensorFlow Serving
To deploy a TensorFlow model with TensorFlow Serving, first create a YAML file (saved here as endpoint.yml):
name: tfserving-endpoint
type: online
auth_mode: aml_token
traffic:
  tfserving: 100
deployments:
  - name: tfserving
    model:
      name: tfserving-mounted
      version: 1
      local_path: ./half_plus_two
    environment_variables:
      MODEL_BASE_PATH: /var/azureml-app/azureml-models/tfserving-mounted/1
      MODEL_NAME: half_plus_two
    environment:
      name: tfserving
      version: 1
      docker:
        image: docker.io/tensorflow/serving:latest
      inference_config:
        liveness_route:
          port: 8501
          path: /v1/models/half_plus_two
        readiness_route:
          port: 8501
          path: /v1/models/half_plus_two
        scoring_route:
          port: 8501
          path: /v1/models/half_plus_two:predict
    instance_type: Standard_F2s_v2
    scale_settings:
      scale_type: manual
      instance_count: 1
      min_instances: 1
      max_instances: 2
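Because the environment points at the public tensorflow/serving image, you can smoke-test the same container and routes locally before deploying. This is a minimal sketch, assuming Docker is installed and ./half_plus_two contains a numbered version subdirectory as TensorFlow Serving expects:

# Run the same image the endpoint will use, mounting the sample model
docker run -d -p 8501:8501 \
  -v "$(pwd)/half_plus_two:/models/half_plus_two" \
  -e MODEL_NAME=half_plus_two \
  docker.io/tensorflow/serving:latest

# Hit the same route the endpoint's liveness and readiness probes will use
curl http://localhost:8501/v1/models/half_plus_two

# Score against the predict route; half_plus_two computes 0.5 * x + 2
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
  http://localhost:8501/v1/models/half_plus_two:predict
# {"predictions": [2.5, 3.0, 4.5]}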
Then create your endpoint:
az ml endpoint create -f endpoint.yml
And that’s it! You now have a scalable TensorFlow Serving endpoint running on Azure ML-managed compute.
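To verify the deployment end to end, you can send the standard half_plus_two payload through the CLI. This sketch assumes the preview az ml endpoint invoke command and its --request-file flag; check az ml endpoint invoke -h for the exact options in your CLI version:

# Save the standard half_plus_two payload as sample_request.json:
#   {"instances": [1.0, 2.0, 5.0]}

az ml endpoint invoke --name tfserving-endpoint --request-file sample_request.json
# Expected response: {"predictions": [2.5, 3.0, 4.5]}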
- Read our documentation
- See the sample with TorchServe
- Learn more about our Azure-built inference images
- Look out for future samples showing ML.NET and R support