Today, we are announcing the public preview of the ability to use custom Docker containers in Azure Machine Learning online endpoints. In combination with our new 2.0 CLI, this feature enables you to deploy a custom Docker container while getting Azure Machine Learning online endpoints’ built-in monitoring, scaling, and alerting capabilities.

Below, we walk you through how to use this feature to deploy TensorFlow Serving with Azure Machine Learning. The full code is available in our samples repository.

Sample deployment with TensorFlow Serving

To deploy a TensorFlow model with TensorFlow Serving, first describe your endpoint and deployment in a YAML file (endpoint.yml):

name: tfserving-endpoint
type: online
auth_mode: aml_token
traffic:
  tfserving: 100

deployments:
  - name: tfserving
    model:
      name: tfserving-mounted
      version: 1
      local_path: ./half_plus_two
    environment_variables:
      MODEL_BASE_PATH: /var/azureml-app/azureml-models/tfserving-mounted/1
      MODEL_NAME: half_plus_two
    environment:
      name: tfserving
      version: 1
      docker:
        image: docker.io/tensorflow/serving:latest
      inference_config:
        liveness_route:
          port: 8501
          path: /v1/models/half_plus_two
        readiness_route:
          port: 8501
          path: /v1/models/half_plus_two
        scoring_route:
          port: 8501
          path: /v1/models/half_plus_two:predict
    instance_type: Standard_F2s_v2
    scale_settings:
      scale_type: manual
      instance_count: 1
      min_instances: 1
      max_instances: 2
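
The deployment above mounts the local ./half_plus_two directory as a registered model, so that directory needs to contain a TensorFlow SavedModel with a numeric version subdirectory. If you don't already have the half_plus_two test model, one way to fetch it is to copy it out of the tensorflow/serving repository (a sketch; the testdata path below may move between releases):

git clone --depth 1 https://github.com/tensorflow/serving /tmp/tfserving
cp -r /tmp/tfserving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu ./half_plus_two

You can also sanity-check the image and model locally before deploying. The tensorflow/serving image serves models from /models/<MODEL_NAME> by default and exposes its REST API on port 8501, so half_plus_two (which returns x/2 + 2 for each input) should answer with {"predictions": [1.5, 2.0, 3.5]}:

docker run -d --rm --name tfserving-test -p 8501:8501 \
  -v "$PWD/half_plus_two:/models/half_plus_two" \
  -e MODEL_NAME=half_plus_two \
  docker.io/tensorflow/serving:latest
curl -d '{"instances": [1.0, 2.0, 5.0]}' http://localhost:8501/v1/models/half_plus_two:predict
docker stop tfserving-test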

Then create your endpoint:

az ml endpoint create -f endpoint.yml
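
Provisioning can take a few minutes. To check on the endpoint's status (the command below uses the 2.0 CLI preview syntax, which may change as the CLI evolves):

az ml endpoint show -n tfserving-endpoint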

And that’s it! You now have a scalable TensorFlow Serving endpoint running on Azure ML-managed compute.
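
Once the endpoint is up, you can send it traffic through the :predict scoring route defined in the YAML above. As a sketch (sample-request.json is just an illustrative file name, and the invoke syntax is from the 2.0 CLI preview):

echo '{"instances": [1.0, 2.0, 5.0]}' > sample-request.json
az ml endpoint invoke -n tfserving-endpoint --request-file sample-request.json

As with the local test, half_plus_two should return {"predictions": [1.5, 2.0, 3.5]}.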

Next steps

To get started, try the full sample in our samples repository, then adapt the YAML above to your own model and container image.