DP-100 - Deploy and Retrain a Model (10-15%)
1. Introduction to Deploying a Model
After training and validating a model, the next step is to deploy it so that applications can consume predictions. Azure ML provides managed online endpoints for real-time inference and batch endpoints for large-scale scoring.
Managed Online Endpoints – Azure-managed infrastructure for real-time inference. Supports blue-green deployments and auto-scaling.
Kubernetes Online Endpoints – Deploy to your own AKS cluster for more control over infrastructure.
Batch Endpoints – Process large amounts of data asynchronously. Ideal for scoring datasets in bulk.
2. Create a Model to Be Deployed
Before deployment, register the trained model in your workspace. A registered model has a name, version, and path to the model artifacts:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model = Model(
    path="./model/",
    name="diabetes-classifier",
    description="Gradient Boosting classifier for diabetes prediction",
    type=AssetTypes.CUSTOM_MODEL
)
registered_model = ml_client.models.create_or_update(model)
print(f"Registered: {registered_model.name}, version: {registered_model.version}")
You can also register a model directly from a training job run:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model = Model(
    path=f"azureml://jobs/{returned_job.name}/outputs/artifacts/paths/model/",
    name="diabetes-classifier",
    type=AssetTypes.MLFLOW_MODEL
)
ml_client.models.create_or_update(model)
3. Configure Model for Real-Time Deployment
3.1 Creating a Scoring Script
A scoring script defines how the model processes incoming requests. It must contain two functions: init() and run():
import json
import os

import joblib
import numpy as np

def init():
    # Runs once at service startup: load the model from the mounted model directory
    global model
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model.pkl")
    model = joblib.load(model_path)

def run(raw_data):
    # Runs per request: parse the JSON payload, score it, return predictions
    data = json.loads(raw_data)
    data = np.array(data["data"])
    predictions = model.predict(data)
    return predictions.tolist()
init() – Called once when the service starts. Load the model and any required resources here.
run(raw_data) – Called for each inference request. Parse the input, make predictions, and return results.
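You can exercise the init()/run() contract locally before deploying. The sketch below stands in a stub object for the joblib-loaded estimator (StubModel, the temporary directory, and plain pickle are illustrative substitutes, not part of the real service, which loads a scikit-learn model with joblib):

```python
import json
import os
import pickle
import tempfile

# Stand-in for the trained estimator; any object with a .predict()
# method satisfies the same contract as the real model.
class StubModel:
    def predict(self, rows):
        return [1 for _ in rows]

# Simulate the model directory Azure ML mounts for the service
model_dir = tempfile.mkdtemp()
with open(os.path.join(model_dir, "model.pkl"), "wb") as f:
    pickle.dump(StubModel(), f)
os.environ["AZUREML_MODEL_DIR"] = model_dir

def init():
    # Mirrors the scoring script: load the model once at startup
    global model
    with open(os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl"), "rb") as f:
        model = pickle.load(f)

def run(raw_data):
    # Mirrors the scoring script: parse the payload and score it
    payload = json.loads(raw_data)
    return model.predict(payload["data"])

init()
print(run(json.dumps({"data": [[0.1] * 10]})))  # [1]
```

Catching payload-format bugs this way is much faster than waiting for a failed deployment to surface them in the endpoint logs.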
3.2 Removing the Dependent Variable
When preparing data for inference, ensure the target column (dependent variable) is excluded from the input features. The scoring script should only expect the feature columns that the model was trained on.
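A minimal sketch of that preparation step, assuming a pandas DataFrame with a hypothetical target column named "Diabetic" (the column names here are illustrative, not taken from the trained model):

```python
import pandas as pd

# Illustrative data: feature columns plus the target column "Diabetic"
df = pd.DataFrame({
    "Pregnancies": [2, 0],
    "BMI": [23.1, 31.4],
    "Age": [41, 55],
    "Diabetic": [0, 1],  # dependent variable; must not be sent for scoring
})

# Keep only the feature columns the model was trained on
features = df.drop(columns=["Diabetic"])

# Shape the rows into the payload the scoring script expects
payload = {"data": features.values.tolist()}
```

If the target column is left in, the feature count no longer matches what the model saw at training time and predict() will fail or silently misalign the inputs.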
4. Deploy a Model to a Real-Time Endpoint
4.1 Create a Managed Online Endpoint
from azure.ai.ml.entities import ManagedOnlineEndpoint

endpoint = ManagedOnlineEndpoint(
    name="diabetes-endpoint",
    description="Real-time endpoint for diabetes prediction",
    auth_mode="key"
)
ml_client.online_endpoints.begin_create_or_update(endpoint)
4.2 Create a Deployment
from azure.ai.ml.entities import ManagedOnlineDeployment, CodeConfiguration

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="diabetes-endpoint",
    model="diabetes-classifier:1",
    code_configuration=CodeConfiguration(
        code="./scoring/",
        scoring_script="score.py"
    ),
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    instance_type="Standard_DS2_v2",
    instance_count=1
)
ml_client.online_deployments.begin_create_or_update(blue_deployment)

# Route 100% of traffic to the blue deployment
endpoint.traffic = {"blue": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint)
5. Test a Real-Time Deployed Service
After deployment, test the endpoint using the SDK or REST API:
# Test using the SDK
response = ml_client.online_endpoints.invoke(
    endpoint_name="diabetes-endpoint",
    request_file="./sample-request.json"
)
print(response)
Sample request JSON:
{
  "data": [[0.01991321, 0.05068012, 0.10480869, 0.07007254,
            -0.03596778, -0.0266789, -0.02499266, -0.00259226,
            0.00371174, 0.04034337]]
}
6. Consume the Deployed Model in an Endpoint
Applications consume the endpoint via REST API calls. Retrieve the endpoint URL and authentication key:
import requests

endpoint = ml_client.online_endpoints.get("diabetes-endpoint")
scoring_uri = endpoint.scoring_uri
keys = ml_client.online_endpoints.get_keys("diabetes-endpoint")

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {keys.primary_key}"
}
data = {"data": [[0.01991321, 0.05068012, 0.10480869, 0.07007254,
                  -0.03596778, -0.0266789, -0.02499266, -0.00259226,
                  0.00371174, 0.04034337]]}
response = requests.post(scoring_uri, headers=headers, json=data)
print("Prediction:", response.json())
7. Make Modifications and Redeploy a Model
7.1 Blue-Green Deployment
Azure ML managed endpoints support blue-green deployments. Deploy a new version alongside the existing one and gradually shift traffic:
from azure.ai.ml.entities import ManagedOnlineDeployment, CodeConfiguration

# Create a green deployment with the updated model version
green_deployment = ManagedOnlineDeployment(
    name="green",
    endpoint_name="diabetes-endpoint",
    model="diabetes-classifier:2",
    code_configuration=CodeConfiguration(
        code="./scoring/",
        scoring_script="score.py"
    ),
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    instance_type="Standard_DS2_v2",
    instance_count=1
)
ml_client.online_deployments.begin_create_or_update(green_deployment)

# Shift traffic: 90% blue, 10% green
endpoint.traffic = {"blue": 90, "green": 10}
ml_client.online_endpoints.begin_create_or_update(endpoint)

# After validation, shift 100% to green
endpoint.traffic = {"blue": 0, "green": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint)

# Delete the old blue deployment
ml_client.online_deployments.begin_delete(
    name="blue", endpoint_name="diabetes-endpoint"
)
7.2 Redeploying a Model
When you retrain a model with new data, register the new version, create a new deployment, test it, and then switch traffic. This ensures zero downtime and allows rollback if issues arise.
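Because traffic allocations across an endpoint's deployments must total 100, it is worth checking a new split locally before submitting the update. The helper below is a hypothetical guard, not part of the Azure ML SDK:

```python
# Hypothetical guard: validate a blue-green traffic split before passing
# it to the endpoint update, since allocations must sum to 100.
def validate_traffic(traffic: dict) -> dict:
    total = sum(traffic.values())
    if total != 100:
        raise ValueError(f"traffic must sum to 100, got {total}")
    if any(pct < 0 for pct in traffic.values()):
        raise ValueError("traffic percentages must be non-negative")
    return traffic

validate_traffic({"blue": 90, "green": 10})   # valid canary split
validate_traffic({"blue": 0, "green": 100})   # valid full cutover
```

Rolling back is then just another validated split, e.g. `{"blue": 100, "green": 0}`, followed by the same `begin_create_or_update` call shown above.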