DP-100 - Deploy and Retrain a Model (10-15%)
1. Introduction to Deploying a Model
After training and validating a model, the next step is to deploy it so that applications can consume predictions. Azure ML provides managed online endpoints for real-time inference and batch endpoints for large-scale scoring.
Managed Online Endpoints – Azure-managed infrastructure for real-time inference. Supports blue-green deployments and auto-scaling.
Kubernetes Online Endpoints – Deploy to your own AKS cluster for more control over infrastructure.
Batch Endpoints – Process large amounts of data asynchronously. Ideal for scoring datasets in bulk.
2. Create a Model to Be Deployed
Before deployment, register the trained model in your workspace. A registered model has a name, version, and path to the model artifacts:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model = Model(
    path="./model/",
    name="diabetes-classifier",
    description="Gradient Boosting classifier for diabetes prediction",
    type=AssetTypes.CUSTOM_MODEL
)
registered_model = ml_client.models.create_or_update(model)
print(f"Registered: {registered_model.name}, version: {registered_model.version}")
You can also register a model directly from a training job run:
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model = Model(
    path=f"azureml://jobs/{returned_job.name}/outputs/artifacts/paths/model/",
    name="diabetes-classifier",
    type=AssetTypes.MLFLOW_MODEL
)
ml_client.models.create_or_update(model)
3. Configure Model for Real-Time Deployment
3.1 Creating a Scoring Script
A scoring script defines how the model processes incoming requests. It must contain two functions: init() and run():
import json
import os

import joblib
import numpy as np

def init():
    # Runs once at service startup: load the model from the mounted model directory
    global model
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model.pkl")
    model = joblib.load(model_path)

def run(raw_data):
    # Runs per request: parse the JSON payload, score it, return predictions
    data = json.loads(raw_data)
    data = np.array(data["data"])
    predictions = model.predict(data)
    return predictions.tolist()
init() – Called once when the service starts. Load the model and any required resources here.
run(raw_data) – Called for each inference request. Parse the input, make predictions, and return results.
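You can exercise the init()/run() contract locally before deploying. The sketch below stands in a stub object for the joblib-loaded estimator (StubModel, the temporary directory, and plain pickle are illustrative substitutes, not part of the real service, which loads a scikit-learn model with joblib):

```python
import json
import os
import pickle
import tempfile

# Stand-in for the trained estimator; any object with a .predict()
# method satisfies the same contract as the real model.
class StubModel:
    def predict(self, rows):
        return [1 for _ in rows]

# Simulate the model directory Azure ML mounts for the service
model_dir = tempfile.mkdtemp()
with open(os.path.join(model_dir, "model.pkl"), "wb") as f:
    pickle.dump(StubModel(), f)
os.environ["AZUREML_MODEL_DIR"] = model_dir

def init():
    # Mirrors the scoring script: load the model once at startup
    global model
    with open(os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl"), "rb") as f:
        model = pickle.load(f)

def run(raw_data):
    # Mirrors the scoring script: parse the payload and score it
    payload = json.loads(raw_data)
    return model.predict(payload["data"])

init()
print(run(json.dumps({"data": [[0.1] * 10]})))  # [1]
```

Catching payload-format bugs this way is much faster than waiting for a failed deployment to surface them in the endpoint logs.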
3.2 Removing the Dependent Variable
When preparing data for inference, ensure the target column (dependent variable) is excluded from the input features. The scoring script should only expect the feature columns that the model was trained on.
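A minimal sketch of that preparation step, assuming a pandas DataFrame with a hypothetical target column named "Diabetic" (the column names here are illustrative, not taken from the trained model):

```python
import pandas as pd

# Illustrative data: feature columns plus the target column "Diabetic"
df = pd.DataFrame({
    "Pregnancies": [2, 0],
    "BMI": [23.1, 31.4],
    "Age": [41, 55],
    "Diabetic": [0, 1],  # dependent variable; must not be sent for scoring
})

# Keep only the feature columns the model was trained on
features = df.drop(columns=["Diabetic"])

# Shape the rows into the payload the scoring script expects
payload = {"data": features.values.tolist()}
```

If the target column is left in, the feature count no longer matches what the model saw at training time and predict() will fail or silently misalign the inputs.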
4. Deploy a Model to a Real-Time Endpoint
4.1 Create a Managed Online Endpoint
from azure.ai.ml.entities import ManagedOnlineEndpoint

endpoint = ManagedOnlineEndpoint(
    name="diabetes-endpoint",
    description="Real-time endpoint for diabetes prediction",
    auth_mode="key"
)
ml_client.online_endpoints.begin_create_or_update(endpoint)
4.2 Create a Deployment
from azure.ai.ml.entities import ManagedOnlineDeployment, CodeConfiguration

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="diabetes-endpoint",
    model="diabetes-classifier:1",
    code_configuration=CodeConfiguration(
        code="./scoring/",
        scoring_script="score.py"
    ),
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    instance_type="Standard_DS2_v2",
    instance_count=1
)
ml_client.online_deployments.begin_create_or_update(blue_deployment)

# Route 100% of traffic to the blue deployment
endpoint.traffic = {"blue": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint)
5. Test a Real-Time Deployed Service
After deployment, test the endpoint using the SDK or REST API:
# Test using the SDK
response = ml_client.online_endpoints.invoke(
    endpoint_name="diabetes-endpoint",
    request_file="./sample-request.json"
)
print(response)
Sample request JSON:
{
  "data": [[0.01991321, 0.05068012, 0.10480869, 0.07007254,
            -0.03596778, -0.0266789, -0.02499266, -0.00259226,
            0.00371174, 0.04034337]]
}
6. Consume the Deployed Model in an Endpoint
Applications consume the endpoint via REST API calls. Retrieve the endpoint URL and authentication key:
import requests

endpoint = ml_client.online_endpoints.get("diabetes-endpoint")
scoring_uri = endpoint.scoring_uri
keys = ml_client.online_endpoints.get_keys("diabetes-endpoint")

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {keys.primary_key}"
}
data = {"data": [[0.01991321, 0.05068012, 0.10480869, 0.07007254,
                  -0.03596778, -0.0266789, -0.02499266, -0.00259226,
                  0.00371174, 0.04034337]]}
response = requests.post(scoring_uri, headers=headers, json=data)
print("Prediction:", response.json())
7. Make Modifications and Redeploy a Model
7.1 Blue-Green Deployment
Azure ML managed endpoints support blue-green deployments. Deploy a new version alongside the existing one and gradually shift traffic:
from azure.ai.ml.entities import ManagedOnlineDeployment, CodeConfiguration

# Create a green deployment with the updated model version
green_deployment = ManagedOnlineDeployment(
    name="green",
    endpoint_name="diabetes-endpoint",
    model="diabetes-classifier:2",
    code_configuration=CodeConfiguration(
        code="./scoring/",
        scoring_script="score.py"
    ),
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    instance_type="Standard_DS2_v2",
    instance_count=1
)
ml_client.online_deployments.begin_create_or_update(green_deployment)

# Shift traffic: 90% blue, 10% green
endpoint.traffic = {"blue": 90, "green": 10}
ml_client.online_endpoints.begin_create_or_update(endpoint)

# After validation, shift 100% to green
endpoint.traffic = {"blue": 0, "green": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint)

# Delete the old blue deployment
ml_client.online_deployments.begin_delete(
    name="blue", endpoint_name="diabetes-endpoint"
)
7.2 Redeploying a Model
When you retrain a model with new data, register the new version, create a new deployment, test it, and then switch traffic. This ensures zero downtime and allows rollback if issues arise.
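Because traffic allocations across an endpoint's deployments must total 100, it is worth checking a new split locally before submitting the update. The helper below is a hypothetical guard, not part of the Azure ML SDK:

```python
# Hypothetical guard: validate a blue-green traffic split before passing
# it to the endpoint update, since allocations must sum to 100.
def validate_traffic(traffic: dict) -> dict:
    total = sum(traffic.values())
    if total != 100:
        raise ValueError(f"traffic must sum to 100, got {total}")
    if any(pct < 0 for pct in traffic.values()):
        raise ValueError("traffic percentages must be non-negative")
    return traffic

validate_traffic({"blue": 90, "green": 10})   # valid canary split
validate_traffic({"blue": 0, "green": 100})   # valid full cutover
```

Rolling back is then just another validated split, e.g. `{"blue": 100, "green": 0}`, followed by the same `begin_create_or_update` call shown above.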