DP-100 - Prepare a Model for Deployment (20-25%)
1. Model Training Scripts
A training script is a Python file that contains the logic for loading data, training a model, evaluating performance, and saving the trained model. Azure ML runs these scripts on remote compute targets.
1.1 Introduction to Model Training Scripts
A typical training script follows this structure:
import argparse
import pandas as pd
import mlflow
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
# Parse arguments
parser = argparse.ArgumentParser()
parser.add_argument("--data", type=str, help="Path to input data")
parser.add_argument("--learning_rate", type=float, default=0.1)
parser.add_argument("--n_estimators", type=int, default=100)
args = parser.parse_args()
# Load data
df = pd.read_csv(args.data)
X = df.drop("target", axis=1)
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Train
model = GradientBoostingClassifier(
    learning_rate=args.learning_rate,
    n_estimators=args.n_estimators
)
model.fit(X_train, y_train)
# Evaluate and log
accuracy = accuracy_score(y_test, model.predict(X_test))
mlflow.log_metric("accuracy", accuracy)
mlflow.log_param("learning_rate", args.learning_rate)
mlflow.log_param("n_estimators", args.n_estimators)
mlflow.sklearn.log_model(model, "model")
1.2 Running a Training Script End-to-End
Submit the training script as a command job using the Azure ML SDK:
from azure.ai.ml import command, Input
job = command(
    code="./src",
    command="python train.py --data ${{inputs.training_data}} --learning_rate 0.1 --n_estimators 100",
    inputs={"training_data": Input(type="uri_file", path="azureml:diabetes-dataset:1")},
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="my-compute-cluster",
    experiment_name="training-experiment"
)
returned_job = ml_client.jobs.create_or_update(job)
ml_client.jobs.stream(returned_job.name)
2. Script Parameters and Compute Configuration
2.1 Configure Compute and Script Parameters
Script parameters let you configure training runs without modifying the script. Use argparse in your script and pass different values through the SDK:
--data – Path to the input dataset.
--learning_rate – Controls how fast the model learns.
--n_estimators – Number of trees in ensemble methods.
--max_depth – Maximum depth of each tree.
--regularization_rate – Controls complexity to prevent overfitting.
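The flags above can be wired up with argparse; this self-contained sketch simulates a submission that overrides two of them (the values here are illustrative, not from a real job):

```python
import argparse

# Declare the tunable parameters from the list above, with defaults
# that apply when a job does not override them.
parser = argparse.ArgumentParser()
parser.add_argument("--data", type=str, default="data.csv")
parser.add_argument("--learning_rate", type=float, default=0.1)
parser.add_argument("--n_estimators", type=int, default=100)
parser.add_argument("--max_depth", type=int, default=3)
parser.add_argument("--regularization_rate", type=float, default=0.01)

# Simulate the command line a job would build, e.g.
# python train.py --learning_rate 0.05 --max_depth 5
args = parser.parse_args(["--learning_rate", "0.05", "--max_depth", "5"])
print(args.learning_rate, args.max_depth, args.n_estimators)  # 0.05 5 100
```

In a real run, parse_args() reads sys.argv, so the same script works unchanged whether values come from defaults, a command job, or a sweep trial.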
2.2 Cycling Through Script Parameters
You can use Azure ML sweep jobs to search across multiple parameter combinations:
from azure.ai.ml.sweep import Choice, Uniform
# The base command job must declare learning_rate and n_estimators as
# inputs so the sweep can substitute search-space values for them.
command_job_for_sweep = job(
    learning_rate=Uniform(min_value=0.01, max_value=0.3),
    n_estimators=Choice(values=[50, 100, 200, 300])
)
sweep_job = command_job_for_sweep.sweep(
    sampling_algorithm="random",
    primary_metric="accuracy",
    goal="maximize"
)
sweep_job.set_limits(max_total_trials=20, max_concurrent_trials=4, timeout=7200)
returned_sweep = ml_client.jobs.create_or_update(sweep_job)
2.3 Testing Different Script Parameters
After a sweep completes, review the results in Azure ML Studio. The Trials tab shows metrics for each parameter combination. Select the best trial and register its model.
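Conceptually, selecting the best trial is an argmax over the logged primary metric. A local sketch with made-up trial results (real values come from the sweep's child runs):

```python
# Hypothetical (params, accuracy) pairs standing in for sweep trials.
trials = [
    ({"learning_rate": 0.10, "n_estimators": 100}, 0.86),
    ({"learning_rate": 0.05, "n_estimators": 200}, 0.89),
    ({"learning_rate": 0.30, "n_estimators": 50}, 0.81),
]

# goal="maximize" on the primary metric corresponds to an argmax.
best_params, best_accuracy = max(trials, key=lambda t: t[1])
print(best_params, best_accuracy)
```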
2.4 Configure Compute for a Job Run
Specify compute targets in your job configuration. You can use a compute cluster for scalable training:
from azure.ai.ml.entities import AmlCompute
cluster = AmlCompute(
    name="training-cluster",
    size="STANDARD_DS3_V2",
    min_instances=0,
    max_instances=4,
    idle_time_before_scale_down=120
)
ml_client.compute.begin_create_or_update(cluster)
2.5 Adding Compute to an Environment
An environment defines the software dependencies for your training script. Azure ML provides curated environments, or you can create custom ones:
from azure.ai.ml.entities import Environment
custom_env = Environment(
    name="custom-sklearn-env",
    conda_file="./conda.yml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    description="Custom environment with scikit-learn and pandas"
)
ml_client.environments.create_or_update(custom_env)
2.6 Deleting a Compute Through Python SDK
ml_client.compute.begin_delete("training-cluster")
3. Pipelines
Azure ML pipelines allow you to chain multiple steps into a reproducible workflow. Each step can run on different compute targets and use different environments.
3.1 Introduction to Pipelines
A pipeline is a workflow of machine learning tasks that can be independently executed. Pipelines promote reusability, reproducibility, and automation.
Modularity – Break complex workflows into reusable steps.
Reproducibility – Track inputs, outputs, and parameters for every run.
Scalability – Run steps on different compute targets.
Automation – Schedule or trigger pipelines based on events.
3.2 Create a Prepare Data Step in Pipeline
from azure.ai.ml import command, Input, Output
from azure.ai.ml.dsl import pipeline
prepare_data_component = command(
    name="prepare_data",
    code="./src",
    command="python prepare.py --input ${{inputs.raw_data}} --output ${{outputs.prepared_data}}",
    inputs={"raw_data": Input(type="uri_file")},
    outputs={"prepared_data": Output(type="uri_folder")},
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="training-cluster"
)
3.3 Create a Train Model Step in Pipeline
train_model_component = command(
    name="train_model",
    code="./src",
    command="python train.py --data ${{inputs.training_data}} --output ${{outputs.model_output}}",
    inputs={"training_data": Input(type="uri_folder")},
    outputs={"model_output": Output(type="uri_folder")},
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="training-cluster"
)
3.4 Create a Pipeline Run Script
@pipeline(default_compute="training-cluster")
def training_pipeline(raw_data):
    prepare_step = prepare_data_component(raw_data=raw_data)
    train_step = train_model_component(training_data=prepare_step.outputs.prepared_data)
    return {"model": train_step.outputs.model_output}

pipeline_job = training_pipeline(
    raw_data=Input(type="uri_file", path="azureml:diabetes-dataset:1")
)
returned_pipeline = ml_client.jobs.create_or_update(pipeline_job)
ml_client.jobs.stream(returned_pipeline.name)
3.5 Pass Data Between Steps in Pipeline
Data flows between steps through outputs and inputs. The output of one step becomes the input of the next. Azure ML handles serialization and storage automatically.
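A purely local analogy of this handoff, with two plain functions standing in for the prepare and train steps (file and folder names here are invented):

```python
import csv
import tempfile
from pathlib import Path

def prepare(output_dir: Path) -> Path:
    """Stand-in for the prepare step: writes a folder of prepared data."""
    output_dir.mkdir(parents=True, exist_ok=True)
    with open(output_dir / "prepared.csv", "w", newline="") as f:
        csv.writer(f).writerows([["feature", "target"], [1, 0], [2, 1]])
    return output_dir  # analogous to Output(type="uri_folder")

def train(input_dir: Path) -> int:
    """Stand-in for the train step: reads the previous step's output."""
    with open(input_dir / "prepared.csv") as f:
        rows = list(csv.reader(f))
    return len(rows) - 1  # number of training rows (header excluded)

with tempfile.TemporaryDirectory() as tmp:
    prepared = prepare(Path(tmp) / "prepared")  # output of step 1...
    print(train(prepared))                      # ...becomes input of step 2
```

In Azure ML the same wiring is expressed as prepare_step.outputs.prepared_data feeding the next component's input, with storage handled by the workspace datastore.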
3.6 Run the Pipeline
Submit the pipeline using ml_client.jobs.create_or_update(). Monitor progress in Azure ML Studio under the Jobs tab.
3.7 Other Ways to Run the Pipeline
- Azure CLI – Run az ml job create --file pipeline.yml
- REST API – Submit pipeline jobs via HTTP requests
- Scheduled Runs – Configure time-based or event-based triggers
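For the CLI route, a pipeline.yml might look like the following sketch, mirroring the two components from sections 3.2 and 3.3 (asset names, paths, and environment references are illustrative):

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
display_name: training-pipeline
inputs:
  raw_data:
    type: uri_file
    path: azureml:diabetes-dataset:1
jobs:
  prepare_data:
    type: command
    code: ./src
    command: python prepare.py --input ${{inputs.raw_data}} --output ${{outputs.prepared_data}}
    inputs:
      raw_data: ${{parent.inputs.raw_data}}
    outputs:
      prepared_data:
    environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest
    compute: azureml:training-cluster
  train_model:
    type: command
    code: ./src
    command: python train.py --data ${{inputs.training_data}} --output ${{outputs.model_output}}
    inputs:
      training_data: ${{parent.jobs.prepare_data.outputs.prepared_data}}
    outputs:
      model_output:
    environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest
    compute: azureml:training-cluster
```

Submit it with az ml job create --file pipeline.yml from a directory containing the src folder.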
3.8 Publishing the Endpoint
Publishing makes a pipeline callable outside the Studio. In SDK v2, one approach is to resubmit the pipeline job under a dedicated experiment name so its runs are grouped together; for a fully managed REST endpoint, the pipeline component can also be deployed behind a batch endpoint.
# Resubmit the pipeline under a dedicated experiment name
published_pipeline = ml_client.jobs.create_or_update(
    pipeline_job,
    experiment_name="published-pipeline"
)
3.9 Create a Pipeline Endpoint and Call It
Once published, you can trigger a pipeline run via HTTP:
import requests
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
token = credential.get_token("https://ml.azure.com/.default").token
response = requests.post(
    url=pipeline_endpoint_url,
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    },
    json={"ExperimentName": "remote-trigger"}
)
3.10 Monitor Pipeline Runs
Monitor pipeline runs in Azure ML Studio. Each step shows status, duration, logs, and outputs. You can also monitor programmatically:
pipeline_run = ml_client.jobs.get(returned_pipeline.name)
print("Status:", pipeline_run.status)