DP-100 - Design and Prepare a Machine Learning Solution (20-25%)
1. Determine the Appropriate Compute Specifications
Choosing the right compute resources is one of the first decisions you make when building a machine learning solution on Azure. The compute you select impacts training speed, cost, and the type of workloads you can run.
Compute Instance – A managed cloud workstation for data scientists. Runs Jupyter notebooks, VS Code, and RStudio. Ideal for development and experimentation.
Compute Cluster – A managed multi-node cluster that automatically scales. Used for training jobs and pipeline steps. Supports CPU and GPU VM sizes.
Inference Cluster (AKS) – Azure Kubernetes Service cluster for deploying models as real-time web services at production scale.
Attached Compute – Bring your own compute resources such as Azure VMs, Azure Databricks, or Azure Data Lake Analytics.
1.1 Selecting VM Sizes
Azure provides various VM families optimized for different workloads:
- General Purpose (D-series) – Balanced CPU-to-memory ratio. Good for exploration and small training jobs.
- Memory Optimized (E-series) – High memory-to-CPU ratio. Useful for large datasets that need to fit in memory.
- GPU (NC/ND-series) – NVIDIA GPUs for deep learning, computer vision, and NLP workloads.
- Compute Optimized (F-series) – High CPU-to-memory ratio for compute-bound tasks.
1.2 Model Deployment Requirements
When deploying a model, you must consider the compute target, container image, and resource requirements such as CPU cores and memory. Azure ML supports:
- Azure Container Instances (ACI) – Quick deployment for testing and low-scale workloads.
- Azure Kubernetes Service (AKS) – Production-grade deployment with auto-scaling and load balancing.
- Managed Online Endpoints – Simplified deployment experience with built-in monitoring and blue-green deployments.
1.3 Choosing a Development Approach
Azure ML provides multiple ways to build and train a model:
Azure ML Designer – Drag-and-drop UI for building ML pipelines without code. Ideal for exploration and quick prototyping.
Automated ML – Automatically iterates through algorithms and hyperparameters to find the best model. Supports classification, regression, and forecasting.
Python SDK (Notebooks) – Full programmatic control using Azure ML Python SDK v2. Best for custom code, advanced experimentation, and reproducible pipelines.
Azure ML CLI v2 – YAML-based configuration for creating and managing ML resources from the command line.
2. Create an Azure Machine Learning Workspace
The Azure Machine Learning workspace is the top-level resource that provides a centralized place to work with all the artifacts you create. It stores references to compute resources, data, experiments, models, and deployed endpoints.
2.1 Workspace Components
Azure Storage Account – Default datastore for storing datasets, model files, and experiment logs.
Azure Key Vault – Stores secrets such as connection strings and authentication keys used by compute targets and datastores.
Azure Container Registry – Stores Docker images for training environments and model deployments.
Application Insights – Monitors deployed web services, collecting telemetry for performance analysis.
2.2 How to Access Azure ML Tools
You can interact with your workspace through several interfaces:
- Azure ML Studio – Web-based UI at ml.azure.com for managing all workspace resources.
- Python SDK v2 – The azure-ai-ml library for programmatic access from notebooks or scripts.
- Azure CLI + ML Extension – Command-line access using az ml commands.
- REST API – Direct HTTP calls for advanced automation scenarios.
- VS Code Extension – Azure ML extension for developing and submitting experiments from VS Code.
2.3 Create a Compute Instance
A compute instance is a managed VM pre-configured with ML tools. You can create one from Azure ML Studio or using the Python SDK:
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ComputeInstance
from azure.identity import DefaultAzureCredential
ml_client = MLClient(
DefaultAzureCredential(),
subscription_id="your-subscription-id",
resource_group_name="your-rg",
workspace_name="your-workspace"
)
ci = ComputeInstance(
name="my-compute-instance",
size="STANDARD_DS3_V2"
)
ml_client.compute.begin_create_or_update(ci).result()  # .result() waits for provisioning
2.4 Running Python SDK Import Statements
After creating a compute instance, you can open a Jupyter notebook and verify the SDK installation:
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential
credential = DefaultAzureCredential()
ml_client = MLClient(credential, "your-subscription-id", "your-rg", "your-workspace")
print("SDK connected to workspace:", ml_client.workspace_name)
2.5 Stopping a Compute Instance
To manage costs, always stop compute instances when not in use. You can enable idle shutdown or use a schedule:
ml_client.compute.begin_stop("my-compute-instance").result()  # .result() waits for the stop to complete
3. Create Azure Data Resources
Azure ML uses datastores and data assets to manage connections to your data and track data versions for reproducibility.
3.1 Create and Register a Datastore
A datastore is a reference to an Azure storage service. Azure ML comes with a default datastore (the workspace blob storage), but you can register additional datastores for Azure Blob Storage, Azure Data Lake Storage Gen2, Azure SQL Database, and more.
from azure.ai.ml.entities import AzureBlobDatastore, AccountKeyConfiguration
blob_datastore = AzureBlobDatastore(
name="my_blob_datastore",
account_name="mystorageaccount",
container_name="mycontainer",
credentials=AccountKeyConfiguration(
account_key="your-account-key"
)
)
ml_client.datastores.create_or_update(blob_datastore)
3.2 Transferring Files to a Datastore
You can upload local files or directories to a datastore using the Azure ML SDK or Azure Storage SDK:
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
# Upload a local folder to the default datastore
my_data = Data(
name="training-data",
path="./data/training/",
type=AssetTypes.URI_FOLDER,
description="Training dataset for the ML model"
)
ml_client.data.create_or_update(my_data)
3.3 Create a Data Asset
Data assets are versioned references to data stored in datastores or public URLs. They make it easy to share and track data across experiments.
URI File (uri_file) – Points to a single file (e.g., a CSV or Parquet file).
URI Folder (uri_folder) – Points to a folder containing multiple files.
MLTable (mltable) – Defines a tabular schema on top of one or more files, allowing column selection, type casting, and transformations.
3.4 Register a Data Asset Through SDK
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
my_data = Data(
name="diabetes-dataset",
version="1",
path="azureml://datastores/my_blob_datastore/paths/data/diabetes.csv",
type=AssetTypes.URI_FILE,
description="Diabetes CSV dataset"
)
ml_client.data.create_or_update(my_data)
3.5 Consume Data Assets Through SDK
Once registered, you can consume data assets in experiments and pipelines:
# Retrieve a data asset by name and version
data_asset = ml_client.data.get(name="diabetes-dataset", version="1")
print("Data asset URI:", data_asset.path)
# Use in a training script via Input
from azure.ai.ml import Input
training_input = Input(type=AssetTypes.URI_FILE, path=data_asset.id)