

Top GCP Vertex AI & Machine Learning Interview Questions (2026) | JavaInUse

Top 20 GCP Vertex AI & Machine Learning Interview Questions


  1. What is Vertex AI?
  2. What is AutoML?
  3. What is Vertex AI Workbench?
  4. How do you train custom models?
  5. What are Vertex AI Pipelines?
  6. How do you deploy models?
  7. What is Feature Store?
  8. How do you use pre-built APIs?
  9. What is Model Registry?
  10. How do you implement batch predictions?
  11. What are Experiments and Metadata?
  12. How do you optimize training?
  13. What is Vertex AI Vector Search?
  14. How do you monitor models?
  15. What are custom containers?
  16. How do you implement MLOps?
  17. What is Generative AI on Vertex AI?
  18. How do you handle model versioning?
  19. What are training best practices?
  20. How do you design ML architecture?


1. What is Vertex AI?

Vertex AI is Google Cloud's unified ML platform for building, deploying, and managing ML models at scale.

Vertex AI Components:
+-------------------------------------------------------------+
|                      Vertex AI                               |
+-------------------------------------------------------------+
|  +-----------------------------------------------------+   |
|  |              Build & Train                           |   |
|  |  +-- Workbench (Jupyter notebooks)                  |   |
|  |  +-- AutoML (no-code ML)                            |   |
|  |  +-- Custom Training                                |   |
|  |  +-- Pipelines (ML workflows)                       |   |
|  |  +-- Experiments (tracking)                         |   |
|  +-----------------------------------------------------+   |
|  +-----------------------------------------------------+   |
|  |              Manage & Deploy                         |   |
|  |  +-- Model Registry                                 |   |
|  |  +-- Feature Store                                  |   |
|  |  +-- Endpoints (online prediction)                  |   |
|  |  +-- Batch Prediction                               |   |
|  |  +-- Model Monitoring                               |   |
|  +-----------------------------------------------------+   |
|  +-----------------------------------------------------+   |
|  |              Foundation Models                       |   |
|  |  +-- Gemini (multimodal)                            |   |
|  |  +-- PaLM (text)                                    |   |
|  |  +-- Imagen (images)                                |   |
|  |  +-- Codey (code)                                   |   |
|  +-----------------------------------------------------+   |
+-------------------------------------------------------------+

# Enable Vertex AI
gcloud services enable aiplatform.googleapis.com

# Python SDK setup
from google.cloud import aiplatform

aiplatform.init(
    project='my-project',
    location='us-central1',
    staging_bucket='gs://my-bucket'
)

2. What is AutoML?

AutoML lets you train high-quality models with minimal ML expertise and little or no code.

AutoML Types:
+-- AutoML Tabular - Structured data
+-- AutoML Image - Classification, detection
+-- AutoML Text - NLP tasks
+-- AutoML Video - Video analysis
+-- AutoML Forecasting - Time series

AutoML Tabular Example:

from google.cloud import aiplatform

# Create dataset
dataset = aiplatform.TabularDataset.create(
    display_name='customer_churn',
    gcs_source='gs://bucket/data.csv'
)

# Train AutoML model
job = aiplatform.AutoMLTabularTrainingJob(
    display_name='churn_prediction',
    optimization_prediction_type='classification',
    optimization_objective='maximize-au-roc'
)

model = job.run(
    dataset=dataset,
    target_column='churn',
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    budget_milli_node_hours=1000,  # 1 hour
    model_display_name='churn_model'
)
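
The training budget is expressed in milli node hours, so 1000 corresponds to one node hour, as the comment in the example notes. The following is a minimal sketch of that conversion plus a sanity check that the three split fractions cover the whole dataset. The helper names node_hours_to_milli and check_splits are hypothetical and are not part of the SDK.

```python
import math


def node_hours_to_milli(node_hours: float) -> int:
    """Convert node hours to the milli-node-hour unit used by budget_milli_node_hours."""
    return int(round(node_hours * 1000))


def check_splits(train: float, validation: float, test: float) -> bool:
    """Return True when the three split fractions sum to 1 (within float tolerance)."""
    return math.isclose(train + validation + test, 1.0, abs_tol=1e-9)


print(node_hours_to_milli(1))
print(node_hours_to_milli(8))
print(check_splits(0.8, 0.1, 0.1))
```

Checking the splits before submitting a job can catch a typo locally instead of after the job has been queued.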

AutoML Image Classification:

# Create image dataset
dataset = aiplatform.ImageDataset.create(
    display_name='product_images',
    gcs_source='gs://bucket/images/import.csv',
    import_schema_uri=aiplatform.schema.dataset.ioformat.image.single_label_classification
)

# Train model
job = aiplatform.AutoMLImageTrainingJob(
    display_name='product_classifier',
    prediction_type='classification',
    model_type='CLOUD'
)

model = job.run(
    dataset=dataset,
    model_display_name='product_model',
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    budget_milli_node_hours=8000
)

3. What is Vertex AI Workbench?

Workbench Types:
+-- Managed Notebooks - Fully managed Jupyter
+-- User-managed Notebooks - More control
+-- Workbench Instances - Latest offering

Features:
+-- Pre-installed ML libraries
+-- GPU/TPU support
+-- Git integration
+-- BigQuery connector
+-- Scheduled executions
+-- Collaboration features

# Create a user-managed notebook instance
gcloud notebooks instances create my-notebook \
    --location=us-central1-a \
    --machine-type=n1-standard-4 \
    --accelerator-type=NVIDIA_TESLA_T4 \
    --accelerator-core-count=1 \
    --install-gpu-driver

# Create with specific image
gcloud notebooks instances create ml-notebook \
    --location=us-central1-a \
    --machine-type=n1-standard-8 \
    --vm-image-project=deeplearning-platform-release \
    --vm-image-family=tf-latest-gpu

# Schedule notebook execution
gcloud notebooks executions create \
    --display-name="Daily Training" \
    --execution-template=execution-template.yaml \
    --input-notebook-file=gs://bucket/notebooks/train.ipynb \
    --output-notebook-folder=gs://bucket/outputs/ \
    --params='{"learning_rate": 0.01}' \
    --service-account=ml-sa@project.iam.gserviceaccount.com

Notebook Best Practices:
+-- Use parameterized notebooks
+-- Version control notebooks
+-- Separate experimentation from production
+-- Use idle shutdown
+-- Tag resources for cost tracking

# Terraform
resource "google_notebooks_instance" "notebook" {
  name         = "ml-notebook"
  location     = "us-central1-a"
  machine_type = "n1-standard-4"

  vm_image {
    project      = "deeplearning-platform-release"
    image_family = "tf-latest-gpu"
  }

  install_gpu_driver = true
  
  accelerator_config {
    type       = "NVIDIA_TESLA_T4"
    core_count = 1
  }
}

4. How do you train custom models?

Custom Training Options:
+-- Pre-built containers
+-- Custom containers
+-- Local training
+-- Distributed training

# Pre-built container training
from google.cloud import aiplatform

job = aiplatform.CustomTrainingJob(
    display_name='custom_training',
    script_path='train.py',
    container_uri='us-docker.pkg.dev/vertex-ai/training/tf-gpu.2-12:latest',
    requirements=['pandas', 'scikit-learn'],
    model_serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/tf2-gpu.2-12:latest'
)

model = job.run(
    replica_count=1,
    machine_type='n1-standard-8',
    accelerator_type='NVIDIA_TESLA_V100',
    accelerator_count=1,
    args=['--epochs=10', '--batch_size=32'],
    environment_variables={'MY_VAR': 'value'},
    base_output_dir='gs://bucket/output'
)

# train.py
import argparse
import os  # needed for os.environ.get('AIP_MODEL_DIR') below
import tensorflow as tf

def train(args):
    # Load data. Records are serialized examples, so map a schema-specific
    # parse function over the dataset before training. Batching is done on
    # the dataset because fit() rejects batch_size for tf.data inputs.
    train_data = tf.data.TFRecordDataset('gs://bucket/train.tfrecord')
    train_data = train_data.batch(args.batch_size)
    
    # Build model
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    
    model.compile(
        optimizer='adam',
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    
    # Train
    model.fit(train_data, epochs=args.epochs)
    
    # Save model
    model.save(f'{args.model_dir}/model')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--batch_size', type=int, default=32)
    parser.add_argument('--model_dir', default=os.environ.get('AIP_MODEL_DIR'))
    args = parser.parse_args()
    train(args)
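
The script above takes its output location from the AIP_MODEL_DIR environment variable, which is unset when the script runs outside Vertex AI. The sketch below shows one way to keep the same arguments working locally. The './model_output' fallback is an assumption chosen for illustration, not a Vertex AI default.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse training arguments, falling back to a local model directory."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--batch_size', type=int, default=32)
    # Assumed local fallback; on Vertex AI the environment variable wins.
    parser.add_argument(
        '--model_dir',
        default=os.environ.get('AIP_MODEL_DIR', './model_output'),
    )
    return parser.parse_args(argv)


args = parse_args(['--epochs=3'])
print(args.epochs, args.batch_size, args.model_dir)
```

Passing an explicit argument list to parse_args also makes the parser easy to unit test.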

# Distributed training
job = aiplatform.CustomTrainingJob(...)

model = job.run(
    replica_count=4,
    machine_type='n1-standard-16',
    accelerator_type='NVIDIA_TESLA_V100',
    accelerator_count=2,
    reduction_server_replica_count=1,
    reduction_server_machine_type='n1-highcpu-16'
)

5. What are Vertex AI Pipelines?

Pipelines:
+-- Orchestrate ML workflows
+-- Based on Kubeflow Pipelines
+-- Reusable components
+-- Automatic artifact tracking
+-- Integration with Vertex services

Pipeline Example:

# KFP SDK 2.x exposes these at the top level; kfp.v2 is the legacy namespace
from kfp import dsl
from kfp.dsl import component, Output, Input, Dataset, Model, Metrics
from google.cloud import aiplatform

@component(
    packages_to_install=['pandas', 'scikit-learn'],
    base_image='python:3.9'
)
def preprocess_data(
    input_path: str,
    output_dataset: Output[Dataset]
):
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    
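    df = pd.read_csv(input_path)
    # Scale feature columns only: keep the column names and leave 'target'
    # untouched so the training step can still select it by name
    feature_cols = [c for c in df.columns if c != 'target']
    df[feature_cols] = StandardScaler().fit_transform(df[feature_cols])
    df.to_csv(output_dataset.path, index=False)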

@component(
    packages_to_install=['scikit-learn', 'pandas'],
    base_image='python:3.9'
)
def train_model(
    dataset: Input[Dataset],
    model: Output[Model],
    metrics: Output[Metrics]
):
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    import pickle
    
    df = pd.read_csv(dataset.path)
    X = df.drop('target', axis=1)
    y = df['target']
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    
    clf = RandomForestClassifier()
    clf.fit(X_train, y_train)
    
    accuracy = clf.score(X_test, y_test)
    metrics.log_metric('accuracy', accuracy)
    
    with open(model.path, 'wb') as f:
        pickle.dump(clf, f)

@dsl.pipeline(
    name='ml-pipeline',
    description='End-to-end ML pipeline'
)
def ml_pipeline(input_path: str):
    preprocess_task = preprocess_data(input_path=input_path)
    train_task = train_model(dataset=preprocess_task.outputs['output_dataset'])

# Compile and run
from kfp import compiler  # kfp.v2 is the legacy namespace
compiler.Compiler().compile(ml_pipeline, 'pipeline.json')

aiplatform.init(project='my-project', location='us-central1')
job = aiplatform.PipelineJob(
    display_name='my-pipeline',
    template_path='pipeline.json',
    parameter_values={'input_path': 'gs://bucket/data.csv'}
)
job.run()
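
In the pipeline above, train_model consumes the dataset produced by preprocess_data, and that data dependency is what determines the execution order. The toy sketch below illustrates the idea with the standard library's topological sorter. It is only an illustration of dependency ordering under that assumption, not how Kubeflow Pipelines or Vertex AI Pipelines is implemented.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose outputs it consumes.
dependencies = {
    'preprocess_data': set(),
    'train_model': {'preprocess_data'},
}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```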





6. How do you deploy models?

Deployment Options:
+-- Online Prediction (Endpoints)
+-- Batch Prediction
+-- Private Endpoints
+-- Multi-model Endpoints

Online Endpoint Deployment:

from google.cloud import aiplatform

# Deploy to endpoint
endpoint = model.deploy(
    machine_type='n1-standard-4',
    min_replica_count=1,
    max_replica_count=5,
    accelerator_type='NVIDIA_TESLA_T4',
    accelerator_count=1,
    traffic_split={'0': 100},
    service_account='prediction-sa@project.iam.gserviceaccount.com'
)

# Or deploy to existing endpoint
endpoint = aiplatform.Endpoint('projects/my-project/locations/us-central1/endpoints/123')
endpoint.deploy(
    model=model,
    deployed_model_display_name='v2',
    traffic_percentage=10,  # Canary deployment
    machine_type='n1-standard-4'
)

# Make prediction
instances = [
    {'feature1': 1.0, 'feature2': 2.0},
    {'feature1': 3.0, 'feature2': 4.0}
]
predictions = endpoint.predict(instances=instances)

# gcloud deployment
gcloud ai endpoints create \
    --display-name=my-endpoint \
    --region=us-central1

gcloud ai endpoints deploy-model ENDPOINT_ID \
    --model=MODEL_ID \
    --display-name=v1 \
    --machine-type=n1-standard-4 \
    --min-replica-count=1 \
    --max-replica-count=5 \
    --region=us-central1

# Private Endpoint
endpoint = aiplatform.Endpoint.create(
    display_name='private-endpoint',
    network='projects/123/global/networks/my-vpc'
)

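# Multi-model endpoint (traffic split)
# The first model on an empty endpoint must take 100% of traffic; deploying
# v2 at 10% then scales v1 down to 90%.
endpoint.deploy(model=model_v1, traffic_percentage=100)
endpoint.deploy(model=model_v2, traffic_percentage=10)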
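
The arithmetic behind a canary rollout can be sketched in plain Python: the new model receives its requested share and the existing shares are scaled down so the total stays at 100. The function add_canary and its rounding rule are assumptions for illustration and may differ from what the service does internally.

```python
def add_canary(current_split: dict, new_id: str, new_percentage: int) -> dict:
    """Give new_id a traffic share and scale existing shares down proportionally."""
    if not 0 <= new_percentage <= 100:
        raise ValueError('new_percentage must be between 0 and 100')
    if not current_split and new_percentage != 100:
        raise ValueError('the first deployed model must receive 100% of traffic')
    remaining = 100 - new_percentage
    scaled = {k: v * remaining // 100 for k, v in current_split.items()}
    if scaled:
        # Give any rounding leftover to the largest share so the total is 100.
        largest = max(scaled, key=scaled.get)
        scaled[largest] += remaining - sum(scaled.values())
    scaled[new_id] = new_percentage
    return scaled


print(add_canary({'v1': 100}, 'v2', 10))
```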

7. What is Feature Store?

Feature Store is a centralized repository to organize, store, and serve ML features.

Feature Store Architecture:
+-------------------------------------------------------------+
|                    Feature Store                             |
+-------------------------------------------------------------+
|  +-----------------------------------------------------+   |
|  |                 Feature Groups                       |   |
|  |  +-- user_features                                  |   |
|  |  |   +-- user_id (entity)                          |   |
|  |  |   +-- age                                       |   |
|  |  |   +-- total_purchases                           |   |
|  |  |   +-- avg_session_duration                      |   |
|  |  |                                                  |   |
|  |  +-- product_features                               |   |
|  |      +-- product_id (entity)                       |   |
|  |      +-- category                                  |   |
|  |      +-- price                                     |   |
|  |      +-- avg_rating                                |   |
|  +-----------------------------------------------------+   |
|                          |                                  |
|  +-----------------------------------------------------+   |
|  |              Online Store (low latency)              |   |
|  |              Offline Store (batch)                   |   |
|  +-----------------------------------------------------+   |
+-------------------------------------------------------------+

from google.cloud import aiplatform

# Create Feature Store
fs = aiplatform.Featurestore.create(
    featurestore_id='my_featurestore',
    online_store_fixed_node_count=1
)

# Create Entity Type
user_entity = fs.create_entity_type(
    entity_type_id='users',
    description='User features'
)

# Create Features
user_entity.create_feature(
    feature_id='age',
    value_type='INT64'
)
user_entity.create_feature(
    feature_id='total_purchases',
    value_type='DOUBLE'
)

# Ingest features from BigQuery
user_entity.ingest_from_bq(
    feature_ids=['age', 'total_purchases'],
    feature_time='timestamp',
    bq_source_uri='bq://project.dataset.user_features',
    entity_id_field='user_id'
)

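# Online serving (low-latency read for specific entities)
online_df = user_entity.read(
    entity_ids=['user_123'],
    feature_ids=['age', 'total_purchases']
)

# Offline (batch) serving to BigQuery, typically for building training sets
fs.batch_serve_to_bq(
    bq_destination_output_uri='bq://project.dataset.output',
    serving_feature_ids={
        'users': ['age', 'total_purchases'],
        'products': ['category', 'price']
    },
    read_instances_uri='bq://project.dataset.instances'
)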
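
When features are served in batch for training, each row generally needs the feature value that was current at that row's timestamp rather than the latest value, which avoids leaking future information into the training set. The sketch below shows such an "as of" lookup for a single entity. The history layout and the function value_as_of are hypothetical and only illustrate the idea.

```python
from bisect import bisect_right

# Hypothetical feature history for one entity: (timestamp, value), sorted by time.
history = [(100, 3), (200, 5), (300, 9)]


def value_as_of(history, ts):
    """Return the latest value recorded at or before ts, or None if none exists."""
    times = [t for t, _ in history]
    idx = bisect_right(times, ts)
    return history[idx - 1][1] if idx else None


print(value_as_of(history, 250))
```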

8. How do you use pre-built APIs?

Pre-built AI APIs:

# Vision API
from google.cloud import vision

client = vision.ImageAnnotatorClient()

image = vision.Image()
image.source.image_uri = 'gs://bucket/image.jpg'

# Label detection
response = client.label_detection(image=image)
for label in response.label_annotations:
    print(f'{label.description}: {label.score}')

# OCR
response = client.document_text_detection(image=image)
print(response.full_text_annotation.text)

# Natural Language API
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()

document = language_v1.Document(
    content='Google Cloud is amazing for ML!',
    type_=language_v1.Document.Type.PLAIN_TEXT
)

# Sentiment analysis
sentiment = client.analyze_sentiment(document=document).document_sentiment
print(f'Score: {sentiment.score}, Magnitude: {sentiment.magnitude}')

# Entity extraction
entities = client.analyze_entities(document=document).entities
for entity in entities:
    print(f'{entity.name}: {entity.type_}')

# Speech-to-Text
from google.cloud import speech

client = speech.SpeechClient()

audio = speech.RecognitionAudio(uri='gs://bucket/audio.wav')
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    language_code='en-US',
    enable_automatic_punctuation=True
)

response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)

# Translation
from google.cloud import translate_v2

client = translate_v2.Client()
result = client.translate('Hello, world!', target_language='es')
print(result['translatedText'])

9. What is Model Registry?

Model Registry:
+-- Central repository for models
+-- Version management
+-- Model metadata
+-- Lineage tracking
+-- Deployment management

from google.cloud import aiplatform

# Upload model to registry
model = aiplatform.Model.upload(
    display_name='my_model',
    artifact_uri='gs://bucket/model/',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest',
    labels={'team': 'data-science', 'project': 'churn'},
    description='Customer churn prediction model v1'
)

# List models
models = aiplatform.Model.list(filter='display_name="my_model"')

# Get model versions
model = aiplatform.Model('projects/my-project/locations/us-central1/models/123')
print(f'Model: {model.display_name}')
print(f'Version: {model.version_id}')
print(f'Created: {model.create_time}')

# Create new version
new_model = aiplatform.Model.upload(
    display_name='my_model',
    parent_model=model.resource_name,
    artifact_uri='gs://bucket/model_v2/',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest'
)

# Point the 'default' and 'production' aliases at the new version
model.versioning_registry.add_version_aliases(
    ['default', 'production'],
    version=new_model.version_id
)

# Model evaluation
model.list_model_evaluations()

# gcloud commands
gcloud ai models list --region=us-central1

gcloud ai models describe MODEL_ID \
    --region=us-central1

gcloud ai models upload \
    --region=us-central1 \
    --display-name=my-model \
    --artifact-uri=gs://bucket/model \
    --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest

10. How do you implement batch predictions?

Batch Prediction:

from google.cloud import aiplatform

# Create batch prediction job
batch_job = model.batch_predict(
    job_display_name='batch_prediction_job',
    gcs_source='gs://bucket/input/instances.jsonl',
    gcs_destination_prefix='gs://bucket/output/',
    machine_type='n1-standard-4',
    accelerator_type='NVIDIA_TESLA_T4',
    accelerator_count=1,
    starting_replica_count=1,
    max_replica_count=10,
    sync=False
)

# Wait for completion
batch_job.wait()

# BigQuery input/output
batch_job = model.batch_predict(
    job_display_name='bq_batch_prediction',
    bigquery_source='bq://project.dataset.input_table',
    bigquery_destination_prefix='bq://project.dataset.output',
    machine_type='n1-standard-8'
)

# Input format (JSONL)
# {"feature1": 1.0, "feature2": "value"}
# {"feature1": 2.0, "feature2": "other"}
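
A JSONL input file holds one JSON object per line. A minimal sketch of producing such a file from in-memory instances is shown here; the file name instances.jsonl matches the example path above, and uploading it to Cloud Storage is left out.

```python
import json

instances = [
    {"feature1": 1.0, "feature2": "value"},
    {"feature1": 2.0, "feature2": "other"},
]


def to_jsonl(rows):
    """Serialize each instance as one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows) + "\n"


with open("instances.jsonl", "w", encoding="utf-8") as f:
    f.write(to_jsonl(instances))
```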

# gcloud batch prediction
gcloud ai batch-prediction-jobs create \
    --region=us-central1 \
    --display-name=batch_job \
    --model=MODEL_ID \
    --input-format=jsonl \
    --output-format=jsonl \
    --gcs-source-uris=gs://bucket/input/*.jsonl \
    --gcs-destination-output-uri-prefix=gs://bucket/output/ \
    --machine-type=n1-standard-4

# Custom batch prediction with Dataflow
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.vertex_ai_inference import VertexAIModelHandlerJSON

pipeline = beam.Pipeline()

(pipeline
    | 'Read' >> beam.io.ReadFromBigQuery(query='SELECT * FROM table')
    | 'Predict' >> RunInference(
        model_handler=VertexAIModelHandlerJSON(
            endpoint_id='123',
            project='my-project',
            location='us-central1'
        )
    )
    | 'Write' >> beam.io.WriteToBigQuery('project:dataset.output')
)

pipeline.run()

11. What are Experiments and Metadata?

Experiments and ML Metadata:

from google.cloud import aiplatform

# Initialize with experiment
aiplatform.init(
    project='my-project',
    location='us-central1',
    experiment='my_experiment'
)

# Start experiment run
with aiplatform.start_run('run_001') as run:
    # Log parameters
    run.log_params({
        'learning_rate': 0.01,
        'epochs': 10,
        'batch_size': 32
    })
    
    # Train model
    for epoch in range(10):
        loss = train_epoch()
        accuracy = evaluate()
        
        # Log metrics
        run.log_metrics({
            'loss': loss,
            'accuracy': accuracy
        })
    
    # Log artifacts
    run.log_model(model, 'trained_model')

# Compare experiments
experiment = aiplatform.Experiment('my_experiment')
runs = experiment.get_data_frame()
print(runs[['run_name', 'param.learning_rate', 'metric.accuracy']])

# Tensorboard integration
tensorboard = aiplatform.Tensorboard.create(display_name='my_tensorboard')

job = aiplatform.CustomTrainingJob(
    display_name='training_with_tensorboard',
    script_path='train.py',
    container_uri='us-docker.pkg.dev/vertex-ai/training/tf-gpu.2-12:latest'
)

model = job.run(
    args=['--epochs=10'],
    tensorboard=tensorboard.resource_name,
    service_account='ml-sa@project.iam.gserviceaccount.com'
)

# ML Metadata query
# Get all live model artifacts
artifacts = aiplatform.Artifact.list(
    filter='state="LIVE" AND schema_title="system.Model"'
)

# Track lineage
execution = aiplatform.Execution.create(
    schema_title='system.ContainerExecution',
    display_name='training_execution'
)
execution.assign_input_artifacts([input_dataset])
execution.assign_output_artifacts([output_model])

12. How do you optimize training?

Training Optimization:

# 1. Hyperparameter Tuning
from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

job = aiplatform.CustomJob(
    display_name='hpt_training',
    worker_pool_specs=[{
        'machine_spec': {'machine_type': 'n1-standard-4'},
        'replica_count': 1,
        'python_package_spec': {
            'executor_image_uri': 'us-docker.pkg.dev/vertex-ai/training/tf-gpu.2-12:latest',
            'package_uris': ['gs://bucket/trainer-0.1.tar.gz'],
            'python_module': 'trainer.task'
        }
    }]
)

hp_job = aiplatform.HyperparameterTuningJob(
    display_name='hp_tuning',
    custom_job=job,
    metric_spec={'accuracy': 'maximize'},
    parameter_spec={
        'learning_rate': hpt.DoubleParameterSpec(min=0.001, max=0.1, scale='log'),
        'batch_size': hpt.DiscreteParameterSpec(values=[32, 64, 128], scale='linear'),
        'num_layers': hpt.IntegerParameterSpec(min=2, max=10, scale='linear')
    },
    max_trial_count=20,
    parallel_trial_count=5,
    search_algorithm='random'  # SDK values: 'random', 'grid', or None for the default
)

hp_job.run()
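
To make the parameter specs above concrete, the sketch below draws a few random trials over the same ranges, sampling the learning rate on a log scale so that each order of magnitude is equally likely. It only illustrates what random search over such a space looks like and is not the tuning service's implementation. The helper sample_log_uniform is hypothetical.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
trials = [
    {
        'learning_rate': sample_log_uniform(0.001, 0.1, rng),
        'batch_size': rng.choice([32, 64, 128]),
        'num_layers': rng.randint(2, 10),
    }
    for _ in range(5)
]

for trial in trials:
    print(trial)
```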

# 2. Distributed Training (here job is a CustomTrainingJob, as in question 4)
job.run(
    replica_count=4,
    machine_type='n1-standard-16',
    accelerator_type='NVIDIA_TESLA_V100',
    accelerator_count=2,
    reduction_server_replica_count=1,
    reduction_server_machine_type='n1-highcpu-16'
)

# 3. TPU Training
job.run(
    replica_count=1,
    machine_type='cloud-tpu',
    accelerator_type='TPU_V3',
    accelerator_count=8
)

# 4. Data Pipeline Optimization
import tensorflow as tf

dataset = tf.data.TFRecordDataset(files)
dataset = dataset.map(parse_fn, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.cache()                     # reuse parsed records after the first epoch
dataset = dataset.batch(batch_size)
dataset = dataset.prefetch(tf.data.AUTOTUNE)  # prefetch last to overlap input with training

# 5. Mixed Precision Training
policy = tf.keras.mixed_precision.Policy('mixed_float16')
tf.keras.mixed_precision.set_global_policy(policy)

13. What is Vertex AI Vector Search?

Vector Search (Matching Engine):
+-- High-performance vector similarity search
+-- Billion-scale embedding search
+-- Real-time updates
+-- Low latency (< 5ms)
+-- Approximate nearest neighbor

from google.cloud import aiplatform

# Create index
my_index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
    display_name='my_index',
    contents_delta_uri='gs://bucket/embeddings/',
    dimensions=768,
    approximate_neighbors_count=100,
    distance_measure_type='DOT_PRODUCT_DISTANCE',
    leaf_node_embedding_count=1000,
    leaf_nodes_to_search_percent=10
)

# Create index endpoint (public). For a private endpoint, pass network=...
# instead; the two options are mutually exclusive.
my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
    display_name='my_endpoint',
    public_endpoint_enabled=True
)

# Deploy index to endpoint
my_index_endpoint.deploy_index(
    index=my_index,
    deployed_index_id='deployed_index'
)

# Query index
response = my_index_endpoint.find_neighbors(
    deployed_index_id='deployed_index',
    queries=[[0.1, 0.2, 0.3, ...]],  # Query embedding
    num_neighbors=10
)

for neighbor in response[0]:
    print(f'ID: {neighbor.id}, Distance: {neighbor.distance}')

# Update index with new embeddings
my_index.update_embeddings(
    contents_delta_uri='gs://bucket/new_embeddings/',
    is_complete_overwrite=False
)

Use Cases:
+-- Semantic search
+-- Recommendation systems
+-- Image similarity
+-- Anomaly detection
+-- RAG (Retrieval Augmented Generation)
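
For intuition, the exact search that an approximate nearest neighbor index stands in for can be written as a brute-force scan. The sketch below ranks a handful of made-up three-dimensional embeddings by dot product with a query. It is a plain-Python illustration with hypothetical data, not the Vector Search API.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def nearest_neighbors(query, embeddings, k):
    """Rank stored embeddings by dot product with the query, highest first."""
    scored = sorted(
        embeddings.items(),
        key=lambda item: dot(query, item[1]),
        reverse=True,
    )
    return [(item_id, dot(query, vec)) for item_id, vec in scored[:k]]


# Hypothetical embeddings keyed by item ID.
embeddings = {
    'doc_a': [1.0, 0.0, 0.0],
    'doc_b': [0.0, 1.0, 0.0],
    'doc_c': [0.7, 0.7, 0.0],
}

result = nearest_neighbors([1.0, 0.2, 0.0], embeddings, k=2)
print([item_id for item_id, _ in result])
```

A scan like this touches every vector on every query, which is why large collections rely on an approximate index instead.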

14. How do you monitor models?

Model Monitoring:

from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

# Skew: compare serving data against the training data
skew_config = model_monitoring.SkewDetectionConfig(
    data_source='bq://project.dataset.training_data',
    target_field='label',
    skew_thresholds={'feature1': 0.1}
)

# Drift: compare serving data against earlier serving data
drift_config = model_monitoring.DriftDetectionConfig(
    drift_thresholds={'feature1': 0.1}
)

objective_config = model_monitoring.ObjectiveConfig(
    skew_detection_config=skew_config,
    drift_detection_config=drift_config
)

# Create monitoring job
monitoring_job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name='model_monitoring',
    endpoint=endpoint,
    logging_sampling_strategy=model_monitoring.RandomSampleConfig(sample_rate=0.1),
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),  # Every hour
    alert_config=model_monitoring.EmailAlertConfig(user_emails=['team@company.com']),
    objective_configs=objective_config
)

Monitoring Types:
+-- Feature Skew - Training vs serving data drift
+-- Prediction Drift - Distribution changes over time
+-- Feature Attribution - Explainability changes
+-- Performance - Latency, error rates

# Cloud Monitoring metrics
gcloud monitoring dashboards create --config-from-file=dashboard.json

# Custom metrics
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
series = monitoring_v3.TimeSeries()
series.metric.type = 'custom.googleapis.com/ml/prediction_confidence'
...

15. What are custom containers?

Custom Containers:

# Training Container
# Dockerfile
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY trainer/ ./trainer/

ENTRYPOINT ["python", "-m", "trainer.task"]

# Build and push
docker build -t gcr.io/my-project/trainer:v1 .
docker push gcr.io/my-project/trainer:v1

# Use custom container
job = aiplatform.CustomContainerTrainingJob(
    display_name='custom_training',
    container_uri='gcr.io/my-project/trainer:v1',
    command=['python', '-m', 'trainer.task'],
    model_serving_container_image_uri='gcr.io/my-project/serving:v1'
)

# Prediction Container
# Dockerfile
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY app.py .

ENV AIP_HTTP_PORT=8080
ENV AIP_HEALTH_ROUTE=/health
ENV AIP_PREDICT_ROUTE=/predict

CMD ["python", "app.py"]

# app.py
from flask import Flask, request, jsonify
import pickle
import os

app = Flask(__name__)

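# Load model. Vertex AI sets AIP_STORAGE_URI to a gs:// URI, so the artifact
# is downloaded before unpickling (requires google-cloud-storage in
# requirements.txt); /model is the local fallback.
from google.cloud import storage

model_path = os.environ.get('AIP_STORAGE_URI', '/model')
local_file = f'{model_path}/model.pkl'
if model_path.startswith('gs://'):
    bucket_name, _, prefix = model_path[len('gs://'):].partition('/')
    blob_name = f"{prefix.rstrip('/')}/model.pkl".lstrip('/')
    local_file = '/tmp/model.pkl'
    storage.Client().bucket(bucket_name).blob(blob_name).download_to_filename(local_file)
with open(local_file, 'rb') as f:
    model = pickle.load(f)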

@app.route('/health')
def health():
    return 'OK'

@app.route('/predict', methods=['POST'])
def predict():
    instances = request.json['instances']
    predictions = model.predict(instances)
    return jsonify({'predictions': predictions.tolist()})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=int(os.environ.get('AIP_HTTP_PORT', 8080)))

# Deploy custom serving container
model = aiplatform.Model.upload(
    display_name='custom_model',
    artifact_uri='gs://bucket/model/',
    serving_container_image_uri='gcr.io/my-project/serving:v1',
    serving_container_predict_route='/predict',
    serving_container_health_route='/health'
)
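
The serving container above reads a JSON body with an instances list and answers with a predictions list. That contract can be exercised without a web server, as in the sketch below. StubModel and handle_predict are hypothetical stand-ins for illustration only.

```python
import json


class StubModel:
    """Stand-in for a trained model; sums the features of each instance."""

    def predict(self, instances):
        return [sum(row) for row in instances]


def handle_predict(body: str, model) -> str:
    """Parse a prediction request body and return the JSON response body."""
    payload = json.loads(body)
    if 'instances' not in payload:
        raise ValueError("request body must contain an 'instances' list")
    predictions = model.predict(payload['instances'])
    return json.dumps({'predictions': list(predictions)})


response = handle_predict('{"instances": [[1, 2], [3, 4]]}', StubModel())
print(response)
```

Keeping the request handling in a plain function like this makes it straightforward to test before the container is built.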

16. How do you implement MLOps?

MLOps Pipeline Architecture:
+-------------------------------------------------------------+
|                     MLOps Pipeline                           |
+-------------------------------------------------------------+
|  +---------+  +---------+  +---------+  +---------+       |
|  | Source  |->| Build   |->| Test    |->| Deploy  |       |
|  | Control |  | Container| | Model   |  | Staging |       |
|  +---------+  +---------+  +---------+  +----+----+       |
|                                               |             |
|                                          +----v----+       |
|                                          | Deploy  |       |
|                                          |  Prod   |       |
|                                          +----+----+       |
|                                               |             |
|                                          +----v----+       |
|                                          | Monitor |       |
|                                          +---------+       |
+-------------------------------------------------------------+

CI/CD with Cloud Build:

# cloudbuild.yaml
steps:
  # Build training container
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/trainer:$COMMIT_SHA', '.']
  
  # Push container
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/trainer:$COMMIT_SHA']
  
  # Run unit tests
  - name: 'python:3.9'
    entrypoint: 'bash'
    args:
      - '-c'
      - |
        pip install -r requirements-test.txt
        pytest tests/
  
  # Trigger training pipeline
  - name: 'gcr.io/cloud-builders/gcloud'
    args:
      - 'ai'
      - 'pipelines'
      - 'run'
      - '--display-name=training-$COMMIT_SHA'
      - '--template-path=gs://bucket/pipeline.json'
      - '--parameter-values=image_uri=gcr.io/$PROJECT_ID/trainer:$COMMIT_SHA'

# Automated retraining trigger
from google.cloud import scheduler_v1

# Create scheduled job for retraining
scheduler_client = scheduler_v1.CloudSchedulerClient()
parent = 'projects/my-project/locations/us-central1'
job = scheduler_client.create_job(
    parent=parent,
    job={
        # Job names must be full resource paths
        'name': f'{parent}/jobs/retrain-weekly',
        'schedule': '0 0 * * 0',  # Every Sunday
        'pubsub_target': {
            'topic_name': 'projects/my-project/topics/retrain-trigger',
            'data': b'{"pipeline": "training-pipeline"}'
        }
    }
)
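
On the receiving side, something has to decode the message the scheduler publishes. A minimal sketch of that step, assuming a Cloud Functions style event whose `data` field is base64-encoded JSON (the function name `parse_retrain_message` is hypothetical):

```python
import base64
import json


def parse_retrain_message(event: dict) -> str:
    """Return the pipeline name carried in a Pub/Sub-triggered event.

    Assumes the event looks like {"data": "<base64-encoded JSON>"}.
    """
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return payload["pipeline"]


# Simulate the message body used in the scheduler job above.
event = {
    "data": base64.b64encode(b'{"pipeline": "training-pipeline"}').decode("ascii")
}
pipeline_name = parse_retrain_message(event)
```

The returned name can then be mapped to a compiled pipeline template and submitted as a pipeline run.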

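# Model approval workflow
# deploy_to_staging() and create_approval_request() stand in for your own helpers
def on_training_complete(model, model_metrics, threshold):
    if model_metrics['accuracy'] > threshold: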
        # Auto-deploy to staging
        deploy_to_staging(model)
        
        # Notify for production approval
        create_approval_request(model)
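
To make the approval gate concrete, here is a self-contained sketch with in-memory stand-ins for the staging deployment and the approval queue. The `ApprovalLog` class, the model names, and the metric values are all hypothetical:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ApprovalLog:
    """In-memory stand-in for a deployment system and an approval queue."""
    staged: List[str] = field(default_factory=list)
    pending_approval: List[str] = field(default_factory=list)


def on_training_complete(model_name: str, metrics: Dict[str, float],
                         threshold: float, log: ApprovalLog) -> bool:
    """Stage the model and request approval only if accuracy beats the threshold."""
    if metrics["accuracy"] > threshold:
        log.staged.append(model_name)            # auto-deploy to staging
        log.pending_approval.append(model_name)  # human sign-off for production
        return True
    return False


log = ApprovalLog()
accepted = on_training_complete("model-a", {"accuracy": 0.93}, 0.90, log)
rejected = on_training_complete("model-b", {"accuracy": 0.85}, 0.90, log)
```

Only `model-a` reaches staging and the approval queue; production deployment still waits for a human decision.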

17. What is Generative AI on Vertex AI?

Generative AI Services:
+-- Gemini - Multimodal (text, images, audio, video)
+-- PaLM 2 - Text generation (legacy, superseded by Gemini)
+-- Imagen - Image generation
+-- Codey - Code generation (legacy, superseded by Gemini)
+-- Embeddings - Text embeddings

from vertexai.generative_models import GenerativeModel, Part

# Initialize Vertex AI
import vertexai
vertexai.init(project='my-project', location='us-central1')

# Gemini
model = GenerativeModel('gemini-pro')

# Text generation
response = model.generate_content('Explain machine learning in simple terms.')
print(response.text)

# Multimodal (with image)
model = GenerativeModel('gemini-pro-vision')
image = Part.from_uri('gs://bucket/image.jpg', mime_type='image/jpeg')
response = model.generate_content(['Describe this image:', image])

# Chat (multi-turn; use the text model, not the vision model loaded above)
chat = GenerativeModel('gemini-pro').start_chat()
response = chat.send_message('Hello!')
response = chat.send_message('Tell me more about that.')

# PaLM for embeddings
from vertexai.language_models import TextEmbeddingModel

model = TextEmbeddingModel.from_pretrained('textembedding-gecko')
embeddings = model.get_embeddings(['Hello world', 'Machine learning'])

for embedding in embeddings:
    print(len(embedding.values))  # 768 dimensions
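
Embeddings are usually compared with cosine similarity. A minimal sketch in plain Python, using toy 3-dimensional vectors in place of the 768-dimensional ones returned above:

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embedding.values lists
same_direction = cosine_similarity([1.0, 0.0, 0.0], [2.0, 0.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

With real embeddings, the same function can be called on `embeddings[0].values` and `embeddings[1].values`.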

# Tuning foundation models
from vertexai.language_models import TextGenerationModel

model = TextGenerationModel.from_pretrained('text-bison')
tuning_job = model.tune_model(
    training_data='gs://bucket/training_data.jsonl',
    train_steps=100,
    tuned_model_location='us-central1'
)

# Grounding with Google Search
from vertexai.language_models import GroundingSource

response = model.predict(
    'What is the latest news about AI?',
    grounding_source=GroundingSource.WebSearch()
)

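# RAG with Vertex AI Search (retrieval step)
from google.cloud import discoveryengine_v1

project = 'my-project'        # placeholder project ID
data_store = 'my-data-store'  # placeholder data store ID

client = discoveryengine_v1.SearchServiceClient()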
response = client.search(
    request={
        'serving_config': f'projects/{project}/locations/global/collections/default_collection/dataStores/{data_store}/servingConfigs/default_config',
        'query': 'What is our return policy?'
    }
)
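
Retrieval is only half of RAG; the retrieved text still has to be placed in the prompt sent to the model. A minimal sketch of that assembly step, where the snippets are hypothetical stand-ins for search results and `build_grounded_prompt` is an illustrative helper, not part of any SDK:

```python
from typing import List


def build_grounded_prompt(question: str, snippets: List[str],
                          max_snippets: int = 3) -> str:
    """Combine retrieved snippets and the user question into one prompt."""
    context = "\n".join(
        f"[{i}] {text}"
        for i, text in enumerate(snippets[:max_snippets], start=1)
    )
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical snippets standing in for Vertex AI Search results
snippets = [
    "Returns are accepted within 30 days.",
    "Refunds go to the original payment method.",
]
prompt = build_grounded_prompt("What is our return policy?", snippets)
```

The resulting string can be passed to `generate_content`; numbering the snippets makes it easier to ask the model to cite its sources.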

18. How do you handle model versioning?

Model Versioning:

from google.cloud import aiplatform

# Upload the first model (becomes version 1)
model_v1 = aiplatform.Model.upload(
    display_name='my_model',
    artifact_uri='gs://bucket/model_v1/',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest'
)

# Upload v2 as new version
model_v2 = aiplatform.Model.upload(
    display_name='my_model',
    parent_model=model_v1.resource_name,
    artifact_uri='gs://bucket/model_v2/',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest',
    version_aliases=['candidate']
)

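# List versions of the model
registry = model_v2.versioning_registry
versions = registry.list_versions()

# Set version aliases (managed through the model's version registry)
registry.add_version_aliases(['production', 'latest'], version=model_v2.version_id)
registry.remove_version_aliases(['candidate'], version=model_v2.version_id)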

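# Deploy specific version (endpoint is an existing aiplatform.Endpoint)
versioned_model = aiplatform.Model(model_name=f'{model_resource_name}@{version_id}')
endpoint.deploy(
    model=versioned_model,
    deployed_model_display_name='v2',
    traffic_percentage=100
)

# A/B testing with traffic split
# The first model on an empty endpoint must take 100% of traffic;
# deploying v2 at 10% then scales v1 down to 90%
endpoint.deploy(model_v1, traffic_percentage=100)
endpoint.deploy(model_v2, traffic_percentage=10)

# Promote v2 to 100%: shift traffic first, then undeploy v1
# (deployed model IDs come from endpoint.list_models())
endpoint.update(
    traffic_split={v1_deployed_id: 0, v2_deployed_id: 100}
)
endpoint.undeploy(deployed_model_id=v1_deployed_id)

# Delete old version only (Model.delete() targets the whole model)
model_v1.versioning_registry.delete_version(model_v1.version_id)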

# gcloud versioning
gcloud ai models upload \
    --region=us-central1 \
    --display-name=my_model \
    --parent-model=projects/my-project/locations/us-central1/models/123 \
    --artifact-uri=gs://bucket/model_v2 \
    --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest \
    --version-aliases=candidate
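
The A/B traffic split shown earlier depends on how existing shares shrink when a new deployment takes a percentage of traffic. The sketch below illustrates the idea in plain Python; it is an approximation for intuition, not the SDK's exact rounding logic, and `add_deployment` is a hypothetical helper:

```python
from typing import Dict


def add_deployment(split: Dict[str, int], new_id: str, new_pct: int) -> Dict[str, int]:
    """Give `new_id` a share of traffic and scale existing shares to fit.

    Existing shares (assumed to sum to 100) are scaled proportionally; any
    rounding remainder goes to the deployment holding the most traffic.
    """
    if not 0 <= new_pct <= 100:
        raise ValueError("new_pct must be between 0 and 100")
    if not split and new_pct != 100:
        raise ValueError("the first deployment must take 100% of traffic")
    remaining = 100 - new_pct
    scaled = {k: v * remaining // 100 for k, v in split.items()}
    if scaled:
        leftover = remaining - sum(scaled.values())
        largest = max(scaled, key=scaled.get)
        scaled[largest] += leftover
    scaled[new_id] = new_pct
    return scaled


split = add_deployment({}, "v1", 100)    # v1 takes all traffic
split = add_deployment(split, "v2", 10)  # v1 drops to 90, v2 gets 10
```

The invariant to remember in an interview is that the split across all deployed models on an endpoint always sums to 100.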

19. What are training best practices?

Training Best Practices:

1. Data Management
+-- Use BigQuery for large datasets
+-- Store in TFRecord for TensorFlow
+-- Version datasets
+-- Validate data quality

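# Data validation (file paths are placeholders)
import tensorflow_data_validation as tfdv

train_stats = tfdv.generate_statistics_from_csv(data_location='gs://bucket/train.csv')
schema = tfdv.infer_schema(train_stats)
new_stats = tfdv.generate_statistics_from_csv(data_location='gs://bucket/new.csv')
anomalies = tfdv.validate_statistics(statistics=new_stats, schema=schema)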

2. Reproducibility
+-- Fix random seeds
+-- Version code and data
+-- Log all parameters
+-- Use containers

# Reproducibility
import random
import numpy as np
import tensorflow as tf

SEED = 42
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)
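
The effect of fixing a seed can be checked without any ML framework. A small sketch using only the standard library (the helper `sample_run` is hypothetical):

```python
import random
from typing import List


def sample_run(seed: int, n: int = 5) -> List[float]:
    """Return n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.random() for _ in range(n)]


first = sample_run(42)
second = sample_run(42)
other = sample_run(43)
```

Two runs with the same seed produce identical sequences, while a different seed produces a different one. The same principle applies to NumPy and TensorFlow, although GPU kernels can still introduce nondeterminism that seeding alone does not remove.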

3. Efficient Training
+-- Use GPUs/TPUs appropriately
+-- Mixed precision training
+-- Data pipeline optimization
+-- Distributed training for large models

# Efficient data loading
dataset = tf.data.TFRecordDataset(files)
dataset = (dataset
    .shuffle(10000)
    .map(parse_fn, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(batch_size)
    .prefetch(tf.data.AUTOTUNE)
)

4. Checkpointing
# Save checkpoints
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath='gs://bucket/checkpoints/model-{epoch:02d}',
    save_freq='epoch'
)
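
Checkpoints only help if training can find the newest one after an interruption. A minimal sketch of that lookup, assuming checkpoint paths follow the `model-{epoch:02d}` pattern used above (`latest_checkpoint` is a hypothetical helper, and listing the bucket is left out):

```python
import re
from typing import List, Optional

# Matches the trailing epoch number in paths such as .../model-07
CHECKPOINT_PATTERN = re.compile(r"model-(\d+)$")


def latest_checkpoint(paths: List[str]) -> Optional[str]:
    """Return the checkpoint path with the highest epoch number, or None."""
    best_epoch, best_path = -1, None
    for path in paths:
        match = CHECKPOINT_PATTERN.search(path)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path


paths = [
    "gs://bucket/checkpoints/model-01",
    "gs://bucket/checkpoints/model-10",
    "gs://bucket/checkpoints/model-02",
]
resume_from = latest_checkpoint(paths)
```

Comparing epochs as integers avoids the string-ordering problems that appear once epoch numbers outgrow their zero padding.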

5. Cost Optimization
+-- Use preemptible/spot VMs
+-- Right-size instances
+-- Stop idle notebooks
+-- Monitor usage

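# Spot VM training (job is an aiplatform.CustomTrainingJob)
job.run(
    machine_type='n1-standard-8',
    accelerator_type='NVIDIA_TESLA_T4',
    accelerator_count=1,
    boot_disk_type='pd-ssd',
    boot_disk_size_gb=100,
    # Requires a recent google-cloud-aiplatform release
    scheduling_strategy=aiplatform.compat.types.custom_job.Scheduling.Strategy.SPOT
)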

6. Experiment Tracking
+-- Log all experiments
+-- Track metrics and parameters
+-- Compare runs
+-- Document findings
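
The tracking and comparison steps listed above can be sketched without any tracking service. The `Run` class, run names, and metric values below are hypothetical; in practice Vertex AI Experiments would store them:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Run:
    name: str
    params: Dict[str, float]
    metrics: Dict[str, float]


def best_run(runs: List[Run], metric: str, higher_is_better: bool = True) -> Run:
    """Pick the run with the best value of `metric`; runs missing it are ignored."""
    candidates = [r for r in runs if metric in r.metrics]
    if not candidates:
        raise ValueError(f"no run logged the metric {metric!r}")
    pick = max if higher_is_better else min
    return pick(candidates, key=lambda r: r.metrics[metric])


# Made-up values for illustration only
runs = [
    Run("run-1", {"lr": 0.01}, {"accuracy": 0.91, "loss": 0.30}),
    Run("run-2", {"lr": 0.001}, {"accuracy": 0.94, "loss": 0.22}),
]
winner = best_run(runs, "accuracy")
```

Storing parameters next to metrics is what makes the comparison useful: the winning run also tells you which hyperparameters produced it.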

20. How do you design ML architecture?

ML Architecture Patterns:

Real-time Prediction:
+-------------------------------------------------------------+
|                                                              |
|  +-------+  +---------+  +----------+  +-------------+    |
|  |Client |->| API     |->| Feature  |->| Endpoint    |    |
|  |       |  | Gateway |  | Store    |  | (Prediction)|    |
|  +-------+  +---------+  +----------+  +-------------+    |
|                              |                              |
|                         +----v----+                        |
|                         |Feature  |                        |
|                         |Pipeline |                        |
|                         +---------+                        |
+-------------------------------------------------------------+

Batch Prediction:
+-------------------------------------------------------------+
|                                                              |
|  +---------+  +---------+  +----------+  +-------------+  |
|  |BigQuery |->| Dataflow|->| Batch    |->|  BigQuery   |  |
|  | (Input) |  |         |  | Predict  |  |  (Output)   |  |
|  +---------+  +---------+  +----------+  +-------------+  |
|                                                              |
+-------------------------------------------------------------+

Continuous Training:
+-------------------------------------------------------------+
|                                                              |
|  +---------+  +---------+  +---------+  +-------------+   |
|  | Data    |->| Feature |->| Train   |->|  Evaluate   |   |
|  | Source  |  | Eng.    |  | Model   |  |             |   |
|  +---------+  +---------+  +---------+  +------+------+   |
|                                                  |          |
|                              +-------------------+          |
|                              |                              |
|                         +----v----+                        |
|                         | Deploy  |                        |
|                         | if Pass |                        |
|                         +----+----+                        |
|                              |                              |
|                         +----v----+                        |
|                         | Monitor |                        |
|                         +----+----+                        |
|                              | (Drift detected)            |
|                              +----------> (Retrain)        |
+-------------------------------------------------------------+
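
The "Drift detected" arrow in the diagram needs a concrete test. One simple option is to measure how far the mean of a feature has moved, in units of the baseline standard deviation. This is a sketch of that heuristic only; the threshold of 2.0 and the sample values are assumptions, and production monitoring typically uses richer distribution distances:

```python
import statistics
from typing import Sequence


def mean_shift(baseline: Sequence[float], current: Sequence[float]) -> float:
    """Shift of the current mean from the baseline mean, in baseline std devs."""
    base_std = statistics.pstdev(baseline)
    if base_std == 0:
        raise ValueError("baseline has zero variance")
    return abs(statistics.fmean(current) - statistics.fmean(baseline)) / base_std


def should_retrain(baseline: Sequence[float], current: Sequence[float],
                   threshold: float = 2.0) -> bool:
    """Flag drift when the mean moves more than `threshold` std devs."""
    return mean_shift(baseline, current) > threshold


# Made-up feature values for illustration
baseline = [10.0, 12.0, 11.0, 9.0, 10.0, 11.0]
stable = [10.5, 11.0, 9.5, 10.0]
shifted = [15.0, 16.0, 14.5, 15.5]
```

Here `should_retrain(baseline, stable)` stays false while `should_retrain(baseline, shifted)` fires, which is the signal that would publish a retraining message.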

Example Pipeline Code:

from kfp import dsl

# The *_component functions are assumed to be KFP components defined elsewhere
@dsl.pipeline(name='ml-cicd-pipeline')
def ml_pipeline():
    # Data validation
    validate = data_validation_component(
        dataset_uri='gs://bucket/data.csv'
    )
    
    # Feature engineering
    features = feature_engineering_component(
        input_data=validate.outputs['validated_data']
    ).after(validate)
    
    # Training
    train = training_component(
        training_data=features.outputs['features'],
        hyperparameters={'lr': 0.01}
    ).after(features)
    
    # Evaluation
    evaluate = evaluation_component(
        model=train.outputs['model'],
        test_data=features.outputs['test_features']
    ).after(train)
    
    # Conditional deployment
    with dsl.Condition(evaluate.outputs['accuracy'] > 0.9):
        deploy = deployment_component(
            model=train.outputs['model'],
            endpoint_name='production'
        )
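
The execution order of the pipeline above follows from its dependencies, not from the order of the lines. A small sketch using the standard library's `graphlib` (Python 3.9+) shows how that order is derived; the step names mirror the pipeline, and the dependency map is written by hand here for illustration:

```python
from graphlib import TopologicalSorter

# step -> set of steps whose outputs it consumes
dependencies = {
    "validate": set(),
    "features": {"validate"},
    "train": {"features"},
    "evaluate": {"train", "features"},
    "deploy": {"evaluate", "train"},
}

# static_order() yields each step only after all of its dependencies
order = list(TopologicalSorter(dependencies).static_order())
```

Because every step consumes an output of an earlier one, the explicit `.after()` calls in the pipeline are redundant but harmless; the orchestrator would infer the same order from the data dependencies.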
