AWS Certified AI Practitioner (AIF-C01) - Practice Test 2

Your Progress

0 / 50

Question 1 MEDIUM

A retail company has a large catalog of product descriptions stored in Amazon S3 and needs to run sentiment analysis on customer reviews at the end of each business day. Which inference type best meets these requirements?

Real-time inference
Streaming inference
Batch transform
Asynchronous inference

Batch transform is the optimal choice for scheduled, bulk processing tasks. It reads data from S3, processes it all at once without requiring a persistent endpoint, and writes results back to S3 -- making it cost-effective for end-of-day scheduled workloads. Real-time inference (A) keeps an always-on endpoint for immediate, low-latency responses, which is wasteful for scheduled jobs. Streaming inference (B) handles continuous data streams, not scheduled batches. Asynchronous inference (D) is designed for large single-request payloads with near-real-time responses, not bulk scheduled processing. See more: Amazon SageMaker

Question 2 MEDIUM

A medical research company wants to extract medication names, dosages, and patient conditions from thousands of clinical notes stored in Amazon S3. Which AWS service should the company use?

Amazon Kendra
Amazon Comprehend Medical
Amazon Transcribe Medical
Amazon Lex

Amazon Comprehend Medical is a HIPAA-eligible NLP service purpose-built to extract medical entities such as medications, dosages, diagnoses, tests, and procedures from unstructured clinical text. It also identifies Protected Health Information (PHI). Amazon Kendra (A) is an intelligent enterprise search service, not an NLP extraction tool. Amazon Transcribe Medical (C) converts medical speech to text -- it doesn't analyze existing text documents. Amazon Lex (D) builds conversational chatbots and cannot perform medical entity extraction. See more: AWS Managed AI Services

Question 3 MEDIUM

A data science team has a collection of thousands of unlabeled product images stored in Amazon S3. The team wants to use Amazon Bedrock to classify these images into product categories using batch processing to keep costs low. Which AWS compute service is most appropriate?

Amazon EC2
AWS Lambda
AWS Batch
Amazon ECS

AWS Batch is a fully managed service designed specifically for batch computing workloads. It automatically provisions the right amount of compute resources, manages job queues, and scales dynamically -- all without manual infrastructure management. This makes it ideal for cost-optimized batch image classification. Amazon EC2 (A) requires manual instance management, scaling, and job scheduling. AWS Lambda (B) has a 15-minute execution limit and limited compute, making it unsuitable for large batch jobs. Amazon ECS (D) requires managing container clusters and adds operational overhead. See more: AWS Cloud Computing

Question 4 MEDIUM

A law firm wants to build a chatbot that answers questions about its internal legal policies, billing procedures, and office locations. The chatbot must generate responses grounded in the firm's actual documents rather than general knowledge. Which generative AI approach best meets these requirements?

Zero-shot prompting
Model pre-training
Model fine-tuning
Retrieval Augmented Generation (RAG)

RAG dynamically retrieves relevant content from the firm's actual documents before generating a response, ensuring answers are grounded in real data. This also handles updates easily -- when policies change, only the document store needs updating, not the model. Zero-shot prompting (A) uses the model's built-in knowledge with no examples, but cannot pull from the firm's private documents. Model pre-training (B) requires training from scratch on massive data and is extremely expensive. Fine-tuning (C) embeds knowledge into model weights, which becomes stale when documents change and can still hallucinate details. See more: Bedrock & Generative AI

Question 5 EASY

Which term describes the maximum amount of text -- including both input and output -- that a large language model can process in a single request?

Temperature
Top P
Context window
Embedding

The context window (measured in tokens) defines the total amount of text an LLM can hold and process at one time, including the system prompt, conversation history, and generated output. For example, a model with a 200K token context window can process roughly 150,000 words. Temperature (A) controls randomness in output generation. Top P (B) is nucleus sampling, which restricts token selection to the most probable candidates. Embeddings (D) are dense numerical vector representations of text used for semantic search and similarity -- not a limit on processing capacity. See more: Bedrock & Generative AI

Question 6 MEDIUM

A news aggregation company wants to automatically detect the sentiment of articles about companies and generate concise summaries based on that sentiment. Which combination of AWS services meets these requirements? (Select TWO.)

Select all that apply

Amazon Comprehend analyzes text to detect sentiment (positive, negative, neutral, or mixed), meeting the sentiment detection requirement. Amazon Bedrock provides access to foundation models capable of generating accurate, contextually relevant summaries based on the detected sentiment. Amazon Transcribe (A) converts speech to text and cannot perform sentiment analysis. Amazon Rekognition (C) analyzes images and videos, not articles. Amazon Polly (E) converts text to speech -- neither can perform NLP analysis or text generation. See more: Bedrock & Generative AI

Question 7 EASY

An AI practitioner wants to make the outputs of a large language model more focused and deterministic, reducing random variation in responses. Which model parameter should the practitioner adjust?

Increase the context window
Decrease the temperature
Increase the number of epochs
Increase Top K

Temperature controls the randomness of token selection during generation. Decreasing temperature toward 0 makes the model select the highest-probability tokens more consistently, producing focused, deterministic, and repeatable outputs. Increasing the context window (A) allows more text to be processed but doesn't affect output randomness. Number of epochs (C) is a training hyperparameter, irrelevant during inference. Increasing Top K (D) actually expands the pool of candidate tokens, which increases output variation rather than reducing it. See more: Prompt Engineering

Question 8 EASY

A data engineering team needs to extract data from multiple relational databases, apply transformations, and load the processed data into Amazon S3 for machine learning. Which AWS service is most appropriate for this ETL workflow?

Amazon QuickSight
Amazon Athena
AWS Glue
Amazon Redshift

AWS Glue is a fully managed, serverless ETL service specifically designed for extracting, transforming, and loading data. It uses crawlers to automatically discover schemas, a centralized data catalog for metadata management, and Apache Spark-based transformation jobs -- all without managing infrastructure. Amazon QuickSight (A) is a BI and visualization tool for creating dashboards, not ETL. Amazon Athena (B) queries data in place using SQL but does not move or transform it. Amazon Redshift (D) is a data warehouse optimized for analytics, not a general-purpose ETL tool. See more: AWS Cloud Computing

Question 9 EASY

A company has deployed a customer service chatbot using Amazon Bedrock. The company wants to monitor how quickly the chatbot delivers responses to users in production. Which metric should the company track?

Loss
F1 score
Inference latency
Training accuracy

Inference latency measures the time from when a user submits a query to when the chatbot delivers its response. This is the critical performance metric for real-time, user-facing applications -- directly impacting user experience and satisfaction. Loss (A) measures how well the model learned during training and is irrelevant in production. F1 score (B) evaluates classification accuracy, not response speed. Training accuracy (D) measures correctness during model training, which is separate from production performance. See more: Bedrock & Generative AI

Question 10 MEDIUM

Which Amazon Bedrock feature enables a foundation model to automatically plan and execute a multi-step workflow -- such as querying a database, processing the results, and sending an email -- without human intervention?

Amazon Bedrock Guardrails
Amazon Bedrock Agents
Amazon Bedrock Knowledge Bases
Amazon Bedrock Model Evaluation

Amazon Bedrock Agents orchestrate multi-step workflows by breaking complex tasks into sub-steps, invoking external APIs through action groups, accessing knowledge bases, and maintaining context across interactions. They enable autonomous task execution without human intervention at each step. Guardrails (A) apply safety filters to inputs and outputs but don't execute tasks. Knowledge Bases (C) provide RAG-based data retrieval but cannot orchestrate multi-step actions. Model Evaluation (D) assesses model quality using test prompts and does not run workflows. See more: Bedrock & Generative AI

Question 11 EASY

A developer wants to control the style, format, and tone of responses from a deployed large language model without retraining the model. What is the most appropriate approach?

Adjust the learning rate
Increase the number of training epochs
Engineer the system and user prompts
Expand the model's context window

Prompt engineering -- crafting precise system prompts, user instructions, role definitions, and format specifications -- is the primary mechanism for controlling LLM output style, tone, and format at inference time without any retraining. Learning rate (A) is a training hyperparameter that controls how fast model weights update during training. Number of epochs (B) determines how many times the model trains over the dataset -- also a training concept. Context window (D) is a fixed model property defining max input size, not output style. See more: Prompt Engineering

Question 12 EASY

Which term describes the phase when a trained machine learning model analyzes new, previously unseen input data to produce predictions or outputs?

Pre-training
Fine-tuning
Inference
Validation

Inference is the production phase where a trained model receives new, unseen input data and generates predictions, classifications, or generated content. It occurs after training is complete. Pre-training (A) is the initial large-scale training of a foundation model on broad datasets. Fine-tuning (B) adapts an already-trained model to a specialized task using additional task-specific data. Validation (D) is a process during training that evaluates model performance on a held-out dataset to prevent overfitting -- not a production prediction phase. See more: AI/ML Fundamentals

Question 13 EASY

A company wants to automatically route incoming support tickets to the correct department based on the content of the message (e.g., billing, technical, returns). Which type of machine learning is most appropriate?

Anomaly detection
Clustering
Text classification using NLP
Linear regression

Text classification using NLP assigns predefined category labels to text inputs. In this case, a classifier trained on historical tickets learns to map message content to departments (billing, technical, returns). AWS services like Amazon Comprehend support custom classification. Anomaly detection (A) identifies unusual patterns rather than categorizing into predefined classes. Clustering (B) groups similar items without predefined labels -- it's unsupervised and wouldn't use known department categories. Linear regression (D) predicts continuous numerical values, not discrete categories. See more: AI/ML Fundamentals

Question 14 EASY

A manufacturing company wants to inspect product images on an assembly line and detect the exact location and type of defects in each image. Which type of ML model best meets this requirement?

Speech recognition
Text generation
Image segmentation
Object detection

Object detection models identify and locate specific objects within images by drawing bounding boxes around them and classifying each detected object. For defect detection on an assembly line, object detection can locate and classify defects (scratch, dent, crack) with precise coordinates. Amazon Rekognition provides pre-built object detection capabilities. Image segmentation (C) classifies each pixel but is typically more complex than needed for locating defects. Speech recognition (A) converts audio to text and Text generation (B) produces text output -- neither applies to image defect detection. See more: AWS Managed AI Services

Question 15 MEDIUM

A company is fine-tuning a large language model for hiring decisions and wants to ensure the model does not exhibit gender or racial bias in its recommendations. Which actions should the company take? (Select TWO.)

Select all that apply

Reduce the model's training time
Audit the training data for demographic imbalances
Lower the temperature parameter
Evaluate model outputs across diverse demographic groups
Increase the number of model layers

Auditing training data for demographic imbalances (using tools like Amazon SageMaker Clarify) identifies skewed representation before the model learns from biased patterns. Evaluating outputs across diverse demographic groups tests whether the model treats different groups equitably in actual recommendations. Reducing training time (A) might cause underfitting but doesn't remove bias. Lowering temperature (C) reduces randomness but doesn't change the underlying biased patterns the model learned. Adding layers (E) increases model capacity but doesn't address data-level bias. See more: AI Challenges & Responsibilities

Question 16 MEDIUM

A developer is building a multi-step document processing pipeline where a large language model first translates a document, then a second call summarizes the translation, and a third call generates action items from the summary. Which prompting technique describes this approach?

Zero-shot prompting
Few-shot prompting
Negative prompting
Prompt chaining

Prompt chaining connects multiple LLM calls sequentially, passing the output of one prompt as the input to the next. Each step in the chain builds on previous results: translate -> summarize -> generate action items. This enables complex multi-step reasoning that a single prompt cannot achieve. Zero-shot prompting (A) uses a single prompt with no examples. Few-shot prompting (B) provides examples within a single prompt. Negative prompting (C) instructs the model what to avoid in a single prompt -- none of these chain outputs between separate LLM calls. See more: Prompt Engineering

Question 17 HARD

A biotech company has deployed a foundation model from Amazon Bedrock to answer questions about pharmaceutical research. Despite extensive prompt engineering, the model struggles with highly specialized drug interaction terminology. What is the best solution to improve performance?

Switch to a larger general-purpose model.
Use continued pre-training or domain adaptation fine-tuning on pharmaceutical literature.
Increase the temperature parameter.
Remove technical terminology from user queries.

Domain adaptation fine-tuning trains the model on domain-specific pharmaceutical text, teaching it the specialized vocabulary, abbreviations, and concepts used in drug research. Since prompt engineering has already been exhausted, the model fundamentally lacks the domain knowledge -- fine-tuning addresses the root cause. Switching to a larger general model (A) may help marginally but won't solve specialized terminology gaps. Increasing temperature (C) adds randomness, which could make the model less reliable with technical content. Removing technical terms (D) would eliminate the critical information needed to answer pharmaceutical questions accurately. See more: Bedrock & Generative AI

Question 18 MEDIUM

A team has built a binary classification model to detect fraudulent transactions. The team wants to measure how well the model distinguishes between fraud and non-fraud cases. Which evaluation approach is most appropriate?

Mean squared error (MSE)
R2 score
Confusion matrix with precision and recall
Correlation matrix

A confusion matrix captures true positives (fraud correctly flagged), false positives (legitimate transactions wrongly flagged), true negatives, and false negatives (missed fraud). From it, precision and recall are derived, which are critical for fraud detection -- especially given class imbalance where fraud cases are rare. High recall minimizes missed fraud; high precision minimizes false alarms. MSE (A) and R2 (B) are regression metrics for continuous predictions, not binary classification. A correlation matrix (D) shows linear relationships between numerical variables, not classification outcomes. See more: AI/ML Fundamentals

Question 19 MEDIUM

A company wants to use an Amazon Bedrock foundation model to answer questions using only information from the company's proprietary internal wiki and policy documents. Which approach should the company implement?

Adjust the model's inference parameters
Enable model invocation logging
Build an Amazon Bedrock knowledge base connected to the internal documents
Switch to a different foundation model

Amazon Bedrock Knowledge Bases implement RAG by ingesting private documents, chunking and embedding the content into a vector store, and dynamically retrieving relevant passages at inference time to ground responses in actual company data. This keeps the model's answers accurate and up-to-date with internal content. Adjusting inference parameters (A) affects output style but doesn't connect the model to private documents. Invocation logging (B) records API calls for monitoring, not knowledge integration. Switching models (D) changes the foundation model but doesn't solve the knowledge access problem. See more: Bedrock & Generative AI

Question 20 EASY

A team is fine-tuning a foundation model using Amazon Bedrock and needs to upload a 50,000-row validation dataset so Bedrock can evaluate the model after training. Which AWS service should the team use to store this dataset?

Amazon Elastic File System (Amazon EFS)
Amazon S3
Amazon DynamoDB
Amazon Elastic Block Store (Amazon EBS)

Amazon S3 is the only AWS storage service that Amazon Bedrock integrates with for training and validation datasets. Bedrock reads data directly from S3 buckets in supported formats such as JSONL. Amazon EFS (A) is a managed file system for EC2 and Lambda -- Bedrock cannot directly read from it. Amazon DynamoDB (C) is a NoSQL database for low-latency key-value lookups, not a dataset repository for Bedrock. Amazon EBS (D) provides block-level storage attached to EC2 instances and is not accessible to Bedrock. See more: Bedrock & Generative AI

Question 21 MEDIUM

A company is deploying an AI-powered medical documentation assistant using a fine-tuned model on Amazon SageMaker JumpStart. The solution must comply with HIPAA and SOC 2 standards. Which two capabilities are most relevant for demonstrating regulatory compliance? (Select TWO.)

Select all that apply

Data protection and encryption are core requirements of HIPAA (protecting PHI) and SOC 2 (security principle). This includes KMS encryption, access controls, and data handling policies. Threat detection and monitoring using services like Amazon GuardDuty and AWS CloudTrail fulfills audit and security monitoring requirements mandated by both HIPAA and SOC 2. Auto scaling (A) and cost tags (B) are operational/financial practices, not compliance evidence. Microservices architecture (E) is a design pattern, not a compliance control. See more: AI Challenges & Responsibilities

Question 22 MEDIUM

A government agency is deploying a generative AI application in a VPC with strict network policies that prohibit all outbound internet traffic. The application needs to call Amazon Bedrock APIs. Which solution satisfies this requirement?

Configure a NAT Gateway
Use AWS PrivateLink to create a VPC interface endpoint for Amazon Bedrock
Use Amazon CloudFront as a proxy
Configure a VPC peering connection

AWS PrivateLink creates a VPC interface endpoint that enables private, direct communication between the VPC and Amazon Bedrock entirely over the AWS private network -- with zero internet traffic. This is the standard solution for VPCs with internet restrictions. A NAT Gateway (A) routes traffic through the public internet via an Elastic IP, violating the no-internet policy. Amazon CloudFront (C) is an internet-facing CDN that would introduce internet traffic. VPC peering (D) connects two VPCs together but does not provide access to AWS services like Bedrock. See more: AWS Security Services

Question 23 EASY

A company receives thousands of social media mentions every day and wants to identify emerging trends, recurring themes, and frequently discussed topics from this text data. Which AWS service best meets this requirement?

Amazon Transcribe
Amazon Polly
Amazon Comprehend
Amazon Rekognition

Amazon Comprehend is an NLP service that provides key phrase extraction, entity recognition, topic modeling, and sentiment analysis across large volumes of text. Its topic modeling capability clusters similar posts to reveal recurring themes and trends in social media data. Amazon Transcribe (A) converts speech audio to text -- it doesn't process existing text. Amazon Polly (B) is a text-to-speech service. Amazon Rekognition (D) analyzes images and videos, not text content. See more: AWS Managed AI Services

Question 24 EASY

A team has trained an image classification model to categorize satellite images into land-use types. How should the team evaluate whether the model meets their accuracy requirements?

Count the total number of parameters in the model.
Test the model's predictions against a labeled benchmark dataset.
Measure the time it takes to process one image.
Calculate the storage cost of the model artifacts.

Evaluating model accuracy requires comparing its predictions against ground truth labels in a held-out benchmark dataset. Metrics like accuracy, precision, recall, and F1 score quantify the classification quality. Using a standard benchmark also enables fair comparison across models. Parameter count (A) reflects model size and complexity, not predictive performance. Processing time (C) measures computational efficiency, not classification accuracy. Storage cost (D) is a financial consideration, unrelated to model quality. See more: AI/ML Fundamentals

Question 25 MEDIUM

A marketing team wants to select a foundation model from Amazon Bedrock based on which model produces content that best matches their brand's creative writing style. What is the most effective evaluation approach?

Select the model with the highest score on public NLP benchmarks.
Choose the model with the lowest inference latency.
Have human reviewers evaluate model outputs using brand-specific prompt datasets.
Select the model with the most parameters.

Creative writing style alignment is a subjective, brand-specific quality that cannot be measured by generic benchmarks. Amazon Bedrock's human evaluation workflow allows internal reviewers to rate model outputs based on brand voice, tone, and quality using prompts drawn from real marketing use cases. Public benchmarks (A) rank models on general language tasks, not brand-specific style. Lowest latency (B) optimizes speed, not creative quality. Most parameters (D) indicates model capacity but doesn't predict which model best fits a specific brand's voice. See more: Bedrock & Generative AI

Question 26 EASY

A company deployed an AI assistant to help field agents answer customer insurance policy questions faster. The company wants to quantify the productivity improvement. Which business metric is most relevant for evaluating this impact?

Employee satisfaction score
Average time to resolve customer inquiries
Number of model parameters
Website session duration

Average time to resolve customer inquiries directly measures how much faster agents answer questions with AI assistance -- the core productivity metric. If the AI assistant reduces resolution time, it demonstrates a concrete productivity gain. Employee satisfaction (A) is a useful HR metric but doesn't directly quantify the AI's operational impact on productivity. Number of model parameters (C) is a technical attribute unrelated to business outcomes. Website session duration (D) measures online engagement, not agent efficiency. See more: AI/ML Fundamentals

Question 27 HARD

An AI engineer fine-tuned a model on Amazon Bedrock using a dataset that accidentally included employees' personal identification numbers (PINs) and account numbers. What is the correct remediation?

Apply a post-processing filter to mask sensitive numbers in all responses.
Delete the model, scrub the sensitive data from the training dataset, and retrain.
Encrypt the model weights using AWS KMS to protect the embedded data.
Apply Amazon Bedrock Guardrails to block numeric outputs.

When sensitive data is present in training data, it becomes encoded in the model's weights and may be reproduced in responses. The only reliable fix is to delete the compromised model, remove all sensitive data from the training dataset, and retrain from scratch on the clean dataset. Post-processing filters (A) might catch some outputs but cannot guarantee the model won't expose sensitive data in novel ways. Encrypting weights (C) secures the model file but doesn't prevent the model from outputting sensitive information during inference. Guardrails (D) provide a partial defense but cannot comprehensively block all sensitive numeric patterns. See more: AI Challenges & Responsibilities

Question 28 MEDIUM

A company has deployed a foundation model as a customer-facing assistant. The company wants to prevent adversarial users from using prompt injection attacks to override the model's safety instructions. Which action most effectively reduces this risk?

Decrease the model's temperature to 0.
Use a system prompt with explicit instructions to detect and resist override attempts.
Limit the maximum number of output tokens.
Use only models listed in Amazon SageMaker JumpStart.

A hardened system prompt that instructs the model to recognize and reject prompt injection patterns, ignore instructions to override guidelines, and refuse requests that attempt to change its behavior is the most direct defense against adversarial prompting. Decreasing temperature (A) makes outputs more deterministic but doesn't remove the model's susceptibility to well-crafted adversarial inputs. Limiting output tokens (C) restricts response length but doesn't prevent the model from being manipulated. Restricting to SageMaker JumpStart models (D) limits model choice but doesn't address prompt-level security. See more: AI Challenges & Responsibilities

Question 29 EASY

A startup has a pre-trained vision model for recognizing everyday objects. The company wants to adapt this model to identify specific industrial equipment parts without training from scratch. Which approach should the company use?

Use unsupervised clustering.
Increase the learning rate for all layers.
Apply transfer learning.
Reduce the number of model parameters.

Transfer learning takes a model that has already learned general features (edges, textures, shapes) from broad training and adapts it to a new related task -- in this case, industrial equipment recognition -- using a much smaller labeled dataset. This avoids training from scratch, saving significant time and compute. Unsupervised clustering (A) groups similar items without labels and is not a model adaptation strategy. Increasing the learning rate indiscriminately (B) can destabilize training and destroy previously learned features. Reducing parameters (D) simplifies the model architecture but doesn't leverage pre-trained knowledge. See more: AI/ML Fundamentals

Question 30 MEDIUM

Using the Generative AI Security Scoping Matrix, which deployment scenario gives the company the LEAST ownership of security responsibilities?

Building and training a custom generative AI model from scratch using proprietary data.
Fine-tuning an existing third-party foundation model with company-specific data.
Building an application that calls a third-party foundation model API.
Using a SaaS product with embedded generative AI features managed by a vendor.

Using a fully managed SaaS product with built-in AI features means the vendor controls the model, infrastructure, data handling, and security layers -- leaving the company with minimal security responsibilities. This represents the least ownership. Building from scratch (A) gives maximum ownership; the company is responsible for everything from data to infrastructure. Fine-tuning (B) shares responsibility with the FM provider. Building on an FM API (C) offloads model security but retains application-level responsibilities. Least to most ownership: SaaS -> API -> Fine-tuning -> From scratch. See more: AI Challenges & Responsibilities

Question 31 MEDIUM

A machine learning team wants to quickly deploy a pre-trained large language model within their AWS VPC for a secure, low-latency internal application. The team does not want to manage Kubernetes or container orchestration. Which AWS service best meets these requirements?

Amazon Personalize
Amazon EKS
Amazon SageMaker JumpStart with SageMaker endpoints
PartyRock

Amazon SageMaker JumpStart offers one-click deployment of pre-trained foundation models, and SageMaker endpoints host those models within your VPC for secure, managed, low-latency inference without managing any container orchestration infrastructure. Amazon Personalize (A) is a recommendation service for personalization use cases, not general FM deployment. Amazon EKS (B) requires managing Kubernetes clusters -- adding significant operational overhead. PartyRock (D) is a public, browser-based playground for generative AI exploration, not suitable for VPC-based production deployment. See more: Amazon SageMaker

Question 32 EASY

A developer uses few-shot prompting with a base model on Amazon Bedrock, currently including 12 examples in each prompt. The model is called once per week and performs well. The company wants to reduce monthly costs. Which change will most directly reduce costs?

Purchase Provisioned Throughput for the model.
Fine-tune the model to internalize the examples.
Reduce the number of few-shot examples to lower the token count per request.
Enable model invocation logging.

Amazon Bedrock charges per input and output token with On-Demand pricing. Reducing the number of few-shot examples (e.g., from 12 to 4) directly shrinks the input token count per request, lowering the cost of each invocation. Since the model already performs well, fewer examples may still yield acceptable quality. Provisioned Throughput (A) guarantees dedicated capacity with hourly commitments -- far more expensive for a weekly invocation. Fine-tuning (B) incurs training costs and requires Provisioned Throughput, which is cost-prohibitive for a once-weekly call. Invocation logging (D) records API calls but doesn't affect cost. See more: Bedrock & Generative AI

Question 33 EASY

A call center wants to analyze recorded customer phone calls to extract complaints, identify product names mentioned, and detect customer sentiment. Which combination of services provides the most direct path to these insights?

Amazon Lex followed by Amazon QuickSight.
Amazon Transcribe to convert audio to text, then Amazon Comprehend for NLP analysis.
Amazon Rekognition followed by Amazon Comprehend.
Amazon Polly followed by Amazon Translate.

Amazon Transcribe converts recorded call audio into text transcripts (with support for speaker diarization and custom vocabularies). Amazon Comprehend then analyzes the transcripts to extract key phrases, detect named entities (product names), and identify sentiment -- providing all the requested insights. Amazon Lex (A) builds chatbots for interactive conversations, not call recording analysis. Amazon Rekognition (C) analyzes images and video, not audio. Amazon Polly (D) generates speech from text -- the opposite of what's needed. See more: AWS Managed AI Services

Question 34 MEDIUM

An AI company undergoes periodic security audits from third-party auditors. The company needs to receive automated email notifications when new audit reports or compliance certifications become available in AWS. Which AWS service provides access to these compliance documents?

AWS Trusted Advisor
Amazon Inspector
AWS Artifact
AWS Security Hub

AWS Artifact is the central self-service portal for accessing AWS compliance reports, certifications (SOC, PCI, ISO), and security documents from third-party auditors and ISVs. It provides on-demand document access and notification capabilities when new reports become available. AWS Trusted Advisor (A) provides best-practice recommendations for cost, security, and performance -- not compliance documentation. Amazon Inspector (B) scans workloads for software vulnerabilities and network exposures, not compliance report distribution. AWS Security Hub (D) aggregates security findings across services but doesn't distribute third-party compliance reports. See more: AWS Security Services

Question 35 EASY

A team is evaluating a named entity recognition model and wants a single metric that balances both the percentage of correct predictions and the percentage of actual entities the model successfully found. Which metric should the team use?

Inference latency
Training loss
F1 score
R2 score

The F1 score is the harmonic mean of precision (percentage of predicted entities that are correct) and recall (percentage of actual entities correctly identified). It balances both concerns -- important for NER where you want to minimize both false positives and missed entities. A model with high precision but low recall (finds few entities) or high recall but low precision (generates many false positives) would have a low F1 score. Inference latency (A) measures speed. Training loss (B) measures learning progress during training. R2 score (D) evaluates regression models, not classification/extraction tasks. See more: AI/ML Fundamentals

Question 36 MEDIUM

An editor has scanned thousands of historical documents, but OCR errors have left many words incomplete or corrupted. The company wants an ML model to predict the most likely original word based on surrounding context. Which type of model is best suited for this task?

Time series forecasting model
BERT-based masked language model
Generative adversarial network (GAN)
Clustering model

BERT was pre-trained using Masked Language Modeling (MLM), where random tokens are masked and the model learns to predict them from bidirectional context. This capability makes BERT-based models ideal for predicting corrupted or missing words in historical documents. Time series models (A) predict numerical values over time, not text words. GANs (C) generate new content (images, audio) but are not designed for in-context word prediction. Clustering models (D) group similar items without predicting specific values. See more: Bedrock & Generative AI

Question 37 EASY

A company built a multimodal AI assistant that responds to text questions with generated images. The company wants to automatically prevent the assistant from returning sexually explicit or violent images before they reach users. Which solution is most appropriate?

Implement Amazon Rekognition content moderation to filter images before delivery.
Retrain the model on a cleaner dataset.
Use a lower temperature setting during image generation.
Rely on user reports to flag inappropriate images after the fact.

Amazon Rekognition Content Moderation uses automated ML to detect explicit nudity, graphic violence, suggestive content, and other unsafe categories in images in real time, enabling filtering before images are delivered to users. Retraining on cleaner data (B) reduces unsafe output probability but cannot guarantee safety in production. Lower temperature (C) makes outputs more predictable but doesn't specifically prevent inappropriate content. Relying on user reports (D) is entirely reactive -- users have already been exposed to harmful content before any action is taken. See more: AWS Managed AI Services

Question 38 MEDIUM

A team trained a demand forecasting model that achieved excellent performance on the training dataset. After deploying to production, predictions for new time periods were significantly less accurate. What is the most likely cause, and what is the recommended fix?

Underfitting -- increase model complexity.
Overfitting -- increase the size and diversity of the training dataset.
Data leakage -- re-split training and test data.
High variance in hyperparameters -- tune the learning rate.

A model that performs well on training data but poorly in production is a classic sign of overfitting -- it memorized training patterns rather than learning generalizable relationships. The recommended fix is to increase training data volume and diversity so the model encounters more varied examples and learns broader patterns. Underfitting (A) causes poor performance on both training and production data -- not just production. Data leakage (C) causes artificially inflated training metrics from test data contamination, which is a different problem. Hyperparameter tuning (D) is a useful optimization step but doesn't address the fundamental cause of overfitting. See more: AI/ML Fundamentals

Question 39 EASY

A company wants to ensure its use of Amazon Bedrock foundation models follows AWS security best practices. Which combination of actions provides the most comprehensive security posture?

Design clear, specific prompts with safety instructions, and configure IAM roles with least privilege access.
Enable Amazon SageMaker Model Monitor and Amazon CloudWatch dashboards.
Enable automated model evaluation and use prompt templates from the AWS Samples library.
Use Provisioned Throughput and enable model invocation logging.

Two controls form the foundation of secure LLM use: clear, specific prompts with system-level safety instructions reduce the attack surface for prompt injection and unintended outputs. IAM roles configured with least privilege ensure only authorized principals can invoke Bedrock APIs, limiting exposure in the event of credential compromise. SageMaker Model Monitor (B) tracks model drift, not Bedrock security. Automated model evaluation (C) assesses quality, not security posture. Provisioned Throughput (D) manages capacity for cost and performance, not access security. See more: AI Challenges & Responsibilities

Question 40 EASY

An AI practitioner is generating marketing copy using a large language model. The generated text reads convincingly but includes invented statistics and false product claims. Which AI problem is occurring?

Overfitting
Hallucination
Underfitting
Data leakage

Hallucination occurs when an LLM generates text that sounds confident and coherent but contains fabricated or incorrect information -- invented statistics, false claims, or non-existent sources. The model produces plausible-sounding content because it generates based on statistical language patterns rather than factual grounding. Mitigation strategies include RAG, grounding with external knowledge, and human review. Overfitting (A) causes poor generalization to new data but doesn't explain confident factual errors. Underfitting (C) produces poor-quality outputs broadly. Data leakage (D) involves test data contaminating training, causing inflated performance metrics. See more: Bedrock & Generative AI

Question 41 EASY

In the context of large language models, what is a token?

A security credential used to authenticate API requests to an LLM provider.
A basic unit of text -- such as a word, subword, or character -- that an LLM processes as input or output.
A hyperparameter that controls how many words the model generates per second.
A numerical measurement of model computational efficiency.

Tokens are the atomic units of text that LLMs process. A tokenizer splits input text into tokens, which can be whole words, subwords (e.g., 'playing' -> 'play' + '##ing'), or individual characters depending on the model. English text averages approximately 1 token per 4 characters. Token counts determine context window limits and directly drive per-API-call pricing. Security credentials (A) are API keys or authentication tokens -- a different concept entirely. Generation speed (C) is measured in tokens per second. Computational efficiency (D) is measured in FLOPS or GPU utilization. See more: Bedrock & Generative AI

Question 42 MEDIUM

A city's parking enforcement agency uses an ML model that analyzes parking violation footage. Internal audits revealed the model issues disproportionately more citations to vehicles registered in lower-income zip codes. What type of issue is this?

Sampling bias in the model
Observer bias
Algorithmic bias
Measurement bias

Algorithmic bias occurs when the ML model itself produces systematically discriminatory outputs -- in this case, the model's learned decision boundaries disproportionately flag vehicles based on geographic proxies for socioeconomic status. This likely stems from biased historical enforcement patterns in the training data. Sampling bias (A) is a data collection problem where certain groups are over- or under-represented in the training sample. Observer bias (B) is a human cognitive bias affecting data collection by researchers. Measurement bias (D) results from flawed instruments or inconsistent measurement methodologies. See more: AI Challenges & Responsibilities

Question 43 MEDIUM

A financial services company is hosting a sensitive AI compliance application on Amazon Bedrock inside a VPC. Regulatory requirements mandate that no data can leave the AWS network boundary. Which solution ensures Amazon Bedrock API calls never traverse the public internet?

Amazon CloudFront distribution in front of Bedrock
Internet gateway with security group restrictions
VPC endpoint (interface endpoint) for Amazon Bedrock
AWS Direct Connect

A VPC interface endpoint for Amazon Bedrock (powered by AWS PrivateLink) creates a private connection that routes all Bedrock API traffic through the AWS internal network, never touching the public internet. This satisfies the regulatory requirement. Amazon CloudFront (A) is an internet-facing CDN, which would route traffic over the public internet. An internet gateway (B) explicitly enables internet access for the VPC, violating the requirement even with security group restrictions. AWS Direct Connect (D) provides private connectivity from on-premises data centers to AWS but doesn't isolate intra-AWS service communication from the internet. See more: AWS Security Services

Question 44 MEDIUM

A company uses Amazon SageMaker to develop ML models collaboratively across multiple data science teams. The teams need a centralized way to store, discover, and reuse computed features (such as customer lifetime value or rolling 30-day purchase counts) to avoid duplicate work. Which SageMaker capability meets this requirement?

Amazon SageMaker Clarify
Amazon SageMaker Data Wrangler
Amazon SageMaker Model Registry
Amazon SageMaker Feature Store

Amazon SageMaker Feature Store is a purpose-built repository for storing, discovering, and sharing ML features across teams. It provides an online store for real-time inference access (millisecond latency) and an offline store for batch training. Teams discover and reuse the same standardized features, ensuring consistency and eliminating redundant computation. SageMaker Clarify (A) detects bias in data and models and provides explainability reports, not feature management. Data Wrangler (B) helps with data preparation and transformation pipelines. Model Registry (C) manages model versions, metadata, and deployment approvals. See more: Amazon SageMaker

Question 45 MEDIUM

A digital devices company wants to predict customer demand for memory hardware. The company does not have coding experience or knowledge of ML algorithms and needs to develop a data-driven predictive model. The company needs to perform analysis on internal data and external data. Which solution will meet these requirements?

Store the data in Amazon S3. Create ML models and demand forecast predictions by using Amazon SageMaker built-in algorithms that use the data from Amazon S3.
Import the data into Amazon SageMaker Data Wrangler. Create ML models and demand forecast predictions by using SageMaker built-in algorithms.
Import the data into Amazon SageMaker Data Wrangler. Build ML models and demand forecast predictions by using an Amazon Personalize Trending-Now recipe.
Import the data into Amazon SageMaker Canvas. Build ML models and demand forecast predictions by selecting the values in the data from SageMaker Canvas.

Amazon SageMaker Canvas is a no-code ML tool with a visual, point-and-click interface that lets business users without coding or ML knowledge build predictive models. Users simply import data, select target columns, and Canvas automatically handles feature engineering, algorithm selection, and training. SageMaker built-in algorithms (A, B) require Python coding and ML knowledge. Data Wrangler (B, C) requires understanding of data transformations. Amazon Personalize (C) is for recommendation engines, not demand forecasting. See more: Amazon SageMaker

Question 46 EASY

A company needs to ensure their application remains available even if an entire AWS data center experiences a power failure. What is the MINIMUM architecture they should implement?

Deploy to a single large EC2 instance
Deploy across multiple Availability Zones within one Region
Deploy across multiple AWS Regions
Use Amazon CloudFront edge locations

Each Availability Zone (AZ) consists of one or more data centers with independent power, cooling, and networking. Deploying across multiple AZs within a single Region provides high availability and fault tolerance against data center failures. A single instance (A) has no redundancy. Multi-Region (C) is more than the minimum needed for a single data center failure. CloudFront (D) is for content caching, not application availability. See more: AWS Cloud Computing

Question 47 EASY

Which of the following AWS services is considered a GLOBAL service that is NOT tied to a specific AWS Region?

Amazon EC2
Amazon S3 buckets
AWS Identity and Access Management (IAM)
Amazon RDS

AWS IAM is a global service -- users, groups, roles, and policies created in IAM are available across all AWS Regions without replication. EC2 instances (A), S3 buckets (B), and RDS databases (D) are all regional services where resources exist only in the region where they were created. See more: AWS Cloud Computing

Question 48 EASY

A cloud engineer wants to ask 'What are my top 3 highest-cost AWS services this month?' and get an answer without writing any queries. Which AWS tool provides this capability?

AWS Cost Explorer dashboard
Amazon Q Developer in the AWS Console
AWS Budgets reports
Amazon CloudWatch billing metrics

Amazon Q Developer in the AWS Console can analyze your AWS bill and answer natural language questions about cost drivers, resource usage, and account state. It provides conversational access to account information without requiring knowledge of specific AWS tools or query syntax. Cost Explorer (A), Budgets (C), and CloudWatch (D) require manual navigation and configuration. See more: Amazon Q

Question 49 MEDIUM

A marketing manager with no coding skills wants to build an AI tool that generates social media posts using internal brand guidelines. Which AWS feature enables this?

Amazon Bedrock Playground
Amazon Q Apps
AWS Lambda with Bedrock API
PartyRock

Amazon Q Apps is a no-code GenAI application builder within Amazon Q Business. Non-technical employees can describe app requirements in natural language and Q Apps generates working applications connected to internal company data. Bedrock Playground (A) requires technical knowledge. Lambda (C) requires coding. PartyRock (D) doesn't connect to internal company data. See more: Amazon Q

Question 50 EASY

A company wants to automatically detect if any PII (personally identifiable information) exists in their S3 buckets before using the data to train ML models. Which AWS service should they use?

Amazon Comprehend
Amazon Macie
AWS Config
Amazon GuardDuty

Amazon Macie is a fully managed ML-powered service that scans S3 buckets to discover, classify, and alert on sensitive data -- particularly PII. It's the recommended pre-training step to ensure training data doesn't contain personal information. Comprehend (A) detects PII in text input, not S3 files. Config (C) monitors resource configuration. GuardDuty (D) detects security threats, not PII. See more: AWS Security Services

Take Practice Test 3 →

Search Tutorials

AWS Certified AI Practitioner (AIF-C01) - Practice Test 2

Your Progress

Popular Posts