AZ-305 - Design a Data Protection Strategy
1. Data Geo-Replication
Geo-replication ensures that your data is available across multiple Azure regions, protecting against regional outages and enabling low-latency reads for globally distributed users. Each Azure data service provides its own geo-replication mechanism.
Azure SQL Geo-Replication
Active Geo-Replication
Active geo-replication in Azure SQL Database creates continuously synchronized, readable secondary databases in the same or a different Azure region. Replication is asynchronous, so a secondary can lag slightly behind the primary; the RPO is less than 5 seconds.
You can configure up to four readable secondaries and use them for read-only query workloads, offloading reporting from the primary.
Auto-Failover Groups
Auto-failover groups build on active geo-replication and add automatic failover behind a single read-write listener endpoint. With the Microsoft-managed failover policy, an outage in the primary region triggers failover once the configured grace period expires, and client applications reconnect through the listener without code or connection string changes.
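As a minimal sketch of why clients do not need code changes, the function below builds the listener host names that an application would target instead of an individual server name. It assumes the `<failover-group-name>.database.windows.net` naming pattern used by Azure SQL Database failover groups; the group name `contoso-fog` is hypothetical.

```python
def failover_group_listeners(group_name: str) -> dict:
    """Return the listener host names for a failover group.

    The application connects to these names, not to a specific server,
    so the connection string stays the same after a failover.
    """
    if not group_name:
        raise ValueError("group_name must not be empty")
    return {
        # Always points at the current primary.
        "read_write": f"{group_name}.database.windows.net",
        # Points at a readable secondary for read-only workloads.
        "read_only": f"{group_name}.secondary.database.windows.net",
    }


# "contoso-fog" is a made-up failover group name used for illustration.
listeners = failover_group_listeners("contoso-fog")
print(listeners["read_write"])
print(listeners["read_only"])
```

Because the listener name is stable, only DNS changes during failover; the application keeps its existing connection string.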
Azure Cosmos DB Geo-Replication
Azure Cosmos DB supports turnkey global distribution. You can add or remove regions at any time without pausing the application. By default, one region accepts writes and the other regions serve reads; enabling multi-region writes (formerly called multi-master) lets every replicated region accept writes for low-latency access. The consistency model (Strong, Bounded Staleness, Session, Consistent Prefix, or Eventual) determines how quickly data converges across regions.
Azure Storage Geo-Replication
Azure Storage offers several redundancy options for geo-replication:
- GRS (Geo-Redundant Storage): Replicates data to a secondary region asynchronously. Data in the secondary region is not available for read access unless a failover occurs.
- RA-GRS (Read-Access Geo-Redundant Storage): Same as GRS but allows read-only access to data in the secondary region at all times.
- GZRS (Geo-Zone-Redundant Storage): Combines zone-redundant storage in the primary region with geo-replication to a secondary region.
- RA-GZRS (Read-Access Geo-Zone-Redundant Storage): Same as GZRS with read access to the secondary region.
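The four options above differ along two axes: whether the primary region is zone-redundant, and whether the secondary region is readable before failover. A small sketch of that decision, using a hypothetical helper name, might look like this:

```python
def choose_geo_redundancy(zone_redundant_primary: bool, read_secondary: bool) -> str:
    """Map two design requirements to one of the four geo-redundant options."""
    # GZRS adds zone redundancy in the primary region; GRS does not.
    base = "GZRS" if zone_redundant_primary else "GRS"
    # The RA- prefix adds read access to the secondary region at all times.
    return f"RA-{base}" if read_secondary else base


for zone in (False, True):
    for read in (False, True):
        print(zone, read, choose_geo_redundancy(zone, read))
```

This only covers the geo-redundant options listed here; LRS and ZRS keep all copies in a single region and are outside the helper's scope.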
2. Data Encryption
Encryption protects data from unauthorized access at every stage of its lifecycle. Azure provides encryption at rest, in transit, and in use.
Encryption at Rest
Service-Managed Keys vs Customer-Managed Keys
By default, Azure encrypts all data at rest using service-managed keys (SSE with Microsoft-managed keys). For additional control, you can use customer-managed keys (CMK) stored in Azure Key Vault. CMKs allow you to rotate, revoke, and audit key usage independently.
Transparent Data Encryption (TDE)
TDE encrypts the storage of an entire Azure SQL database without requiring changes to application code. It performs real-time encryption and decryption of the database, associated backups, and transaction log files. TDE is enabled by default for all newly created Azure SQL databases.
Always Encrypted
Always Encrypted protects sensitive data inside Azure SQL Database by encrypting data on the client side before it reaches the database engine. The database engine never sees the plaintext. This is ideal for scenarios requiring separation between data owners and data administrators. Column master keys are stored in Azure Key Vault or a Windows certificate store.
Encryption in Transit
Azure data services protect data in transit with TLS, and TLS 1.2 is the expected minimum version. Services such as Azure Storage, SQL Database, and Cosmos DB all expose TLS-protected endpoints. On storage accounts you can also set a minimum TLS version so that connections using older protocols are rejected.
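A minimum TLS version can also be enforced on the client side. The sketch below uses only Python's standard `ssl` module to build a context that refuses anything older than TLS 1.2; it illustrates the idea and is not how the Azure SDKs are configured internally.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Build a client-side TLS context that rejects protocols older than TLS 1.2."""
    # create_default_context() enables certificate and host name verification.
    context = ssl.create_default_context()
    # Refuse to negotiate TLS 1.0 or TLS 1.1, mirroring a server-side minimum.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


ctx = make_tls12_context()
print(ctx.minimum_version)
```

Passing a context like this to an HTTPS client means the handshake fails if the server offers only an older protocol, which matches the behavior of a minimum TLS version setting on the service side.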
| Encryption Type | Scope | Key Management | Use Case |
|---|---|---|---|
| SSE (Service-Managed Keys) | Data at rest | Microsoft-managed | Default protection for storage |
| SSE (Customer-Managed Keys) | Data at rest | Customer via Key Vault | Regulatory compliance |
| TDE | SQL database files | Service or customer-managed | SQL database encryption |
| Always Encrypted | SQL column data | Customer (Key Vault/cert store) | Client-side encryption |
| TLS 1.2 | Data in transit | Certificate-based | Secure communication |
3. Data Scaling
Vertical Scaling (Scale Up/Down)
Vertical scaling increases the resources (CPU, memory, IOPS) of a single instance. For Azure SQL, this means moving to a higher service tier or more DTUs/vCores. Vertical scaling provides immediate performance improvement but has an upper limit defined by the largest available instance size.
Horizontal Scaling (Scale Out/In)
Horizontal scaling distributes data across multiple instances or partitions. This approach provides virtually unlimited scalability by adding more nodes.
Sharding
Sharding divides data across multiple databases based on a shard key. Azure SQL Elastic Database tools provide a shard map manager that routes queries to the correct shard. Choosing the right shard key is critical: it should evenly distribute data and minimize cross-shard queries.
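To make shard routing concrete, here is a minimal hash-based sketch. It is not the Elastic Database tools API (which manages its own shard maps); the shard names and the `shard_for` helper are hypothetical, and a stable hash is used so that the same key always routes to the same database.

```python
import hashlib

# Hypothetical shard database names, for illustration only.
SHARDS = ["shard-db-0", "shard-db-1", "shard-db-2", "shard-db-3"]


def shard_for(shard_key: str, shards=SHARDS) -> str:
    """Route a shard key to a database using a stable hash."""
    # hashlib gives the same result across processes, unlike built-in hash().
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Count how many of 1,000 made-up tenant keys land on each shard.
counts = {name: 0 for name in SHARDS}
for i in range(1000):
    counts[shard_for(f"tenant-{i}")] += 1
print(counts)
```

A key with many distinct values, such as a tenant ID, spreads rows across shards, and queries that filter on that key touch only one shard. Note that simple modulo hashing forces data movement when the shard count changes, which is one reason production systems use an explicit shard map.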
Cosmos DB Scaling
Cosmos DB uses partition keys to distribute data across logical and physical partitions. You can provision throughput in Request Units per second (RU/s) at the database or container level. Autoscale mode automatically adjusts throughput between 10% and 100% of the configured maximum based on workload demand.
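The autoscale range is simple arithmetic, sketched below with hypothetical helper names. The second function models the provisioned throughput as demand clamped into the 10%-to-100% range; it is a simplification of how the service actually reacts to load.

```python
def autoscale_range(max_ru_per_sec: int) -> tuple:
    """Return the (minimum, maximum) RU/s that autoscale moves between."""
    if max_ru_per_sec <= 0:
        raise ValueError("max_ru_per_sec must be positive")
    # The floor is 10% of the configured maximum (integer division).
    return max_ru_per_sec // 10, max_ru_per_sec


def scaled_throughput(demand_ru_per_sec: int, max_ru_per_sec: int) -> int:
    """Clamp the demanded RU/s into the autoscale range."""
    low, high = autoscale_range(max_ru_per_sec)
    return max(low, min(demand_ru_per_sec, high))


print(autoscale_range(4000))
print(scaled_throughput(2500, 4000))
```

With a configured maximum of 4,000 RU/s, throughput never drops below 400 RU/s and never exceeds 4,000 RU/s, regardless of demand.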
4. Data Security
Network Security Rules
Azure data services support virtual network service endpoints and private endpoints to restrict access at the network level. Service endpoints extend the virtual network identity to the Azure service, while private endpoints assign a private IP address from your VNet to the service.
Private Endpoints vs Service Endpoints
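Private endpoints give the service a private IP address inside your VNet, so traffic from the VNet stays within private address space. Combined with disabling public network access on the resource, this removes exposure to the public internet. Service endpoints route traffic over the Azure backbone network, but the service is still reached through its public IP address. For maximum isolation, prefer private endpoints.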
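One way to sanity-check a private endpoint from inside the VNet is to confirm that the service's host name resolves to a private address. The sketch below assumes the addresses have already been resolved (for example with `socket.getaddrinfo`) and only classifies them; the helper name and sample addresses are hypothetical.

```python
import ipaddress


def resolves_privately(resolved_ips) -> bool:
    """Return True only if every resolved address is a private IP address."""
    if not resolved_ips:
        return False
    return all(ipaddress.ip_address(ip).is_private for ip in resolved_ips)


# 10.0.1.5 stands in for a private endpoint address inside a VNet;
# 8.8.8.8 is just an arbitrary public address for contrast.
print(resolves_privately(["10.0.1.5"]))
print(resolves_privately(["8.8.8.8"]))
```

If the name still resolves to a public address from inside the VNet, the private DNS configuration for the endpoint is usually the first thing to check.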
Firewall Rules
Azure SQL Database, Cosmos DB, and Storage Accounts all support IP-based firewall rules. You can allow traffic from specific IP ranges or Azure services while blocking all other traffic. For Azure SQL, server-level and database-level firewall rules can be configured independently.
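The allow-list behavior can be modeled in a few lines. This sketch represents each rule as a start and end address, which is how Azure SQL firewall rules are expressed; the rule names and addresses are hypothetical and come from reserved documentation ranges.

```python
import ipaddress

# Hypothetical rules as (name, start_ip, end_ip).
RULES = [
    ("office", "203.0.113.0", "203.0.113.255"),
    ("build-agent", "198.51.100.10", "198.51.100.10"),
]


def is_allowed(client_ip: str, rules=RULES) -> bool:
    """Return True if the client address falls inside any allowed range."""
    address = ipaddress.ip_address(client_ip)
    return any(
        ipaddress.ip_address(start) <= address <= ipaddress.ip_address(end)
        for _name, start, end in rules
    )


print(is_allowed("203.0.113.25"))
print(is_allowed("198.51.100.11"))
```

Anything that matches no rule is denied, which is the default-deny behavior the firewall provides.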
Role-Based Access Control (RBAC)
Azure RBAC controls who can manage data resources at the management plane. For the data plane, services like Cosmos DB support both RBAC and resource tokens for fine-grained access. Azure SQL uses database-level roles and Microsoft Entra ID (formerly Azure AD) authentication for data-plane security.
5. Data Loss Prevention
DLP Policies
Data Loss Prevention (DLP) policies identify, monitor, and protect sensitive information across Azure services. Microsoft Purview provides a unified data governance platform that integrates DLP policies with data classification and sensitivity labels.
Azure Purview
Azure Purview (now Microsoft Purview) scans data sources to discover, classify, and label sensitive data. It creates a unified data map across on-premises, multi-cloud, and SaaS environments. Sensitivity labels from Microsoft Information Protection can be applied automatically based on classification rules.
Soft Delete and Backup Retention
Soft delete protects against accidental data deletion. Azure Storage supports soft delete for blobs and containers, allowing recovery within a configured retention period. Azure SQL Database provides point-in-time restore (PITR) with a configurable retention window of 1 to 35 days (7 days by default) and long-term retention (LTR) for up to 10 years.
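Whether a given restore point is still recoverable comes down to a date comparison against the retention window. The following sketch uses a hypothetical helper and fixed timestamps so the result does not depend on when it runs.

```python
from datetime import datetime, timedelta, timezone


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """Return True if the target time lies inside the retention window."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    earliest = now - timedelta(days=retention_days)
    # A restore point must be inside the window and not in the future.
    return earliest <= target <= now


# Fixed example timestamps (assumed values, for illustration only).
now = datetime(2024, 6, 30, 12, 0, tzinfo=timezone.utc)
ten_days_ago = now - timedelta(days=10)

print(can_restore_to(ten_days_ago, now, retention_days=7))
print(can_restore_to(ten_days_ago, now, retention_days=35))
```

A mistake discovered ten days after the fact is recoverable with a 35-day window but not with a 7-day one, which is why the retention setting should reflect how quickly problems are typically detected.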
Azure Backup
Azure Backup provides a centralized solution for protecting Azure VMs, SQL Server running on Azure VMs, file shares, and blob storage. Backup data is stored in a Recovery Services vault or a Backup vault, depending on the workload, with built-in encryption and geo-redundancy options.
Key Terms
| Term | Definition |
|---|---|
| Active Geo-Replication | Azure SQL feature that creates continuously synchronized readable secondary databases in other regions with an RPO of less than 5 seconds. |
| TDE (Transparent Data Encryption) | Encrypts SQL database storage at rest without application code changes; enabled by default on new databases. |
| Always Encrypted | Client-side encryption feature for Azure SQL that keeps data encrypted even inside the database engine. Only the client application can decrypt. |
| Customer-Managed Keys (CMK) | Encryption keys stored and managed by the customer in Azure Key Vault, providing full control over rotation and revocation. |
| Sharding | Horizontal partitioning strategy that distributes data across multiple databases using a shard key for scalability. |
| Private Endpoint | A network interface that uses a private IP from your VNet to connect securely to an Azure service, eliminating public internet exposure. |
| Azure Purview | Unified data governance service that discovers, classifies, and labels sensitive data across multi-cloud and on-premises environments. |
| RA-GRS | Read-Access Geo-Redundant Storage: provides geo-replication with read-only access to the secondary region at all times. |
Exam Tips
- Know the difference between active geo-replication and auto-failover groups. Auto-failover groups provide automatic failover with a single listener endpoint; active geo-replication requires manual failover.
- Always Encrypted is the correct choice when data must be protected from database administrators. TDE protects data at rest but the database engine can see plaintext during query processing.
- Private endpoints provide stronger network isolation than service endpoints because they assign a private IP and remove public internet exposure entirely.
- For Azure Storage, RA-GRS and RA-GZRS allow read access to the secondary region without failover. GRS and GZRS do not allow reads from the secondary until failover occurs.
- Azure Purview is the answer for data governance, classification, and DLP across hybrid and multi-cloud environments.
- Remember that Cosmos DB consistency levels range from Strong (highest latency, strongest guarantee) to Eventual (lowest latency, weakest guarantee). Session consistency is the default.
Practice Questions
Question 1
Your organization stores sensitive financial data in Azure SQL Database. Database administrators should NOT be able to view the plaintext values of credit card columns. Which feature should you use?
A. Transparent Data Encryption (TDE)
B. Always Encrypted
C. Dynamic Data Masking
D. Row-Level Security
Answer: B
Always Encrypted encrypts data on the client side, so the database engine and administrators never see the plaintext. TDE encrypts data at rest but the engine processes plaintext during queries. Dynamic Data Masking hides data in query results but does not encrypt it.
Question 2
You need to ensure that your Azure Storage account data is available for read access even if the primary region goes down, without performing a failover. Which redundancy option should you choose?
A. LRS (Locally Redundant Storage)
B. GRS (Geo-Redundant Storage)
C. RA-GRS (Read-Access Geo-Redundant Storage)
D. ZRS (Zone-Redundant Storage)
Answer: C
RA-GRS provides read access to the secondary region at all times without requiring a failover. GRS also replicates to a secondary region but does not allow read access until failover. ZRS replicates within a single region across availability zones.
Question 3
Your company requires that data traffic between your virtual network and Azure SQL Database never traverses the public internet and the service has a private IP address in your VNet. What should you configure?
A. Virtual Network service endpoint
B. Private endpoint
C. Azure Firewall
D. Network Security Group (NSG)
Answer: B
Private endpoints assign a private IP address from your VNet to the Azure SQL Database, ensuring that traffic remains entirely within the private network. Service endpoints route traffic over the Azure backbone but the service retains a public IP address.
Question 4
You want Azure SQL Database to fail over automatically to a secondary region with a single read-write endpoint that does not change after failover. Which feature should you use?
A. Active geo-replication
B. Auto-failover groups
C. Azure Site Recovery
D. Azure Traffic Manager
Answer: B
Auto-failover groups provide a single read-write listener endpoint that automatically redirects after failover. Active geo-replication requires manual failover and connection string changes. Azure Site Recovery is designed for VM-level replication, not database-level.
Question 5
Your organization needs to discover and classify sensitive data across Azure SQL, Azure Data Lake, and on-premises SQL Server. Which service should you use?
A. Azure Policy
B. Azure Purview (Microsoft Purview)
C. Azure Monitor
D. Azure Advisor
Answer: B
Azure Purview (Microsoft Purview) provides unified data governance with automated discovery, classification, and labeling across multi-cloud and on-premises data sources. Azure Policy enforces organizational standards but does not perform data classification. Azure Monitor handles telemetry, not data governance.