🔷

Microsoft Azure Advanced

Architect on Azure: high availability, hybrid networking, security, data services, governance and cost optimisation.

19 lessons, 57 quiz questions

📚 Lessons & quizzes

Each lesson ends with its own short quiz. Answer them as you go — score 90% across all lessons to earn your certificate.

1 The Azure Well-Architected Framework

The Azure Well-Architected Framework (WAF) is Microsoft’s set of guiding tenets for building high-quality cloud workloads. It is organised around five pillars, each capturing a distinct quality attribute that must be balanced against the others.

  • Reliability — the workload meets its availability and recovery targets and survives failures gracefully.
  • Security — confidentiality, integrity and availability of data and systems through defense in depth.
  • Cost Optimization — getting the most value per spend and eliminating waste.
  • Operational Excellence — DevOps practices, observability and automation that keep the system running.
  • Performance Efficiency — using resources efficiently and scaling to match demand.

The pillars involve trade-offs: adding redundancy improves reliability but raises cost; tighter security can reduce performance. The framework provides a structured assessment (the Azure Well-Architected Review) plus design principles, checklists and recommendations per pillar, so architects make those trade-offs deliberately rather than by accident.

2 Availability Zones and high availability design

An Azure region is a set of datacentres deployed within a latency-defined perimeter. Many regions contain Availability Zones (AZs) — physically separate locations within the region, each with independent power, cooling and networking. A region with zones has at least three of them.

Spreading instances across zones protects a workload from a datacentre-level failure. There are two ways services use zones:

  • Zonal — a resource is pinned to a specific zone (for example a VM placed in zone 2). You control placement and align dependent resources.
  • Zone-redundant — the platform automatically spreads the resource (or its replicas) across zones, such as zone-redundant storage or a zone-redundant load balancer.

To reach the highest VM SLA (99.99%) you must deploy two or more instances across two or more zones, normally behind a load balancer so that traffic reaches the surviving instances. A single VM, however reliable, gives only the single-instance SLA (99.9% at best, and only with premium or ultra disks). Zones reduce correlated failure but stay within one region, so they do not protect against a whole-region outage; that needs multi-region design.
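
The effect of redundancy on availability can be estimated with simple probability. The sketch below assumes instance failures are independent (zones reduce correlation but do not eliminate it) and uses 99.9% as an illustrative per-instance figure; the function names are hypothetical.

```python
def parallel_availability(instance_availability: float, instances: int) -> float:
    """Availability of N redundant instances when any one of them can serve traffic."""
    return 1 - (1 - instance_availability) ** instances


def serial_availability(*components: float) -> float:
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


single = 0.999                                  # illustrative figure, not a quoted SLA
two_zones = parallel_availability(single, 2)    # roughly 0.999999
minutes_per_year = 365 * 24 * 60

print(f"single VM downtime: {(1 - single) * minutes_per_year:.0f} min/year")
print(f"two VMs downtime:   {(1 - two_zones) * minutes_per_year:.1f} min/year")
```

Note the opposite effect for dependencies in series: chaining two 99.9% components that must both be up yields a lower composite figure than either alone.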

3 Multi-region and disaster recovery: RTO, RPO and paired regions

Disaster recovery (DR) planning starts with two objectives. The Recovery Time Objective (RTO) is the maximum acceptable time to restore service after a disaster. The Recovery Point Objective (RPO) is the maximum acceptable amount of data loss, measured as a time window of data that may be lost.

Azure offers region pairs: most regions are paired with another region in the same geography. Pairing brings sequenced platform updates (only one region in a pair is updated at a time), prioritised recovery during a broad outage, and is the basis for geo-redundant storage replication.

Azure Site Recovery (ASR) orchestrates replication, failover and failback of VMs and physical servers to a secondary region. It continuously replicates disks, lets you build recovery plans that sequence multi-tier failover, and supports non-disruptive test failovers so you can validate DR without affecting production.

Lower RTO/RPO generally means an active-active or warm-standby design and higher cost; relaxed targets allow cheaper cold-standby approaches.

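# Site Recovery stores its configuration in a Recovery Services vault,
# which is created with the "az backup vault" command group.
az backup vault create \
  --resource-group rg-dr \
  --name asr-vault \
  --location westeurope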
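
To make the RTO/RPO trade-off concrete, the following sketch checks candidate DR designs against a pair of targets. All names and numbers are hypothetical illustrations, not Azure measurements.

```python
from dataclasses import dataclass


@dataclass
class DrDesign:
    name: str
    replication_lag_minutes: float   # worst-case data not yet copied (drives RPO)
    failover_minutes: float          # time to detect, fail over and validate (drives RTO)


def meets_objectives(design: DrDesign, rto_minutes: float, rpo_minutes: float) -> bool:
    """A design qualifies only if it satisfies both objectives."""
    return (design.failover_minutes <= rto_minutes
            and design.replication_lag_minutes <= rpo_minutes)


# Hypothetical figures chosen only to show the pattern.
designs = [
    DrDesign("cold standby (restore from backup)", 24 * 60, 8 * 60),
    DrDesign("warm standby (continuous replication)", 5, 60),
    DrDesign("active-active", 1, 5),
]

for design in designs:
    print(design.name, "->", meets_objectives(design, rto_minutes=120, rpo_minutes=15))
```

With an RTO of two hours and an RPO of fifteen minutes, the cold-standby design fails both checks, which is why tighter targets push towards the costlier designs.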

4 Defense in depth and security best practices

Defense in depth layers controls so that if one fails, others still protect the workload. Azure frames these layers from the outside in:

  • Physical — datacentre security, handled by Microsoft.
  • Identity & access — Microsoft Entra ID, MFA, least privilege.
  • Perimeter — DDoS Protection and edge filtering.
  • Network — segmentation with virtual networks, NSGs and private endpoints.
  • Compute — patching, endpoint protection, just-in-time VM access.
  • Application — secure coding, secrets management, WAF.
  • Data — encryption at rest and in transit, classification.

Two principles run through every layer: least privilege (grant only the access needed, for as long as needed) and assume breach (design as if an attacker is already inside, so blast radius is contained). Microsoft Defender for Cloud continuously assesses posture and produces a Secure Score with prioritised remediation, while Microsoft Sentinel adds SIEM/SOAR for detection and response.

5 Encryption, Key Vault and managed HSM

Azure encrypts data at rest and in transit by default. At rest, services use Storage Service Encryption and similar mechanisms, by default with Microsoft-managed keys. For control you can supply customer-managed keys (CMK) stored in Azure Key Vault, enabling your own rotation and revocation.

Azure Key Vault centralises three object types: keys (cryptographic keys for encryption/signing), secrets (passwords, connection strings, API tokens) and certificates. Access is governed by Azure RBAC or vault access policies, and every operation is logged.

The Key Vault Standard tier protects keys in software, while the Premium tier can also protect them in shared (multi-tenant) HSMs validated to FIPS 140-2 Level 2 or Level 3. When you need a single-tenant, dedicated hardware boundary with full key sovereignty, use Azure Key Vault Managed HSM, which provides FIPS 140-2 Level 3 validated hardware security modules under your sole control.

A common pattern is envelope encryption: a data encryption key (DEK) encrypts the data, and a key encryption key (KEK) in Key Vault or Managed HSM wraps the DEK — so rotating the KEK does not require re-encrypting all the data.

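# Create an HSM-protected key for use as a customer-managed key.
# HSM protection requires a Premium-tier vault.
az keyvault key create \
  --vault-name kv-prod \
  --name cmk-storage \
  --protection hsm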
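
The envelope encryption pattern can be sketched in a few lines. This example deliberately uses a toy XOR keystream cipher from the standard library so that it runs anywhere; it is not secure and only stands in for a real algorithm such as AES-GCM. The key names are hypothetical, and in Azure the KEK would never leave Key Vault or Managed HSM.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Illustration only, NOT secure. Applying it twice with the same key
    returns the original bytes, so it serves as both encrypt and decrypt.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


kek_v1 = secrets.token_bytes(32)   # key encryption key (hypothetical, local)
dek = secrets.token_bytes(32)      # data encryption key

ciphertext = toy_cipher(dek, b"customer record")   # the DEK encrypts the data
wrapped_dek = toy_cipher(kek_v1, dek)              # the KEK wraps the DEK

# Rotate the KEK: unwrap with the old key, re-wrap with the new one.
kek_v2 = secrets.token_bytes(32)
wrapped_dek = toy_cipher(kek_v2, toy_cipher(kek_v1, wrapped_dek))

# The data ciphertext was never touched, yet it still decrypts.
recovered = toy_cipher(toy_cipher(kek_v2, wrapped_dek), ciphertext)
print(recovered)
```

Only the small wrapped DEK is rewritten during rotation, which is what makes rotating the KEK cheap even for very large data sets.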

6 Network security: Azure Firewall, WAF, DDoS and Private Link

Azure layers several network defences:

  • Azure Firewall — a managed, stateful, cloud-native network firewall with built-in high availability and scaling. It filters by network rules (IP/port/protocol), application rules (FQDN), and offers threat intelligence-based filtering. It is typically deployed in a hub virtual network.
  • Web Application Firewall (WAF) — protects HTTP/S apps against OWASP Top 10 threats such as SQL injection and cross-site scripting (XSS). It runs on Application Gateway or Azure Front Door, not on Azure Firewall.
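  • Azure DDoS Protection — the paid Network Protection and IP Protection tiers add adaptive tuning and attack telemetry on top of the always-on infrastructure-level defence. Cost protection and DDoS Rapid Response support are included only with Network Protection.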
  • Azure Private Link — exposes a PaaS service (such as Storage or SQL) through a private endpoint, a private IP inside your VNet, so traffic never traverses the public internet.

A frequent confusion: a service endpoint extends your VNet identity to a service over the Azure backbone but the service keeps a public IP; a private endpoint (Private Link) gives the service a private IP in your subnet. Private Link is generally preferred for the strongest isolation.

7 Hybrid connectivity: VPN Gateway and ExpressRoute

Connecting on-premises networks to Azure has two main options:

  • VPN Gateway — establishes an encrypted IPsec/IKE tunnel over the public internet (site-to-site), or a point-to-site VPN for individual clients. It is quick to set up and relatively low cost, but bandwidth and latency depend on the internet path.
  • ExpressRoute — a private, dedicated connection through a connectivity provider that does not traverse the public internet. It offers higher bandwidth (circuits from 50 Mbps to 10 Gbps, or up to 100 Gbps with ExpressRoute Direct), more consistent latency, and an availability SLA. ExpressRoute traffic is private but not encrypted by default; you can layer encryption (MACsec on ExpressRoute Direct, or an IPsec tunnel over the circuit) if required.

For resilience, organisations often run ExpressRoute as the primary path with a VPN Gateway as failover. ExpressRoute peerings include private peering (to VNets) and Microsoft peering (to Microsoft 365 and Azure public services). To connect many VNets and branches at scale, Azure Virtual WAN provides a managed hub-based backbone integrating both VPN and ExpressRoute.

8 Azure Synapse Analytics: the cloud data warehouse

Azure Synapse Analytics is an integrated analytics platform that unifies enterprise data warehousing and big-data processing. It brings several engines together under one workspace:

  • Dedicated SQL pool — a massively parallel processing (MPP) data warehouse where compute is provisioned in Data Warehouse Units (DWUs). Data is distributed across 60 underlying storage distributions, and you choose hash, round-robin or replicated distribution per table.
  • Serverless SQL pool — query data directly in the data lake (Parquet/CSV/JSON) with pay-per-query pricing, no infrastructure to provision.
  • Apache Spark pools — for data engineering, machine learning and large-scale transformation.
  • Pipelines — Data Factory-based orchestration for ingest and ETL/ELT.

Choosing the right distribution matters for performance: hash distribution on a high-cardinality join key minimises data movement for large fact tables; round-robin spreads rows evenly for staging; replicated copies a small dimension table to every node to avoid shuffles. Synapse separates storage from compute, so you can pause the dedicated pool to save cost when idle.
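
The difference between hash and round-robin distribution can be simulated locally. In this sketch SHA-256 is only a stand-in for Synapse's internal hash function, and the customer keys are made up.

```python
import hashlib
from collections import Counter

DISTRIBUTIONS = 60  # a dedicated SQL pool spreads data over 60 distributions


def hash_distribution(key: str) -> int:
    """Rows with the same key always land in the same distribution."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS


def round_robin_distribution(row_number: int) -> int:
    """Rows are dealt out evenly, regardless of their content."""
    return row_number % DISTRIBUTIONS


# 6000 hypothetical order rows for 1000 customers.
orders = [f"customer-{i % 1000}" for i in range(6000)]

hashed = Counter(hash_distribution(key) for key in orders)
spread = Counter(round_robin_distribution(n) for n in range(len(orders)))

print("hash: rows per distribution range from",
      min(hashed.values()), "to", max(hashed.values()))
print("round-robin: rows per distribution =", set(spread.values()))
```

Hash distribution co-locates all rows for a customer, so a join on that key needs no data movement, but a low-cardinality or skewed key would leave some distributions overloaded. Round-robin is perfectly even but gives no co-location.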

9 Event-driven architecture: Event Grid and Event Hubs

Azure offers distinct messaging services for distinct shapes of communication. Two are central to event-driven design:

  • Azure Event Grid — a reactive, push-based event router for discrete events (for example, a blob was created). Publishers send events; Event Grid filters and routes them to subscribers (Functions, Logic Apps, webhooks) with at-least-once delivery and dead-lettering. It is ideal for serverless reactions to state changes.
  • Azure Event Hubs — a high-throughput event streaming platform for continuous telemetry and big data (millions of events per second). Data is partitioned and retained for a window, and consumers read at their own pace using offsets and checkpoints. It is Kafka-protocol compatible.

The distinction is event vs. stream: Event Grid notifies you that something happened and routes that single notification; Event Hubs ingests a firehose of events for stream processing and analytics. Azure Service Bus is the third sibling — a broker for reliable commands/messages with queues, topics, ordering and transactions. Choose Service Bus for enterprise messaging, Event Grid for reactive routing, Event Hubs for streaming.
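
The streaming model behind Event Hubs (a retained log read by offset, with each consumer tracking its own checkpoint) can be illustrated with an in-memory stand-in. The class names are hypothetical and no Azure SDK is involved.

```python
from typing import List


class PartitionLog:
    """Append-only list standing in for a single partition."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def append(self, event: str) -> int:
        self.events.append(event)
        return len(self.events) - 1   # offset of the new event


class Consumer:
    """Reads at its own pace and remembers how far it got."""

    def __init__(self, log: PartitionLog) -> None:
        self.log = log
        self.checkpoint = 0           # next offset to read

    def poll(self, max_events: int) -> List[str]:
        batch = self.log.events[self.checkpoint:self.checkpoint + max_events]
        self.checkpoint += len(batch)
        return batch


log = PartitionLog()
for reading in ["t=20.1", "t=20.4", "t=20.9", "t=21.3"]:
    log.append(reading)

analytics = Consumer(log)
archiver = Consumer(log)

fast = analytics.poll(max_events=4)   # reads everything at once
slow = archiver.poll(max_events=1)    # reads one event and stops
print(fast, slow, archiver.checkpoint)
```

Because events stay in the log, the two consumers do not compete: each sees the full stream and resumes from its own checkpoint. A push-based router such as Event Grid instead delivers each notification to its subscribers and moves on.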

10 Azure Cache for Redis

Azure Cache for Redis is a fully managed, in-memory data store based on the open-source Redis engine. By keeping hot data in memory it delivers sub-millisecond latency and offloads pressure from backing databases. Common patterns include:

  • Cache-aside (lazy loading) — the app checks the cache first; on a miss it reads the database and populates the cache.
  • Session store — externalise web session state so any stateless app instance can serve any user.
  • Distributed lock and pub/sub — coordinate across instances and broadcast messages.
  • Rate limiting and leaderboards — using atomic counters and sorted sets.

Tiers range from Basic/Standard/Premium (classic Redis) to Enterprise and Enterprise Flash (Redis Enterprise with modules and active geo-replication). Premium and Enterprise add clustering for horizontal scale, persistence, zone redundancy and VNet/Private Link integration. A key design rule: caches are for transient, regenerable data — always have a strategy for cache misses and set sensible TTLs to avoid serving stale values.
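
A minimal cache-aside sketch with TTLs is shown below. A dictionary stands in for Redis and a lambda for the database, so every name here is a hypothetical illustration; a real application would use a Redis client instead.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}   # key -> (value, expiry)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Miss or expired entry: read the source of truth, then repopulate.
        self.misses += 1
        value = self._load(key)
        self._store[key] = (value, now + self._ttl)
        return value


fake_now = [0.0]   # controllable clock so the example is deterministic
cache = CacheAside(lambda key: f"row-for-{key}", ttl_seconds=30,
                   clock=lambda: fake_now[0])

cache.get("user:1")    # miss, loads from the "database"
cache.get("user:1")    # hit
fake_now[0] = 31.0
cache.get("user:1")    # TTL expired, so a miss again
print(cache.hits, cache.misses)
```

The TTL bounds how stale a cached value can be, and the miss path keeps the application correct even if the cache is flushed.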

11 Azure API Management

Azure API Management (APIM) is a façade in front of your backend APIs that decouples consumers from implementations. It is composed of three components:

  • Gateway — accepts calls, applies policies, and forwards to backends.
  • Management plane — for configuring APIs, products and policies.
  • Developer portal — auto-generated documentation and self-service onboarding.

The power of APIM is its policy engine: declarative XML statements applied in inbound, backend, outbound and on-error pipelines. Policies implement rate limiting and quotas, JWT validation and OAuth, IP filtering, caching, request/response transformation, and mocking. APIs are grouped into products that gate access via subscription keys.

Tiers include Consumption (serverless, pay-per-call), Developer (non-production, no SLA), and Basic, Standard and Premium; Premium adds multi-region deployment, VNet integration and zone redundancy. APIM is a cornerstone for microservices, providing a single, secured, observable entry point.

az apim create \
  --name apim-prod \
  --resource-group rg-api \
  --publisher-email ops@contoso.com \
  --publisher-name Contoso \
  --sku-name Premium

12 Microservices on Azure Kubernetes Service (AKS)

Azure Kubernetes Service (AKS) is a managed Kubernetes offering: Microsoft runs and patches the control plane, at no charge on the Free tier (the Standard tier adds a financially backed uptime SLA for a fee), and you pay for the worker node VMs that run your pods. Nodes are organised into node pools (for example a system pool plus user pools, optionally spot pools for cheap interruptible work).

Key architectural building blocks:

  • Ingress — an ingress controller or the managed application routing add-on exposes services; pair it with WAF on Front Door or Application Gateway (AGIC) for HTTP protection.
  • Scaling — the Horizontal Pod Autoscaler scales pods, the Cluster Autoscaler adds nodes, and KEDA scales on event sources such as queue length.
  • Identity — Workload Identity federates Kubernetes service accounts with Microsoft Entra ID so pods access Azure resources without secrets.
  • Resilience — spread node pools across availability zones and use pod disruption budgets.
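
For the scaling building block, the core calculation of the Horizontal Pod Autoscaler is small enough to sketch. This follows the formula in the Kubernetes documentation (desired = ceil(current × currentMetric / targetMetric), with a default 10% tolerance); the function name and the replica bounds are hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Return the replica count the autoscaler would aim for."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas   # close enough to target: do nothing
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, current_metric=90, target_metric=60))  # CPU above target
print(desired_replicas(4, current_metric=62, target_metric=60))  # within tolerance
print(desired_replicas(4, current_metric=20, target_metric=60))  # CPU below target
```

The Cluster Autoscaler then reacts when the resulting pods cannot be scheduled on existing nodes, so the two operate at different layers.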

A service mesh (such as the Istio-based add-on) can add mTLS, traffic shifting and observability between microservices without changing application code.

13 Observability at scale: Monitor, Application Insights and workbooks

Azure Monitor is the unified observability platform. All telemetry lands in two store types: metrics (lightweight numeric time series, near real-time) and logs (rich, queryable events stored in a Log Analytics workspace). Logs are queried with Kusto Query Language (KQL).

  • Application Insights — the application performance management (APM) layer. It auto-collects requests, dependencies, exceptions and traces, and supports distributed tracing across microservices with correlated operation IDs.
  • Workbooks — interactive, parameterised reports combining metrics, logs and visualisations into shareable dashboards.
  • Alerts — metric, log and activity-log alerts fire action groups (email, SMS, webhook, Logic App, ITSM) when conditions are met.

At scale, consolidate workspaces thoughtfully, set per-table retention and archive tiers to manage cost, and use the Application Map in Application Insights to visualise the topology and latency of dependencies. The three pillars of observability — metrics, logs and traces — are all represented, letting you move from a symptom to a root cause quickly.

AppRequests
| where TimeGenerated > ago(1h)
| summarize p95 = percentile(DurationMs, 95) by Name
| order by p95 desc

14 Governance with Azure Policy and Blueprints

Azure Policy enforces organisational standards at scale. A policy definition describes a condition and an effect; assigning it to a scope (management group, subscription or resource group) makes it active. Common effects include:

  • Deny — block non-compliant resource creation (for example, disallow public IPs).
  • Audit — flag non-compliance without blocking.
  • Append / Modify — add or change properties (such as required tags).
  • DeployIfNotExists — remediate by deploying a missing configuration (for example, enabling diagnostics).

Policies are grouped into initiatives (policy sets) to manage many rules as one unit, often aligned to compliance frameworks. Existing resources are evaluated for compliance and can be fixed with remediation tasks.
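
How a definition's condition and effect combine can be sketched with plain dictionaries. The rule below mirrors the "if"/"then" shape of a policy rule, but the evaluator is a heavy simplification (a single equality test) and the resource names are hypothetical.

```python
from typing import Dict, List, Optional

# Rule shaped like the policyRule section of a definition.
deny_public_ips = {
    "if": {"field": "type", "equals": "Microsoft.Network/publicIPAddresses"},
    "then": {"effect": "deny"},
}


def evaluate(policy_rule: dict, resource: dict) -> Optional[str]:
    """Return the effect if the resource matches the condition, else None."""
    condition = policy_rule["if"]
    if resource.get(condition["field"]) == condition["equals"]:
        return policy_rule["then"]["effect"]
    return None


def compliance_report(policy_rule: dict, resources: List[dict]) -> Dict[str, str]:
    """Map each resource name to the triggered effect or 'compliant'."""
    return {r["name"]: evaluate(policy_rule, r) or "compliant" for r in resources}


resources = [
    {"name": "pip-web", "type": "Microsoft.Network/publicIPAddresses"},
    {"name": "vnet-hub", "type": "Microsoft.Network/virtualNetworks"},
]
report = compliance_report(deny_public_ips, resources)
print(report)
```

Swapping the effect from deny to audit would leave the matching logic unchanged and only alter what happens to a non-compliant resource, which is why teams often start a new rule in audit mode.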

Azure Blueprints package policies, RBAC role assignments, ARM templates and resource groups into a repeatable, versioned environment definition — useful for stamping out compliant landing zones. (Microsoft now points new work toward Template Specs and the Azure landing zone Bicep accelerators, but the Blueprint concept of bundling governance artefacts remains the exam-relevant model.)

15 Cost optimisation: Reservations, Savings Plans and Advisor

Azure gives several levers to cut spend below pay-as-you-go rates:

  • Reservations — commit to a specific resource type (for example a VM series in a region) for 1 or 3 years for steep discounts. Best when usage is stable and predictable; less flexible if your needs change.
  • Savings Plans for compute — commit to a fixed hourly spend (not a specific SKU) for 1 or 3 years. More flexible than reservations across instance families and regions, though typically a slightly smaller discount.
  • Azure Hybrid Benefit — reuse on-premises Windows Server and SQL Server licences with Software Assurance to avoid paying for them again in the cloud.
  • Spot VMs — deeply discounted surplus capacity that can be evicted; ideal for fault-tolerant, interruptible batch work.

Azure Advisor analyses usage and recommends actions across cost, reliability, security, operational excellence and performance — for example, right-sizing or shutting down idle VMs. Combine Advisor with Cost Management budgets, tagging and analysis to drive a continuous FinOps practice.
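
Whether a commitment pays off depends on utilisation, because a reservation is paid for every hour of the term whether or not the resource runs. The sketch below works through that arithmetic with a hypothetical 40% discount and a hypothetical hourly rate; real discounts vary by SKU, region and term.

```python
def breakeven_utilisation(discount: float) -> float:
    """Fraction of the term a resource must run for the reservation to win."""
    return 1 - discount


def cheaper_option(hours_used: float, hours_in_term: float,
                   payg_rate: float, discount: float) -> str:
    payg_cost = hours_used * payg_rate
    reserved_cost = hours_in_term * payg_rate * (1 - discount)   # paid regardless of use
    return "reservation" if reserved_cost < payg_cost else "pay-as-you-go"


HOURS_PER_MONTH = 730
RATE = 0.20        # hypothetical pay-as-you-go price per hour
DISCOUNT = 0.40    # hypothetical reservation discount

print(f"break-even utilisation: {breakeven_utilisation(DISCOUNT):.0%}")
print(cheaper_option(300, HOURS_PER_MONTH, RATE, DISCOUNT))   # runs about 41% of the time
print(cheaper_option(600, HOURS_PER_MONTH, RATE, DISCOUNT))   # runs about 82% of the time
```

A workload that runs only during business hours may therefore be cheaper on pay-as-you-go (or with scheduled shutdown) than under a reservation.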

16 Identity federation, Conditional Access and PIM

Modern Azure identity is built on Microsoft Entra ID (formerly Azure Active Directory). Several capabilities work together to secure access:

  • Federation — trust between Entra ID and an external identity provider so users authenticate once and access many systems. Standards include SAML 2.0, OpenID Connect and OAuth 2.0. With federation, the home directory authenticates the user and issues tokens trusted by Azure.
  • Conditional Access — an if-then policy engine: based on signals (user, group, device state, location, risk level) it makes decisions (grant, block, or require MFA / compliant device). It acts as the central Zero Trust policy enforcement point.
  • Privileged Identity Management (PIM) — provides just-in-time, time-bound, approval-gated eligible role activation instead of standing administrative access. Activations require justification and can demand MFA, with full audit trails.

Together these implement Zero Trust: verify explicitly, use least-privilege just-in-time access, and assume breach. PIM specifically shrinks the window in which powerful roles are active, dramatically reducing the attack surface of standing privilege.

17 Autoscaling architectures

Autoscaling adapts capacity to load, improving both cost efficiency and performance. Azure distinguishes two directions:

  • Scale out / in (horizontal) — add or remove instances. This is the cloud-preferred approach because it has no hard ceiling and improves resilience.
  • Scale up / down (vertical) — move to a larger or smaller SKU. Simple but bounded and often requires a restart.

Compute platforms expose autoscale differently: Virtual Machine Scale Sets (VMSS) and App Service use Azure Monitor autoscale rules driven by metrics (CPU, queue length) or schedules; AKS uses HPA/Cluster Autoscaler/KEDA; serverless Functions scale automatically per event.

Good autoscale design follows patterns: scale out aggressively but in conservatively (to avoid flapping), use cooldown periods, scale on a leading signal such as queue depth rather than only CPU, and combine reactive metric rules with predictive or scheduled scaling for known peaks. The queue-based load levelling pattern places a queue between producers and consumers so spikes are smoothed and consumers scale on backlog.
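
These rules can be combined in a small simulation: scale out straight to what the backlog requires, scale in one instance at a time, and ignore new signals during a cooldown. The class and its thresholds are hypothetical and do not represent the Azure Monitor autoscale API.

```python
import math


class QueueAutoscaler:
    def __init__(self, min_instances: int, max_instances: int,
                 messages_per_instance: int, cooldown_seconds: float) -> None:
        self.min = min_instances
        self.max = max_instances
        self.per_instance = messages_per_instance
        self.cooldown = cooldown_seconds
        self.instances = min_instances
        self._last_change = float("-inf")

    def evaluate(self, queue_depth: int, now: float) -> int:
        """Return the instance count after considering the current backlog."""
        if now - self._last_change < self.cooldown:
            return self.instances                  # still cooling down
        needed = math.ceil(queue_depth / self.per_instance)
        needed = max(self.min, min(self.max, needed))
        if needed > self.instances:
            self.instances = needed                # scale out aggressively
        elif needed < self.instances:
            self.instances -= 1                    # scale in conservatively
        else:
            return self.instances
        self._last_change = now
        return self.instances


scaler = QueueAutoscaler(min_instances=2, max_instances=10,
                         messages_per_instance=100, cooldown_seconds=300)
print(scaler.evaluate(queue_depth=850, now=0))     # spike: jump to 9
print(scaler.evaluate(queue_depth=100, now=60))    # cooldown: stay at 9
print(scaler.evaluate(queue_depth=100, now=400))   # step down to 8
print(scaler.evaluate(queue_depth=100, now=800))   # step down to 7
```

The asymmetric steps and the cooldown together prevent flapping when the backlog oscillates around a threshold.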

18 Serverless event patterns

Serverless lets you build event-driven systems without managing servers, paying only for execution. The core compute is Azure Functions, where a trigger starts a function and bindings declaratively connect inputs and outputs (queues, blobs, Cosmos DB, Event Grid) without boilerplate SDK code.

Useful serverless patterns:

  • Fan-out / fan-in — split work into parallel units then aggregate results. Durable Functions orchestrations express this in code with the orchestrator/activity model.
  • Function chaining — run a sequence of functions, passing output to the next, via a durable orchestrator.
  • Async HTTP APIs — return 202 with a status endpoint for long-running work.
  • Event routing — Event Grid invokes a function on each discrete event for near-real-time reactions.

Hosting plans matter: the Consumption plan scales to zero and bills per execution but can incur cold starts; the Premium plan keeps pre-warmed instances and adds VNet integration; the Flex Consumption plan blends elastic scale with always-ready instances. Choose based on latency sensitivity and traffic profile. Logic Apps complement Functions for low-code orchestration and connectors.
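
The shape of the fan-out / fan-in pattern can be shown locally with asyncio. This is only an analogue: it does not use the Durable Functions API, and the function names are hypothetical stand-ins for an orchestrator and its activities.

```python
import asyncio
from typing import List


async def activity(item: int) -> int:
    """One unit of work; the sleep stands in for I/O such as a blob read."""
    await asyncio.sleep(0.01)
    return item * item


async def orchestrator(items: List[int]) -> int:
    # Fan out: start every activity concurrently.
    results = await asyncio.gather(*(activity(item) for item in items))
    # Fan in: aggregate once all of them have completed.
    return sum(results)


total = asyncio.run(orchestrator([1, 2, 3, 4]))
print(total)
```

A durable orchestrator adds what this sketch lacks: checkpointed state, so the aggregation survives restarts and can wait for long-running activities without holding a process open.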

19 Traffic Manager and global routing

Distributing users across regions needs global load balancing. Azure offers complementary services that operate at different layers:

  • Azure Traffic Manager — DNS-based global routing. It returns the best endpoint by a routing method, but because it works at DNS it does not see the actual traffic and relies on client DNS caching/TTL.
  • Azure Front Door — a global Layer-7 reverse proxy and CDN with TLS offload, WAF, caching and instant failover at the connection level (anycast). Preferred for HTTP/S apps needing edge acceleration.
  • Azure Load Balancer (Layer 4) and Application Gateway (Layer 7) handle regional distribution within a region.

Traffic Manager routing methods include Priority (active-passive failover), Weighted (split by proportion, useful for canary), Performance (lowest-latency region for the user), Geographic (route by user location for compliance/data residency), MultiValue and Subnet. A common global design layers Front Door for HTTP edge in front of regional Application Gateways and zone-redundant backends, with Traffic Manager used where DNS-level or non-HTTP routing is required.
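
The Priority and Weighted methods reduce to simple selection rules over the healthy endpoints, sketched below. The endpoint names, weights and health flags are hypothetical, and real Traffic Manager answers DNS queries rather than calling functions.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    priority: int        # lower number = preferred
    weight: int
    healthy: bool = True


def route_priority(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Active-passive: always the healthy endpoint with the lowest priority value."""
    healthy = [e for e in endpoints if e.healthy]
    return min(healthy, key=lambda e: e.priority) if healthy else None


def route_weighted(endpoints: List[Endpoint],
                   rng: Optional[random.Random] = None) -> Optional[Endpoint]:
    """Split traffic in proportion to weight across healthy endpoints."""
    rng = rng or random.Random()
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    return rng.choices(healthy, weights=[e.weight for e in healthy], k=1)[0]


endpoints = [
    Endpoint("westeurope", priority=1, weight=90),
    Endpoint("northeurope", priority=2, weight=10),   # canary share of 10%
]

print(route_priority(endpoints).name)    # primary while it is healthy
endpoints[0].healthy = False
print(route_priority(endpoints).name)    # fails over to the secondary
print(route_weighted(endpoints).name)    # only healthy endpoints are candidates
```

Because the answer is delivered through DNS, clients keep using a cached endpoint until the record's TTL expires, so failover is not instantaneous.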
