🔷

Microsoft Azure Professional

Design Azure at enterprise scale: landing zones, governance, security & compliance, DR, FinOps and migration (AZ-305 scope).

21 lessons · 63 quiz questions

📚 Lessons & quizzes

Each lesson ends with its own short quiz. Answer them as you go — score 90% across all lessons to earn your certificate.

1 The Cloud Adoption Framework

The Cloud Adoption Framework (CAF) is Microsoft’s end-to-end guidance for moving an organisation to Azure. It is organised into sequential and ongoing methodologies: Strategy (motivations and business outcomes), Plan (digital estate and skilling), Ready (the landing zone), Adopt (Migrate and Innovate), and the continuous disciplines of Govern and Manage.

CAF is deliberately prescriptive: it tells you what to do and in roughly what order. The Well-Architected Framework, by contrast, helps you assess the quality of an individual workload. An architect uses CAF to shape the platform and WAF to refine the workloads that run on it.

2 Enterprise-scale landing zones

A landing zone is a pre-provisioned, governed environment that workloads are deployed into. The enterprise-scale landing zone (Azure landing zone) architecture provides an opinionated, conceptual reference: a set of platform landing zones (identity, management, connectivity) and application landing zones, all bound together by a management group hierarchy with policy applied at the top.

The design principles include subscription democratisation (subscriptions as a unit of management and scale), policy-driven governance (guardrails enforced by Azure Policy rather than gates), and a clear separation between the platform team that owns the foundation and the application teams that own their workloads.

3 Management group and subscription design

Management groups form a hierarchy above subscriptions, letting you apply Azure Policy and RBAC assignments that inherit downward. A tenant has a single non-removable root management group; the enterprise-scale reference places intermediate groups such as Platform, Landing zones (with Corp and Online children), Sandbox and Decommissioned beneath it.

Subscriptions are the primary boundary for scale, billing and policy. Design them around lifecycle, ownership and quota limits rather than cramming everything into one. Avoid mirroring your org chart too literally; design for management and blast-radius isolation. A management group tree can be up to six levels deep, excluding the root and subscription levels.
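
Downward inheritance and the six-level depth limit can be illustrated with a small model. This is a minimal sketch rather than an Azure API: the `ManagementGroup` class, the `add_child` helper and the policy names are all hypothetical, and it only assumes that an assignment made on a group applies to everything beneath it.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_DEPTH = 6  # levels allowed below the root, per the limit described above


@dataclass
class ManagementGroup:
    name: str
    parent: Optional["ManagementGroup"] = None
    policies: List[str] = field(default_factory=list)  # hypothetical policy names

    def depth(self) -> int:
        """Number of levels below the root (the root itself is depth 0)."""
        return 0 if self.parent is None else self.parent.depth() + 1

    def effective_policies(self) -> List[str]:
        """Policies assigned here plus everything inherited from ancestors."""
        inherited = self.parent.effective_policies() if self.parent else []
        return inherited + self.policies


def add_child(parent: ManagementGroup, name: str, policies=None) -> ManagementGroup:
    """Create a child group, refusing to exceed the depth limit."""
    child = ManagementGroup(name, parent, list(policies or []))
    if child.depth() > MAX_DEPTH:
        raise ValueError(f"{name} would exceed {MAX_DEPTH} levels below the root")
    return child


root = ManagementGroup("Tenant Root Group", policies=["allowed-locations"])
landing_zones = add_child(root, "Landing zones", ["require-tags"])
corp = add_child(landing_zones, "Corp", ["deny-public-ip"])

print(corp.effective_policies())
# ['allowed-locations', 'require-tags', 'deny-public-ip']
```

A subscription placed under Corp would receive all three assignments, which is why broad guardrails belong near the top and workload-specific rules lower down.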

4 Centralised identity with Entra ID

Microsoft Entra ID (formerly Azure AD) is the cloud identity and access plane. Each Azure tenant maps to one Entra directory that authenticates users and service principals and issues tokens. At enterprise scale you typically run a single production tenant as the identity authority, optionally with separate tenants for true isolation (e.g. a dev/test tenant), accepting that cross-tenant management adds complexity.

Hybrid identity is achieved with Microsoft Entra Connect (or cloud sync) to synchronise on-premises Active Directory. Authentication options include password hash synchronisation, pass-through authentication, and federation (e.g. AD FS) for SSO. Managed identities and service principals give workloads their own identities, removing the need for stored credentials.

5 B2B, B2C and federation

Entra External ID covers identity for people outside your organisation. B2B collaboration invites external guest users into your tenant so partners can access shared apps and resources while keeping their home identity; access is governed with Conditional Access, entitlement management and access reviews.

Azure AD B2C (and its successor, External ID for customers/CIAM) is a separate customer-facing identity solution: it provides sign-up/sign-in for consumer applications with custom user journeys and supports social and federated identity providers (Google, Facebook, SAML/OIDC) so customers bring their existing accounts. The key distinction is workforce/partner (B2B) versus customer (B2C/CIAM).

6 Governance and policy-as-code

Azure Policy enforces organisational rules over resources. Policies use effects such as Deny (block non-compliant creation), Audit (flag without blocking), DeployIfNotExists and Modify (remediate automatically), and Append. Grouping many policies into initiatives (policy sets) lets you assign a coherent guardrail set at a management group and report compliance centrally.
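
The difference between the effects is easiest to see on a single rule. The sketch below is a deliberately simplified model, not the Azure Policy engine: `TagPolicy` and `evaluate` are hypothetical names, only Deny, Audit and Append are modelled, and the rule is reduced to "the resource must carry a given tag".

```python
from dataclasses import dataclass


@dataclass
class TagPolicy:
    required_tag: str
    effect: str  # "Deny", "Audit" or "Append" in this sketch
    default_value: str = "unknown"  # value Append adds when the tag is missing


def evaluate(resource_tags: dict, policy: TagPolicy):
    """Return (allowed, tags_after, compliance_note) for one create request."""
    if policy.required_tag in resource_tags:
        return True, dict(resource_tags), "compliant"
    if policy.effect == "Deny":
        # The request is rejected, so the resource never exists.
        return False, dict(resource_tags), "blocked: missing tag"
    if policy.effect == "Audit":
        # The request succeeds but is reported as non-compliant.
        return True, dict(resource_tags), "non-compliant: flagged only"
    if policy.effect == "Append":
        # The missing field is added to the request before it is applied.
        patched = {**resource_tags, policy.required_tag: policy.default_value}
        return True, patched, "compliant after append"
    raise ValueError(f"effect not modelled in this sketch: {policy.effect}")


for effect in ("Deny", "Audit", "Append"):
    print(effect, evaluate({"owner": "team-a"}, TagPolicy("costCentre", effect)))
```

A common rollout is to start a new rule as Audit, review the compliance report, and only then switch it to Deny.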

Treating governance as code means defining policies and assignments in Bicep/Terraform and deploying them through pipelines so they are versioned and reviewable. Azure Blueprints (now being deprecated in favour of Template Specs and deployment stacks) historically packaged role assignments, policies and ARM templates as a repeatable artefact for landing zones.

# Assign a built-in policy that audits VMs without managed disks
az policy assignment create \
  --name 'audit-managed-disks' \
  --scope "/providers/Microsoft.Management/managementGroups/contoso-platform" \
  --policy '06a78e20-9358-41c9-923c-fb736d382a4d'

7 Security with Microsoft Defender for Cloud

Microsoft Defender for Cloud is the cloud-native application protection platform (CNAPP) for Azure and multicloud. Its free Cloud Security Posture Management (CSPM) layer continuously assesses resources against the Microsoft Cloud Security Benchmark and produces a secure score with prioritised recommendations.

Its paid Defender plans add Cloud Workload Protection (CWPP) for servers, containers, databases, storage, Key Vault and App Service, generating security alerts on threats. Architects enable Defender at the subscription or management group level so coverage follows new workloads automatically, and route alerts into a SIEM for response.

8 Microsoft Sentinel and Zero Trust

Microsoft Sentinel is Azure’s cloud-native SIEM and SOAR. Built on a Log Analytics workspace, it ingests signals via data connectors, detects threats with analytics rules and machine learning, and automates response with playbooks (Logic Apps). Defender for Cloud and Defender XDR feed Sentinel for unified investigation.

Zero Trust is the guiding security model: verify explicitly, use least-privilege access, and assume breach. In Azure this manifests as Conditional Access policies that evaluate every sign-in and require MFA, Privileged Identity Management (PIM) for just-in-time elevation, micro-segmented networks, private endpoints, and pervasive logging so that no implicit trust is granted by network location alone.

9 Hub-and-spoke and Virtual WAN topology

The hub-and-spoke topology places shared services — firewall, gateways, DNS — in a central hub virtual network, with workload spokes connected by VNet peering. Spokes do not peer with each other; traffic that must cross between them is routed through the hub (often via Azure Firewall) using user-defined routes, giving central inspection and policy.
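
How a user-defined route steers spoke-to-spoke traffic through the hub can be sketched with a longest-prefix-match lookup. The address ranges, the firewall IP `10.0.1.4` and the `next_hop` function are assumptions for illustration; the selection rule modelled is longest prefix first, with user-defined routes preferred over BGP and system routes when prefixes tie.

```python
import ipaddress

# Priority used only to break ties between routes with the same prefix length.
SOURCE_PRIORITY = {"user": 0, "bgp": 1, "system": 2}

# Hypothetical effective routes on a subnet in spoke A (10.1.0.0/16).
ROUTES = [
    ("10.1.0.0/16", "VnetLocal", "system"),               # spoke A itself
    ("10.0.0.0/16", "VnetPeering", "system"),             # peered hub VNet
    ("10.0.0.0/8", "VirtualAppliance 10.0.1.4", "user"),  # UDR to the hub firewall
    ("0.0.0.0/0", "Internet", "system"),                  # default system route
]


def next_hop(destination: str) -> str:
    """Pick the next hop for a destination IP from the route list above."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), hop, source)
        for prefix, hop, source in ROUTES
        if dest in ipaddress.ip_network(prefix)
    ]
    # Longest prefix wins; the route source only breaks ties.
    _, hop, _ = max(
        candidates, key=lambda r: (r[0].prefixlen, -SOURCE_PRIORITY[r[2]])
    )
    return hop


print(next_hop("10.2.0.5"))  # an address in spoke B (10.2.0.0/16)
print(next_hop("10.1.3.4"))  # an address inside spoke A
```

Spoke A has no peering route to spoke B, so the only match more specific than the default route is the UDR, and the packet is sent to the firewall in the hub for inspection.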

Azure Virtual WAN is the managed, hub-based alternative for large or global estates. Microsoft manages the hubs and the any-to-any routing between VNets, branches (VPN/ExpressRoute) and users, scaling connectivity without you hand-building peering and route tables. Choose traditional hub-and-spoke for full control in one or few regions; choose Virtual WAN when you need global transit and managed scale.

10 Private DNS at scale

Resolving private endpoints requires that names like myacct.blob.core.windows.net resolve to a private IP. This is done with Azure Private DNS zones (e.g. privatelink.blob.core.windows.net) linked to the virtual networks that must resolve them. At enterprise scale you centralise these zones in the connectivity (hub) subscription and link spokes to them.
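
The relationship between the public name and the private zone can be sketched as a simple lookup. This assumes the blob-style pattern from the example above, where the zone is the service suffix prefixed with `privatelink.`; some services use different zone names, so treat the `privatelink_name` and `resolve` helpers and the IP address as hypothetical illustrations rather than a general rule.

```python
from typing import Dict, Optional, Tuple


def privatelink_name(fqdn: str) -> Tuple[str, str]:
    """Split a blob-style FQDN into (record name, private DNS zone)."""
    record, _, suffix = fqdn.partition(".")
    return record, f"privatelink.{suffix}"


def resolve(fqdn: str, linked_zones: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return the private IP if a zone linked to this VNet holds the record."""
    record, zone = privatelink_name(fqdn)
    return linked_zones.get(zone, {}).get(record)


# Zones visible from one spoke VNet, keyed by zone name then record name.
zones_linked_to_spoke = {
    "privatelink.blob.core.windows.net": {"myacct": "10.1.4.5"},
}

print(privatelink_name("myacct.blob.core.windows.net"))
print(resolve("myacct.blob.core.windows.net", zones_linked_to_spoke))
```

If the zone is not linked to the VNet, or the record is missing, the lookup returns nothing here; in a real estate the client would fall back to the public address, which is why consistent zone linking and record creation matter.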

To avoid linking every zone to every VNet, modern designs use Azure DNS Private Resolver with conditional forwarding and a central DNS forwarding rule set, plus Azure Policy with DeployIfNotExists to automatically create the correct privatelink DNS records whenever a private endpoint is provisioned. This keeps resolution consistent and self-healing across hundreds of spokes.

11 Disaster recovery and business continuity

A continuity strategy is driven by two targets: the Recovery Time Objective (RTO) — how quickly you must be back online — and the Recovery Point Objective (RPO) — how much data loss is acceptable. These business requirements dictate the architecture and cost.

Azure tools include Azure Site Recovery for orchestrated failover of VMs to a secondary region, Azure Backup for point-in-time restore, geo-redundant storage (GRS/GZRS) and database geo-replication. Architects map each workload to a DR pattern — backup-and-restore, pilot light, warm standby, or active-active — and crucially test failover regularly, because an untested DR plan is an assumption, not a capability.
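
Mapping workloads to a pattern is often captured as a simple rule table. The thresholds below are purely illustrative assumptions, not Microsoft guidance, and the `choose_dr_pattern` function is hypothetical; a real mapping would also weigh cost and data criticality.

```python
def choose_dr_pattern(rto_minutes: float, rpo_minutes: float) -> str:
    """Suggest a DR pattern from the tighter of the two targets.

    Thresholds are assumed for illustration only.
    """
    tightest = min(rto_minutes, rpo_minutes)
    if tightest <= 1:
        return "active-active"
    if tightest <= 60:
        return "warm standby"
    if tightest <= 240:
        return "pilot light"
    return "backup-and-restore"


# Hypothetical workloads with (RTO, RPO) targets in minutes.
workloads = {
    "payments-api": (1, 0),
    "order-portal": (30, 15),
    "reporting": (240, 120),
    "archive": (1440, 1440),
}

for name, (rto, rpo) in workloads.items():
    print(f"{name}: {choose_dr_pattern(rto, rpo)}")
```

Keeping the mapping explicit makes it easy to review with the business and to see which workloads drive most of the DR cost.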

12 FinOps and cost governance

FinOps brings financial accountability to cloud spending through the cycle of Inform, Optimise, Operate. In Azure, Microsoft Cost Management provides cost analysis, exports and budgets with alerts. Consistent tagging (cost centre, owner, environment) and policy enforcement of those tags enable showback and chargeback so teams see and own their spend.

Commitment-based discounts are a core lever: Reservations and savings plans trade a 1- or 3-year commitment for lower rates on steady-state compute, while the Azure Hybrid Benefit reuses on-prem licences. Architects build a reservation strategy on baseline usage, keep burst capacity on pay-as-you-go or spot, and continuously act on Azure Advisor right-sizing recommendations.
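
Whether a reservation pays off depends on how many hours the resource actually runs. The arithmetic can be sketched as follows; the hourly rates are made-up numbers, not Azure prices, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760


def break_even_utilisation(payg_hourly: float, reserved_hourly: float) -> float:
    """Fraction of hours a VM must run before the reservation is cheaper.

    A reservation is billed for every hour whether used or not, while
    pay-as-you-go is billed only for hours actually run.
    """
    return reserved_hourly / payg_hourly


def annual_cost(hourly_rate: float, utilisation: float, reserved: bool) -> float:
    """Yearly cost; utilisation is ignored for a reservation."""
    hours = HOURS_PER_YEAR if reserved else HOURS_PER_YEAR * utilisation
    return hourly_rate * hours


payg, reserved = 0.16, 0.10  # hypothetical rates per hour
threshold = break_even_utilisation(payg, reserved)
print(f"Reserve if the VM runs more than {threshold:.1%} of the time")

for utilisation in (0.4, 0.9):
    on_demand = annual_cost(payg, utilisation, reserved=False)
    committed = annual_cost(reserved, utilisation, reserved=True)
    better = "reservation" if committed < on_demand else "pay-as-you-go"
    print(f"utilisation {utilisation:.0%}: {better} is cheaper")
```

This is why reservations are sized to the steady baseline: capacity that runs only part of the time falls below the break-even point and is better left on pay-as-you-go or spot.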

13 Migration strategies and Azure Migrate

Migration planning starts with the digital estate and the five (or six) Rs: Rehost (lift-and-shift), Replatform (lift-and-optimise, e.g. VM to managed PaaS database), Refactor/Re-architect (modernise to PaaS/containers/serverless), Rebuild, and Replace (SaaS). Retire and Retain round out the decisions for what not to move.

Azure Migrate is the central hub: it discovers and assesses on-premises servers, databases and web apps (right-sizing, cost and Azure readiness), then drives migration of VMs, databases and apps. Rehost is fastest to value; refactoring yields the most cloud benefit but costs more effort — architects sequence waves accordingly.

14 Large-scale data platform

An enterprise analytics platform separates storage from compute. Azure Data Lake Storage Gen2 (hierarchical namespace on Blob) is the scalable lake; Azure Data Factory or Synapse pipelines orchestrate ingestion and ELT; Azure Databricks/Synapse Spark and the medallion architecture (bronze, silver, gold) progressively refine data; and Synapse/Fabric or dedicated SQL pools serve curated models to BI.

Modern designs favour the lakehouse (open Delta/Parquet tables on the lake) and increasingly a data mesh, where domain teams own their data products on a shared self-serve platform. Governance is provided by Microsoft Purview for cataloguing, lineage and classification across the estate.

15 Multi-region active-active

For the highest availability, workloads run active-active across two or more regions, serving live traffic from each. Global routing is handled by Azure Front Door (layer 7, with WAF and caching) or Traffic Manager (DNS-based), distributing users to the nearest healthy region and failing over automatically.

The hard part is state. Options include Cosmos DB with multi-region writes and tunable consistency, SQL Database active geo-replication or auto-failover groups, and geo-replicated storage. Architects must weigh consistency versus latency, design for idempotency and conflict resolution, and decide which data is truly global versus region-local. Active-active minimises RTO and RPO (approaching zero for both) but multiplies cost and complexity.
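
Idempotency and conflict resolution can be illustrated with a toy replicated store. This is a minimal sketch of last-writer-wins with an idempotency key, not how any particular Azure service is implemented; the `Write` and `RegionalStore` classes and the region names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Write:
    key: str
    value: str
    timestamp: int   # time assigned by the region that accepted the write
    region: str
    request_id: str  # idempotency key supplied by the client


class RegionalStore:
    def __init__(self) -> None:
        self.data: Dict[str, Write] = {}
        self.seen: Set[str] = set()

    def apply(self, w: Write) -> bool:
        """Apply a local or replicated write; return True if state changed."""
        if w.request_id in self.seen:  # retried or re-delivered request
            return False
        self.seen.add(w.request_id)
        current = self.data.get(w.key)
        # Last writer wins; the region name breaks timestamp ties the same
        # way everywhere, so all replicas pick the same winner.
        if current is None or (w.timestamp, w.region) > (current.timestamp, current.region):
            self.data[w.key] = w
            return True
        return False


a = Write("cart:42", "2 items", timestamp=100, region="westeurope", request_id="r1")
b = Write("cart:42", "3 items", timestamp=101, region="northeurope", request_id="r2")

west, north = RegionalStore(), RegionalStore()
for w in (a, b, a):   # west sees a, then b, then a retry of a
    west.apply(w)
for w in (b, a):      # north receives the same writes in the opposite order
    north.apply(w)

print(west.data["cart:42"].value, north.data["cart:42"].value)
```

Both regions converge on the same value regardless of delivery order, and the retried request has no effect. Last-writer-wins silently discards the losing write, so data that cannot tolerate that needs a merge strategy or a single write region.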

16 Secrets and key management at scale

Azure Key Vault centralises secrets, keys and certificates. Workloads should read secrets at runtime using a managed identity rather than embedding credentials, and certificates should auto-rotate. For strict isolation and higher throughput, Key Vault Managed HSM provides FIPS 140-2 Level 3 hardware-backed keys, and customer-managed keys (CMK) let you control encryption keys for storage, disks and databases.

At scale, design vaults per environment and trust boundary (not one giant vault), apply RBAC and network restrictions (private endpoints), enable soft-delete and purge protection, and monitor access. Bring secrets into Kubernetes with the Secrets Store CSI driver and into pipelines via workload identity federation rather than long-lived service principal secrets.

17 SRE and observability strategy

Site Reliability Engineering (SRE) makes reliability a measurable, engineered property. Teams define Service Level Indicators (SLIs), set Service Level Objectives (SLOs), and spend the resulting error budget deliberately: when the budget is exhausted, work shifts from features to reliability.
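
The error budget follows directly from the SLO. As a minimal sketch, assuming an availability SLO measured in minutes over a 30-day window (the function names and the downtime figures are hypothetical):

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO."""
    return (1 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Budget left after the downtime observed so far in the window."""
    return error_budget_minutes(slo, window_days) - downtime_minutes


slo = 0.999
print(f"Budget: {error_budget_minutes(slo):.1f} minutes per 30 days")

for downtime in (10, 50):
    left = budget_remaining(slo, downtime)
    action = "keep shipping features" if left > 0 else "prioritise reliability work"
    print(f"{downtime} min of downtime: {left:.1f} min left, {action}")
```

A 99.9% SLO over 30 days allows roughly 43 minutes of downtime, so a single 50-minute incident exhausts the budget for the whole window.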

Observability in Azure is built on the Azure Monitor stack: Log Analytics (KQL queries over logs), Application Insights (distributed tracing and APM), metrics, and alerts/action groups. Centralise telemetry into a shared workspace, instrument with OpenTelemetry, and combine the three pillars — metrics, logs and traces — so on-call engineers can move from a symptom to a root cause quickly.

18 Well-Architected reviews

The Azure Well-Architected Framework (WAF) evaluates a workload across five pillars: Reliability, Security, Cost Optimisation, Operational Excellence and Performance Efficiency. A WAF review walks a workload against the pillars (often via the Well-Architected Review assessment and Azure Advisor) to surface risks and trade-offs.

The pillars are intentionally in tension — more reliability or security can raise cost; aggressive cost cuts can hurt performance. The architect’s job is to make these trade-offs explicit and intentional against business priorities, not to maximise every pillar. Reviews are iterative: re-run them as the workload and requirements evolve.

19 Quota and capacity management

Azure enforces subscription and region quotas (limits) on resources such as vCPU cores per VM family, public IPs and network resources. Many are soft limits raised via a support request; some are hard. At scale you must track usage against quota proactively — using Azure Quota APIs and alerts — so deployments and failovers do not fail for lack of capacity.

For guaranteed capacity in a region, use On-Demand Capacity Reservations (which reserve compute independently of price discounts) and availability zones for resilience. Capacity planning for DR is critical: ensure the secondary region actually has quota and capacity to absorb a full failover, ideally validated by reservations, rather than assuming it will be available when you need it.
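
A failover readiness check reduces to comparing what the primary region runs with the free quota in the secondary. The sketch below works on plain dictionaries of vCPU counts per VM family; the numbers and the `failover_shortfall` function are hypothetical, and in practice the inputs would come from the quota and usage data you already collect.

```python
from typing import Dict


def failover_shortfall(
    primary_usage: Dict[str, int],
    secondary_usage: Dict[str, int],
    secondary_quota: Dict[str, int],
) -> Dict[str, int]:
    """Return the vCPUs missing per VM family if the primary fails over."""
    shortfall = {}
    for family, needed in primary_usage.items():
        free = secondary_quota.get(family, 0) - secondary_usage.get(family, 0)
        if needed > free:
            shortfall[family] = needed - free
    return shortfall


# Hypothetical vCPU figures per VM family.
primary_usage = {"Dsv5": 400, "Esv5": 128}
secondary_usage = {"Dsv5": 100, "Esv5": 16}
secondary_quota = {"Dsv5": 350, "Esv5": 200}

print(failover_shortfall(primary_usage, secondary_usage, secondary_quota))
# {'Dsv5': 150}
```

An empty result means quota would not block the failover. Quota is only permission to deploy, so guaranteed capacity still depends on the reservations described above.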

20 Building a platform team

Enterprise-scale Azure depends on a clear operating model. A central platform (or cloud) team owns the landing zones, identity, networking, governance and shared tooling, exposing them as a self-service platform to application teams. This is the essence of platform engineering: reduce cognitive load by providing paved paths (templates, pipelines, golden modules) rather than gatekeeping every change.

Effective models distribute responsibility: the platform team sets guardrails via policy; a Cloud Centre of Excellence (CCoE) defines standards and best practice; and application teams retain ownership of their workloads within those guardrails. RBAC and PIM enforce least privilege, while Infrastructure as Code and pipelines make the platform repeatable and auditable.

21 The AZ-305 exam domains

The AZ-305: Designing Microsoft Azure Infrastructure Solutions exam is the design exam for the Azure Solutions Architect Expert certification, which also requires the Azure Administrator Associate certification (exam AZ-104) as a prerequisite. It tests design decisions, not button-clicking.

Its skill domains are: Design identity, governance and monitoring solutions (Entra ID, RBAC, Azure Policy, management groups, Monitor); Design data storage solutions (relational and non-relational, integration, data protection); Design business continuity solutions (backup, recovery, high availability); and Design infrastructure solutions (compute, application architecture, networks, migration). Each question expects you to weigh requirements and constraints and choose the most appropriate, cost-effective design.

🎓 Certificate of Completion

🔒 Complete every lesson quiz above with 90%+ to unlock your downloadable certificate.