☸️

Kubernetes in Depth (Advanced)

Orchestrate containers at scale: the cluster architecture, core objects, networking, storage, scaling and self-healing.

20 lessons · 60 quiz questions

📚 Lessons & quizzes

Each lesson ends with its own short quiz. Answer them as you go — score 90% across all lessons to earn your certificate.

1 Why container orchestration?

A single container is easy to run by hand. Running hundreds of containers across many machines — restarting the ones that crash, replacing the ones whose host dies, rolling out new versions without downtime, load-balancing traffic, and scaling up under load — is not. That coordination problem is what a container orchestrator solves.

Kubernetes (often abbreviated K8s, the 8 standing for the eight letters between K and s) is the dominant orchestrator. Originally built at Google and inspired by its internal Borg system, it was open-sourced in 2014 and is now governed by the Cloud Native Computing Foundation (CNCF). You tell Kubernetes the desired state of your workloads and it continuously works to make reality match.

2 What Kubernetes does for you

Kubernetes is a platform for deploying, scaling and managing containerised applications. The core capabilities it provides include:

Scheduling — placing your containers onto machines that have spare capacity.
Self-healing — restarting failed containers and rescheduling them when a node dies.
Horizontal scaling — running more (or fewer) copies as demand changes.
Service discovery and load balancing — giving groups of containers a stable address and spreading traffic across them.
Automated rollouts and rollbacks — shipping new versions gradually and reverting if something breaks.
Configuration and secret management — injecting settings and credentials without rebuilding images.

Crucially, Kubernetes is not a traditional PaaS: it does not build your code, dictate a language, or bundle a database. It orchestrates the containers you give it.

3 Cluster architecture: the big picture

A Kubernetes cluster is a set of machines (physical or virtual) divided into two roles. The control plane is the brain: it makes global decisions and detects and responds to events. The worker nodes run your actual application containers.

You interact with the cluster almost entirely through the API server, usually with the kubectl command-line tool. You submit objects describing what you want; the control plane stores them and drives the cluster toward that state, while each node reports back on what it is actually running.

# See the nodes in your cluster
kubectl get nodes

# See the system Pods and which node each one runs on
# (control-plane components are listed here on self-managed clusters, e.g. kubeadm)
kubectl get pods -n kube-system -o wide

4 The control plane components

The control plane is made up of several cooperating components:

kube-apiserver — the front door. Every read and write to the cluster goes through this REST API; it validates requests and is the only component that talks to etcd.
etcd — a consistent, distributed key-value store that holds the entire cluster state. It is the single source of truth; losing it means losing the cluster’s memory.
kube-scheduler — watches for newly created Pods that have no node assigned and picks a suitable node for each, considering resource requests, constraints and affinity rules.
kube-controller-manager — runs the controllers: loops that watch the desired state and act to achieve it (e.g. the Node controller, the ReplicaSet controller).
cloud-controller-manager — integrates with a cloud provider’s APIs (load balancers, volumes, nodes).

5 Worker node components

Every worker node runs three things that turn it into part of the cluster:

kubelet — the node agent. It registers the node with the API server, watches for Pods assigned to it, and instructs the container runtime to start and stop containers so the node’s actual state matches what was assigned. It also reports Pod and node health back.
kube-proxy — maintains the network rules on the node that implement Services, forwarding traffic to the right Pods (using iptables or IPVS).
Container runtime — the software that actually runs containers, such as containerd or CRI-O. Kubernetes talks to it through the Container Runtime Interface (CRI). (Docker as a runtime was deprecated in 1.20 and removed in 1.24.)

6 The declarative model and reconciliation

Kubernetes is fundamentally declarative. Rather than issuing step-by-step commands (“start a container here, then there”), you describe the desired state in a manifest — for example, “I want 3 replicas of this app running.” Kubernetes stores that and then continuously compares it with the current state.

This loop is called reconciliation, and it is performed by controllers. If a controller sees a gap (say, only 2 of the 3 replicas are running), it acts to close it by creating another. This is why Kubernetes self-heals: the control loop never stops checking. The standard workflow is kubectl apply with a YAML file kept in version control.

kubectl apply -f deployment.yaml
kubectl diff  -f deployment.yaml   # preview what would change

7 Pods: the smallest deployable unit

You do not deploy containers directly in Kubernetes — you deploy Pods. A Pod is the smallest deployable unit in Kubernetes and represents one instance of a running process. A Pod wraps one or more tightly coupled containers that share the same network namespace (so they share an IP address and can reach each other over localhost) and can share storage volumes.

Most Pods hold a single container; multiple containers in one Pod (the sidecar pattern) are used when helpers must live and die together with the main app. Pods are ephemeral and disposable: they are not healed in place but replaced, getting a new IP each time, which is why you rarely create bare Pods directly.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx:1.27
      ports:
        - containerPort: 80
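To illustrate the shared network and storage described above, here is a minimal sketch of a two-container Pod. The names web-with-sidecar, shared-logs and log-tailer, and the busybox image, are illustrative assumptions and not part of the lesson.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar        # hypothetical name
spec:
  volumes:
    - name: shared-logs         # emptyDir lives as long as the Pod does
      emptyDir: {}
  containers:
    - name: nginx               # main application container
      image: nginx:1.27
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
    - name: log-tailer          # sidecar: reads what the main container writes
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /var/log/nginx/access.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
```

Both containers mount the same volume, and because they share one network namespace the sidecar could also reach nginx at localhost:80.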

8 Labels and selectors

Labels are key/value pairs attached to objects (Pods, Services, and more) for identification, such as app: web or tier: frontend. They carry no meaning to Kubernetes itself but let you organise and group objects however you like.

A selector matches objects by their labels. This is the glue of Kubernetes: a Service finds its Pods, and a Deployment manages its Pods, by label selection rather than by hard-coded names or IPs. Because Pods are ephemeral and constantly replaced, matching by label is far more robust than matching by identity.

# Pods carry labels
metadata:
  labels:
    app: web
    tier: frontend

# A selector matches them
selector:
  matchLabels:
    app: web
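Selectors can also be set-based. As a hedged sketch, the fragment below combines matchLabels with matchExpressions; the tier values are assumed for illustration.

```yaml
# Set-based selector, as used by Deployments, ReplicaSets and Jobs.
# All conditions are ANDed: a Pod must satisfy every one of them.
selector:
  matchLabels:
    app: web
  matchExpressions:
    - key: tier
      operator: In              # other operators: NotIn, Exists, DoesNotExist
      values:
        - frontend
        - backend
```

A Service's spec.selector is a plain key/value map, so it supports only the equality-style matching shown in the first example.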

9 ReplicaSets

A ReplicaSet ensures that a specified number of identical Pod replicas are running at any time. Its controller continuously counts the Pods matching its selector and creates or deletes Pods to maintain the desired count — this is the mechanism behind self-healing and basic horizontal scaling.

A ReplicaSet has three key fields: replicas (how many you want), a selector (which Pods it owns), and a template (the Pod spec to stamp out). In practice you rarely create ReplicaSets directly; you create a Deployment, which manages ReplicaSets for you. Understanding the ReplicaSet still matters, because it is the layer that actually keeps the Pod count correct.
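The three fields can be seen in a minimal ReplicaSet manifest. This is a sketch for illustration only; the name web, the label app: web and the nginx image are assumptions.

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web                     # hypothetical name
spec:
  replicas: 3                   # how many Pods you want
  selector:                     # which Pods this ReplicaSet owns
    matchLabels:
      app: web
  template:                     # the Pod spec to stamp out
    metadata:
      labels:
        app: web                # must satisfy the selector above
    spec:
      containers:
        - name: web
          image: nginx:1.27
```

If the template's labels do not match the selector, the API server rejects the object.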

10 Deployments, rolling updates and rollback

A Deployment is the object you actually use to run stateless applications. It manages ReplicaSets and provides declarative updates: change the image tag, apply, and Kubernetes performs a rolling update — it gradually spins up Pods with the new version while scaling down the old, so the app stays available throughout (governed by maxSurge and maxUnavailable).

Because the Deployment keeps a history of revisions, you can roll back to a previous version with a single command if a release misbehaves.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp:2.0
# kubectl rollout status deployment/web
# kubectl rollout undo   deployment/web
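The pace of a rolling update is set in the Deployment's strategy block. The fragment below is a sketch with assumed values, chosen so that capacity never drops during a rollout; it would sit under spec in a Deployment such as the one above.

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1               # at most 1 extra Pod above replicas (so up to 4)
      maxUnavailable: 0         # never fewer than 3 available Pods
```

If neither field is set, both default to 25%. Absolute numbers and percentages are both accepted.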

11 Services: stable addressing and discovery

Pods come and go and change IP addresses, so you cannot rely on a Pod’s IP. A Service gives a stable virtual IP and DNS name to a logical set of Pods (chosen by a label selector) and load-balances traffic across them. The main types are:

ClusterIP (default) — reachable only inside the cluster; ideal for internal service-to-service traffic.
NodePort — opens the same fixed port on every node so the Service is reachable from outside via nodeIP:nodePort.
LoadBalancer — provisions an external cloud load balancer that fronts the Service (builds on NodePort).

The Service uses its selector to track the IPs of the matching Pods that are ready to receive traffic (its endpoints) automatically.

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
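For comparison, a NodePort variant of the same Service might look like the sketch below. The name web-nodeport and the port 30080 are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport            # hypothetical name
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80                  # port on the Service's cluster IP
      targetPort: 8080          # port the container listens on
      nodePort: 30080           # port opened on every node (default range 30000-32767)
```

If nodePort is omitted, Kubernetes picks a free port from the range for you.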

12 Ingress: HTTP routing into the cluster

Exposing every service with its own LoadBalancer is expensive and crude. Ingress is an API object that manages external HTTP and HTTPS access, providing host- and path-based routing, TLS termination, and a single entry point that fans out to many internal Services.

An Ingress resource is only a set of rules; it does nothing on its own. You must run an Ingress controller (such as ingress-nginx, Traefik, or a cloud controller) that watches Ingress objects and configures an actual proxy to enforce them.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: site
spec:
  rules:
    - host: shop.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
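TLS termination and controller selection are configured on the same object. The following is a sketch only: the class name nginx assumes an ingress-nginx controller is installed, and shop-tls is a hypothetical Secret of type kubernetes.io/tls holding the certificate and key.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: site
spec:
  ingressClassName: nginx       # assumed: which Ingress controller handles this object
  tls:
    - hosts:
        - shop.example.com
      secretName: shop-tls      # hypothetical TLS Secret in the same namespace
  rules:
    - host: shop.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
```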

13 ConfigMaps and Secrets

Configuration should live outside your container image so the same image can run in dev, staging and production. Kubernetes provides two objects for this:

ConfigMap — holds non-confidential configuration as key/value pairs (URLs, feature flags, tuning values). You inject it into a Pod as environment variables or as files mounted into the container.
Secret — holds sensitive data such as passwords, tokens and TLS keys. Secrets are similar to ConfigMaps but intended for confidential data; their values are base64-encoded (which is encoding, not encryption). For real protection you enable encryption at rest in etcd and tighten RBAC.

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info
  API_URL: https://api.example.com
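A Secret and a Pod that consumes both objects could look like the sketch below. The names app-secret and DB_PASSWORD, the placeholder value, and the Pod itself are assumptions for illustration; app-config refers to the ConfigMap above.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secret              # hypothetical name
type: Opaque
stringData:                     # plain text here; stored base64-encoded under data
  DB_PASSWORD: change-me        # placeholder, never commit real credentials
---
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: nginx:1.27
      envFrom:
        - configMapRef:         # every key in the ConfigMap becomes an env var
            name: app-config
      env:
        - name: DB_PASSWORD     # a single key pulled from the Secret
          valueFrom:
            secretKeyRef:
              name: app-secret
              key: DB_PASSWORD
```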

14 Namespaces and resource quotas

Namespaces partition a single cluster into multiple virtual clusters. They give a scope for object names (the same name can exist in different namespaces) and a boundary for access control and resource limits — useful for separating teams, environments or projects. Some objects (Pods, Services) are namespaced; others (Nodes, PersistentVolumes) are cluster-wide.

Within a namespace, a ResourceQuota caps the total resources (CPU, memory, object counts) that the namespace may consume, while a LimitRange sets default and maximum requests/limits per Pod or container. Together they stop one team from starving the cluster.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    pods: "20"
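A LimitRange for the same namespace might look like the following sketch. The name team-a-limits and all the values are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-limits           # hypothetical name
  namespace: team-a
spec:
  limits:
    - type: Container
      defaultRequest:           # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
      default:                  # applied when a container sets no limits
        cpu: 500m
        memory: 256Mi
      max:                      # no container may set limits above this
        cpu: "1"
        memory: 1Gi
```

Defaults matter alongside a quota: once a ResourceQuota tracks requests.cpu or requests.memory, Pods that omit those requests are rejected unless a LimitRange fills them in.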

15 Storage: Volumes, PV, PVC and StorageClass

Files written to a container’s own filesystem are lost when the container restarts. Kubernetes Volumes give Pods storage whose lifetime can outlast a container restart; some types (like emptyDir) still die with the Pod, while others persist.

For durable storage Kubernetes separates supply from demand:

PersistentVolume (PV) — a piece of real storage in the cluster (a cloud disk, NFS share, etc.), provisioned by an admin or dynamically.
PersistentVolumeClaim (PVC) — a user’s request for storage of a given size and access mode. Kubernetes binds the claim to a matching PV, and the Pod mounts the PVC.
StorageClass — describes a “class” of storage and enables dynamic provisioning: when a PVC asks for that class, a PV is created on demand.
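The claim-and-mount flow can be sketched as follows. The class name standard is an assumption (class names vary by cluster; list them with kubectl get storageclass), as are the names data and web and the mount path.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data                    # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce             # mounted read-write by a single node
  storageClassName: standard    # assumed class; triggers dynamic provisioning
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: nginx:1.27
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data         # the Pod refers to the claim, not to the PV
```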

16 Health checks: liveness, readiness and startup probes

Kubernetes uses probes to monitor container health, each with a distinct job:

Liveness probe — answers “is this container still working?” If it fails, the kubelet restarts the container. It rescues apps stuck in a deadlock.
Readiness probe — answers “is this container ready to serve requests?” If it fails, the Pod is removed from Service endpoints so no traffic is sent to it — but the container is not restarted. It gates traffic during warm-up or temporary overload.
Startup probe — protects slow-starting apps: until it succeeds, liveness and readiness checks are held off, so a slow boot is not mistaken for a failure.

readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
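The other two probes use the same syntax. The fragment below is a sketch with assumed thresholds, reusing the /healthz endpoint from the example above; it belongs inside a container definition.

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 30          # up to 30 x 10s = 300s allowed for startup
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 3           # restart after 3 consecutive failures
```

With a startup probe in place, the liveness probe can stay strict without killing a container that is merely slow to boot.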

17 Resource requests and limits

Each container can declare how much CPU and memory it needs. These two values mean different things:

requests — the amount guaranteed to the container. The scheduler uses requests to decide which node has room; a Pod is only placed where its requests fit.
limits — the maximum a container may use. A container exceeding its memory limit is killed (OOMKilled); exceeding its CPU limit is throttled rather than killed.

CPU is measured in cores or millicores (500m = half a core); memory in bytes with suffixes like Mi and Gi. The relationship between requests and limits also determines a Pod’s Quality of Service class, which influences what gets evicted first under pressure.

resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: "1"
    memory: 512Mi
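To make the Quality of Service classes concrete: the example above, where requests are lower than limits, yields the Burstable class. A sketch of a container that would qualify its Pod as Guaranteed, with assumed values, looks like this:

```yaml
# Guaranteed: every container in the Pod sets requests equal to limits
# for both CPU and memory.
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 500m
    memory: 512Mi
```

A Pod whose containers set no requests or limits at all is BestEffort and is the first candidate for eviction when a node runs short of resources.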

18 Horizontal Pod Autoscaling

The HorizontalPodAutoscaler (HPA) automatically changes the number of Pod replicas in a Deployment (or ReplicaSet/StatefulSet) based on observed metrics — classically average CPU utilisation, but also memory or custom and external metrics.

The HPA controller periodically compares the current metric against a target and adjusts replicas to push the metric toward that target, staying within a configured minReplicas and maxReplicas. This is horizontal scaling (more Pods), distinct from vertical scaling (bigger Pods). It needs a metrics source such as the metrics-server installed in the cluster. CPU utilisation is calculated as a percentage of each container's CPU request, so requests must be set for CPU-percentage targets to work.

kubectl autoscale deployment web \
  --cpu-percent=70 --min=2 --max=10
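The same autoscaler can be written declaratively and kept in version control. This manifest is a sketch that mirrors the command above, using the autoscaling/v2 API.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:               # the workload whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the Pods' CPU requests
```

The controller computes the desired count as ceil(currentReplicas × currentMetric / targetMetric). For example, 4 replicas averaging 90% against a 70% target gives ceil(4 × 90 / 70) = 6 replicas.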

19 Workload controllers: DaemonSet, StatefulSet, Job

Deployments suit stateless apps, but Kubernetes offers other controllers for other shapes of workload:

DaemonSet — ensures a copy of a Pod runs on every (or every matching) node. Perfect for node-level agents such as log collectors, monitoring agents and network plugins.
StatefulSet — for stateful apps that need stable, unique network identities and stable persistent storage per replica, with ordered, graceful deployment and scaling. Pods get predictable names like db-0, db-1. Used for databases and clustered systems.
Job — runs a Pod to completion (a batch task) and tracks success; a CronJob runs Jobs on a schedule, like a cluster-wide cron.
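Two of these controllers are sketched below. All names, images, commands and the schedule are placeholders assumed for illustration; a real agent or batch task would use its own image.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent              # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-agent
  template:                     # no replicas field: one Pod per matching node
    metadata:
      labels:
        app: node-agent
    spec:
      containers:
        - name: agent
          image: busybox:1.36
          command: ["sh", "-c", "while true; do date; sleep 60; done"]
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report          # hypothetical name
spec:
  schedule: "0 2 * * *"         # standard cron syntax: every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure   # Job Pods must use OnFailure or Never
          containers:
            - name: report
              image: busybox:1.36
              command: ["sh", "-c", "echo generating report"]
```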

20 Helm: the package manager at a glance

Real applications are made of many manifests — Deployments, Services, ConfigMaps, Ingress and more. Maintaining them by hand across environments is tedious. Helm is the de-facto package manager for Kubernetes.

A Helm package is a chart: a bundle of templated manifests plus a values.yaml of default settings. You install a chart to produce a release (a named, versioned instance running in the cluster), overriding values per environment. Helm supports upgrade and rollback of releases, and public charts let you deploy complex software with one command.

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-db bitnami/postgresql \
  --set auth.username=app
helm upgrade my-db bitnami/postgresql
helm rollback my-db 1
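Per-environment overrides are usually kept in a values file instead of long --set flags. The sketch below assumes a hypothetical file named values-prod.yaml and carries only the one setting used in the commands above.

```yaml
# values-prod.yaml (hypothetical file name)
# Keys mirror the chart's values.yaml; anything not listed keeps its default.
auth:
  username: app
```

It would be applied with helm upgrade my-db bitnami/postgresql -f values-prod.yaml, where -f (or --values) supplies the override file.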
