Configuration Management (Intermediate)

Keep servers consistent and repeatable with Ansible: inventories, playbooks, idempotent tasks, roles and templating.

18 lessons, 54 quiz questions

📚 Lessons & quizzes

Each lesson ends with its own short quiz. Answer them as you go — score 90% across all lessons to earn your certificate.

1 What configuration management is

Configuration management (CM) is the practice of defining, applying and maintaining the desired state of your systems — packages, services, files, users and settings — in a controlled, repeatable way. Instead of logging into each server and tweaking it by hand, you describe what the server should look like, and a tool brings every machine into line with that description.

The goal is consistency at scale. When you manage two servers, manual changes are merely tedious; when you manage two hundred, manual changes are unmanageable and error-prone. CM lets one definition govern many machines, so the tenth server is configured exactly like the first.

CM is a cornerstone of infrastructure as code: the configuration lives in text files, is stored in version control, is reviewed like software, and can be re-applied at any time to reproduce a known-good state.

2 Config drift and snowflake servers

Two problems motivate configuration management. The first is configuration drift: over time, servers that started identical diverge. A quick manual hotfix here, a forgotten package upgrade there, an emergency edit at 3 a.m. — and gradually no two machines are quite the same. Drift makes behaviour unpredictable: a bug appears on one server but not another, and nobody knows why.

The second is the snowflake server: a machine that has been tweaked by hand so often that its exact state is undocumented and irreproducible. It is unique and fragile, like a snowflake. If it dies, you cannot confidently rebuild it, because the knowledge of how it was configured lived only in past keystrokes that nobody recorded.

CM attacks both: by making the configuration the single source of truth and re-applying it regularly, drift is corrected and no server is allowed to become a special, hand-crafted snowflake.

3 Declarative desired state

Most configuration management tools are declarative: you describe the end state you want, not the step-by-step commands to get there. You say "package nginx should be present and the nginx service should be running and enabled at boot", and the tool figures out whether that is already true and, if not, what to do about it.

This contrasts with an imperative shell script, which lists explicit commands: apt-get install nginx, systemctl start nginx. An imperative script run twice may fail or do the wrong thing on the second run, because it assumes a starting point. A declarative definition simply describes the goal, and the tool reconciles reality toward it.

Thinking in terms of desired state is the mental shift at the heart of modern CM: you maintain a description of how things should be, and the tool continuously closes the gap between that description and reality.
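
As a minimal sketch of the difference, the play below describes one line of a file instead of the command that would write it. The target file and the text of the line are illustrative assumptions; ansible.builtin.lineinfile is a standard module.

```yaml
---
- name: Describe the end state of one line in a file
  hosts: all
  become: true
  tasks:
    # Imperative equivalent, shown only for contrast:
    #   echo "Managed by Ansible" >> /etc/motd
    # Run twice, that command appends the line twice.
    - name: Ensure the banner line is present exactly once
      ansible.builtin.lineinfile:
        # Assumed target file and content, chosen to be harmless
        path: /etc/motd
        line: Managed by Ansible
        create: true
        mode: '0644'
```

On the first run the task should report changed; on later runs it should report ok, because the line is already there.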

4 Idempotency: the central concept

Idempotency is the property that applying the same configuration multiple times produces the same result as applying it once. An idempotent task checks the current state first and makes a change only if the state does not already match the desired state. If nginx is already installed and running, an idempotent "ensure nginx is present and running" task does nothing and reports no change.

This is what makes CM safe to run again and again. You can apply the same playbook every hour; on the runs where everything already matches, nothing changes and nothing breaks. Idempotency is precisely what a naive imperative shell script lacks: blindly re-running useradd alice fails the second time because the user already exists, whereas an idempotent "user alice should exist" task simply confirms it and moves on.

Idempotency is the foundation of both drift correction and confidence: re-applying configuration becomes a routine, low-risk operation rather than a dangerous one.
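
The useradd contrast can be sketched as a short play. The commented-out task is the non-idempotent form; the marker file path in the last task is an assumption picked only so the example is safe to run.

```yaml
---
- name: Idempotent versus non-idempotent tasks
  hosts: all
  become: true
  tasks:
    # Non-idempotent form, left commented out: it fails on the second
    # run because the account already exists.
    # - name: Add alice with a raw command
    #   ansible.builtin.command: useradd alice

    - name: Ensure user alice exists
      ansible.builtin.user:
        name: alice
        state: present

    # A raw command can be guarded with "creates": the task is skipped
    # once the named file exists. The marker path is an assumption.
    - name: Run a command only if its marker file is missing
      ansible.builtin.command:
        cmd: touch /var/tmp/first-run.marker
        creates: /var/tmp/first-run.marker
```

Running this play twice should show changed on the first pass and no changes on the second, which is the behaviour the lesson describes.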

5 Push vs pull: agentless and agent-based

Configuration management tools deliver changes in one of two architectures. In the pull (agent-based) model, every managed node runs a small agent daemon that periodically contacts a central server, fetches its desired configuration, and applies it locally. Puppet and Chef classically work this way: the agent pulls policy and reconciles the node on a schedule.

In the push (agentless) model, a control node connects out to the managed machines — typically over SSH — and applies the configuration when you tell it to. There is no permanent agent to install or maintain on the targets. Ansible is the best-known push, agentless tool.

Each has trade-offs. Pull scales naturally and self-heals on a timer, but requires installing and managing agents and a central server. Push needs nothing pre-installed on targets beyond SSH and a Python interpreter (for Ansible), making it simpler to start with, though large fleets may need orchestration to push at scale.

6 Ansible overview: agentless over SSH

Ansible is an open-source configuration management and automation tool. Its defining trait is that it is agentless: it manages remote machines by connecting to them over SSH (and WinRM on Windows), with no long-running agent installed on the targets. It runs from a control node and pushes configuration out to the managed inventory.

On each target Ansible needs only SSH access and, for most modules, a Python interpreter — both of which Linux servers usually already have. To do work, Ansible copies small programs called modules to the target, executes them, collects the results in JSON, and removes them.

Configuration is written in YAML, which is human-readable and version-control friendly. Ansible aims to be simple to learn: declarative, idempotent modules, plus a push model that needs almost no setup on the managed hosts. This low barrier to entry is a large part of why Ansible became so widely adopted.
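
To make the "collects the results" step visible, a play can store a module's return value and print it. This is a sketch only: the variable name uptime_result is an assumption, while register and the debug module are standard Ansible features.

```yaml
---
- name: Inspect what a module returns
  hosts: webservers
  gather_facts: false
  tasks:
    - name: Run uptime through the command module
      ansible.builtin.command: uptime
      register: uptime_result
      # uptime only reads state, so never report "changed"
      changed_when: false

    - name: Show the structured result sent back by the module
      ansible.builtin.debug:
        var: uptime_result
```

The printed structure includes fields such as rc and stdout, which later tasks can read.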

7 The inventory: hosts and groups

The inventory is the list of machines Ansible manages and how they are organised. At its simplest it is a text file listing host names or IP addresses. Hosts are arranged into groups so you can target many machines at once — for example a webservers group and a databases group.

Groups let you say "run this on all webservers" rather than naming each host. A host can belong to several groups, and groups can themselves contain child groups, giving a flexible hierarchy. Ansible also provides two implicit groups: all, which contains every host, and ungrouped, which contains hosts that belong to no group other than all.

Inventories can be static (a checked-in file, often in INI or YAML format) or dynamic (generated on the fly from a cloud provider or CMDB, so the list stays current as machines come and go). The example below shows a small static inventory in INI format.

# inventory.ini — a small static inventory

[webservers]
web1.example.com
web2.example.com

[databases]
db1.example.com ansible_host=10.0.0.21

# a group made of other groups
[production:children]
webservers
databases
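
For comparison, here is a sketch of the same inventory in YAML format. The file name inventory.yml is an assumption; the hosts and groups mirror the INI example.

```yaml
# inventory.yml: YAML equivalent of the INI inventory
all:
  children:
    production:
      children:
        webservers:
          hosts:
            web1.example.com:
            web2.example.com:
        databases:
          hosts:
            db1.example.com:
              ansible_host: 10.0.0.21
```

Nesting webservers and databases under production expresses the same parent and child relationship as the [production:children] section.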

8 Ad-hoc commands

An ad-hoc command is a one-off Ansible task run directly from the command line, without writing a playbook. It is the quickest way to do something across many hosts right now: check uptime, restart a service, copy a file, or confirm connectivity. You invoke the ansible command, name a host pattern, and choose a module with -m and arguments with -a.

For example, ansible webservers -m ping uses the ping module to confirm Ansible can reach and authenticate to every host in the webservers group. (Note this is not an ICMP ping; it verifies SSH login and a working Python.) Likewise ansible all -a "uptime" runs the uptime command everywhere; when -m is omitted, Ansible falls back to its default command module.

Ad-hoc commands are perfect for quick, throwaway actions and exploration. For anything you want to repeat, document or version, you graduate to a playbook, which captures the same operations in a reusable, idempotent form.

# Confirm Ansible can reach every web server (SSH + Python, not ICMP)
ansible webservers -m ping

# Run an arbitrary command on all hosts
ansible all -a "uptime"

# Ensure a package is present on the databases group
ansible databases -m ansible.builtin.package -a "name=htop state=present" --become
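
When an ad-hoc command is worth keeping, it translates almost directly into a playbook. The sketch below is a playbook version of the last command; the file name htop.yml used afterwards is an assumption.

```yaml
---
# Playbook equivalent of:
#   ansible databases -m ansible.builtin.package -a "name=htop state=present" --become
- name: Ensure htop is present on database hosts
  hosts: databases
  become: true
  tasks:
    - name: Ensure htop is installed
      ansible.builtin.package:
        name: htop
        state: present
```

Saved as htop.yml, it could be run with ansible-playbook -i inventory.ini htop.yml and then committed to version control.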

9 Playbooks: plays, tasks and modules

A playbook is a YAML file describing the desired configuration as one or more plays. A play maps a set of hosts (a group or pattern from the inventory) to an ordered list of tasks. Each task calls a module — a reusable unit of work such as installing a package or starting a service — with parameters, and tasks run top to bottom on each targeted host.

Every well-written task is idempotent: it reports ok when the state already matches and changed when it had to act. Tasks usually carry a human-readable name so the output is easy to follow. Privilege escalation is requested with become: true (run as root via sudo).

The play below ensures nginx is installed and that the service is running and enabled at boot. Read it as a description of desired state, not a script of commands.

---
- name: Configure web servers
  hosts: webservers
  become: true
  tasks:
    - name: Ensure nginx is installed
      ansible.builtin.package:
        name: nginx
        state: present

    - name: Ensure nginx is running and enabled at boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

10 Common modules: package, service, copy, file, user

Modules are the building blocks of tasks; a handful cover most everyday work:

  • package — install, upgrade or remove software (a generic wrapper over apt, yum/dnf, etc.); the OS-specific modules are apt and yum/dnf.
  • service (or systemd) — start, stop, restart and enable services.
  • copy — push a file from the control node to the target, optionally setting owner, group and mode.
  • file — manage filesystem objects: create directories, set permissions and ownership, make symlinks, or ensure a path is absent.
  • user — create or remove user accounts and manage their groups, shell and home directory.

Each is declarative and idempotent: you state the desired state (for example present, started, directory, absent) and the module makes the system match it, changing nothing if it already does.

- name: Create the app group
  ansible.builtin.group:
    name: app
    state: present

- name: Create the app user
  ansible.builtin.user:
    name: app
    group: app
    shell: /bin/bash

- name: Ensure the app directory exists
  ansible.builtin.file:
    path: /opt/app
    state: directory
    owner: app
    mode: '0750'

- name: Copy the config file into place
  ansible.builtin.copy:
    src: files/app.conf
    dest: /opt/app/app.conf
    owner: app
    mode: '0640'

11 Handlers and notify

Sometimes an action should run only when something actually changed. The classic case: restart a service only if its configuration file was modified. Ansible models this with handlers. A handler is a task defined under handlers; it does nothing unless a regular task notifies it.

A task uses notify: naming a handler. If that task reports changed, the handler is flagged to run; if the task reports ok (nothing changed), the handler is not triggered. This keeps changes idempotent: editing the config restarts the service, but a no-op run leaves the service untouched.

Two details matter. First, handlers do not run immediately: by default they run once, at the end of the play, even if several tasks notified them, so a service restarts a single time no matter how many config files changed. Second, handlers are matched by name, so the string given to notify must match the handler’s name. The example shows a config-deployment task notifying a restart handler.

---
- name: Manage nginx config
  hosts: webservers
  become: true
  tasks:
    - name: Deploy nginx config
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
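
The "runs once" behaviour is easier to see with two notifying tasks. In this sketch the second template, site.conf.j2, and its destination path are hypothetical.

```yaml
---
- name: Two config files, one restart
  hosts: webservers
  become: true
  tasks:
    - name: Deploy main nginx config
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx

    - name: Deploy site config
      ansible.builtin.template:
        # Hypothetical second template and destination
        src: site.conf.j2
        dest: /etc/nginx/conf.d/site.conf
      notify: Restart nginx

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```

If both files change in the same run, nginx should still restart only once, after the tasks have finished. If neither changes, the handler does not run at all.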

12 Variables, facts and gathering

Variables let you parameterise configuration so the same playbook adapts to different hosts and environments. You might define app_port: 8080 and reference it as {{ app_port }}. Variables can be set in playbooks, in inventory, in dedicated group_vars/ and host_vars/ files, or passed on the command line, with a defined precedence order resolving conflicts.

Facts are variables Ansible discovers about the target itself at runtime — its OS family, IP addresses, memory, CPU count, hostname and much more. By default a play begins with an implicit gather_facts step (the setup module) that collects these, after which you can branch on them, for example installing the right package name per OS family.

Fact gathering can be disabled with gather_facts: false to speed up plays that do not need facts. Together, variables and facts let one definition serve a heterogeneous fleet without hard-coding host-specific values.

---
- name: Show how facts and variables combine
  hosts: all
  vars:
    greeting: "Hello from Ansible"
  tasks:
    - name: Print a gathered fact and a variable
      ansible.builtin.debug:
        msg: "{{ greeting }} on {{ ansible_facts['distribution'] }} {{ ansible_facts['distribution_version'] }}"
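
Branching on a fact can be sketched with a small lookup table. The variable name apache_package is an assumption; the package names are the usual ones for the Debian and Red Hat families.

```yaml
---
- name: Choose a package name from a gathered fact
  hosts: all
  become: true
  vars:
    # Assumed mapping from OS family to the Apache package name
    apache_package:
      Debian: apache2
      RedHat: httpd
  tasks:
    - name: Ensure Apache is installed under the right name
      ansible.builtin.package:
        name: "{{ apache_package[ansible_facts['os_family']] }}"
        state: present
      # Skip hosts whose OS family is not in the mapping
      when: ansible_facts['os_family'] in apache_package
```

One task covers both families because the fact selects the value at run time.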

13 Templates with Jinja2

Static files are not always enough — you often need a config file whose contents depend on the host. Ansible solves this with the template module, which renders a Jinja2 template on the control node and writes the result to the target. Templates end in .j2 by convention.

Jinja2 supports {{ variable }} substitution, {% if %} conditionals, {% for %} loops and filters. This lets one template generate the right file for each machine: a web server config that fills in the host’s own IP from facts, or that loops over a list of upstream servers. Where the copy module pushes a fixed file unchanged, the template module produces a customised file per host.

Like other well-behaved modules, template is idempotent: if the rendered output matches what is already on the target, the task reports ok and changes nothing; only a genuine difference triggers a write (and can notify a handler).

# templates/nginx.conf.j2 — a Jinja2 template
server {
    listen {{ app_port }};
    server_name {{ ansible_facts['hostname'] }};

    {% for backend in backends %}
    # upstream: {{ backend }}
    {% endfor %}
}
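
A play that renders this template might look like the sketch below. The values for app_port and backends, and the destination path, are assumptions chosen for illustration.

```yaml
---
- name: Render the nginx template
  hosts: webservers
  become: true
  vars:
    # Assumed values consumed by the template
    app_port: 8080
    backends:
      - 10.0.0.11
      - 10.0.0.12
  tasks:
    - name: Deploy the server block from the Jinja2 template
      ansible.builtin.template:
        src: nginx.conf.j2
        # Assumed destination for a standalone server block
        dest: /etc/nginx/conf.d/app.conf
        owner: root
        group: root
        mode: '0644'
```

Each host receives a file with its own hostname filled in from facts, plus one comment line per entry in backends.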

14 Roles and reuse

As playbooks grow, dumping every task into one file becomes unwieldy. Roles are Ansible’s way of packaging related tasks, files, templates, handlers and default variables into a reusable, self-contained unit — for example an nginx role or a postgres role. A playbook then simply lists which roles to apply to which hosts.

A role follows a standard directory layout so Ansible knows where to find things automatically: tasks/main.yml (the task list), handlers/main.yml (handlers), templates/ (Jinja2 templates), files/ (static files to copy), defaults/main.yml (default variables, lowest precedence), vars/main.yml (higher-precedence variables) and meta/main.yml (role metadata and dependencies).

Roles make configuration modular and shareable: you can reuse the same role across projects and download community roles from Ansible Galaxy, rather than reinventing the same setup every time.

# Standard role directory layout
roles/
  nginx/
    tasks/main.yml        # the role's tasks
    handlers/main.yml     # handlers (e.g. restart nginx)
    templates/nginx.conf.j2
    files/index.html
    defaults/main.yml     # default variables (lowest precedence)
    vars/main.yml         # role variables
    meta/main.yml         # metadata and dependencies

# site.yml — apply the role
- hosts: webservers
  become: true
  roles:
    - nginx
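
Defaults exist so that callers can override them. The sketch below assumes the role's defaults/main.yml defines a variable named nginx_port; that name is hypothetical and not part of any particular published role.

```yaml
---
# site.yml: apply the nginx role and override one of its defaults
- name: Web servers on a non-default port
  hosts: webservers
  become: true
  roles:
    - role: nginx
      vars:
        # Overrides the assumed default in roles/nginx/defaults/main.yml
        nginx_port: 8080
```

Because defaults have the lowest precedence, the value set here wins without editing the role.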

15 Ansible Vault for secrets

Configuration often involves secrets — database passwords, API keys, TLS private keys — which must never sit in plaintext in a version-controlled repository. Ansible Vault solves this by encrypting sensitive data with a password (or key), so the values can live safely alongside your playbooks.

You can encrypt whole files (such as a vars file of secrets) with ansible-vault encrypt, or encrypt individual values inline with ansible-vault encrypt_string. The content is stored as AES-256 ciphertext; at run time you supply the vault password with --ask-vault-pass or a vault password file, and Ansible decrypts the data in memory to use it.

The result is that secrets are protected at rest yet still usable by automation, and a stolen repository does not leak credentials as long as the vault password itself is stored elsewhere. The example shows encrypting a secrets file and running a playbook that needs the vault password.

# Encrypt a file of secrets in place
ansible-vault encrypt group_vars/all/vault.yml

# Edit an encrypted file later
ansible-vault edit group_vars/all/vault.yml

# Run a playbook that needs the vault password
ansible-playbook site.yml --ask-vault-pass
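
Once decrypted, a vaulted value is used like any other variable. This sketch assumes the encrypted group_vars/all/vault.yml defines vault_db_password; the variable name and the destination path are hypothetical.

```yaml
---
- name: Use a vaulted secret in a task
  hosts: databases
  become: true
  tasks:
    - name: Write the database password to a root-only file
      ansible.builtin.copy:
        # vault_db_password is an assumed variable from the
        # encrypted group_vars/all/vault.yml
        content: "{{ vault_db_password }}\n"
        dest: /etc/app_db_password
        owner: root
        group: root
        mode: '0600'
      # Keep the secret out of task output and logs
      no_log: true
```

Setting no_log: true is a common precaution, since task output would otherwise be able to show the decrypted value.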

16 Conditionals and loops

Real configurations need to make decisions and repeat work. Ansible provides conditionals via the when: keyword: a task runs only if its expression is true. You commonly branch on a fact. For instance, a task using the apt module can carry when: ansible_facts['os_family'] == "Debian", while a matching task using the dnf module (or yum on older releases) is restricted to Red Hat family hosts, so one playbook handles multiple distributions.

Loops let a single task iterate over a list using the loop: keyword, with each element exposed as {{ item }}. Installing ten packages or creating five users becomes one concise, idempotent task instead of ten copy-pasted ones.

Both features keep playbooks compact and adaptable. Combined with variables and facts, conditionals and loops let a small, readable definition cover a wide variety of hosts and situations without duplication.

- name: Install several packages in one task
  ansible.builtin.package:
    name: "{{ item }}"
    state: present
  loop:
    - git
    - curl
    - htop

- name: Install Apache only on Debian-family hosts
  ansible.builtin.apt:
    name: apache2
    state: present
  when: ansible_facts['os_family'] == "Debian"
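
Two further task fragments round out the picture: the Red Hat counterpart of the Apache task, and a loop over items that carry more than one value. The user names and shells are illustrative assumptions.

```yaml
- name: Install Apache only on Red Hat family hosts
  ansible.builtin.dnf:
    name: httpd
    state: present
  when: ansible_facts['os_family'] == "RedHat"

- name: Create several users in one task
  ansible.builtin.user:
    name: "{{ item.name }}"
    shell: "{{ item.shell }}"
    state: present
  loop:
    # Each item is a small mapping; fields are read as item.<key>
    - { name: alice, shell: /bin/bash }
    - { name: bob, shell: /bin/sh }
```

On any given host only one of the two Apache tasks runs, and the other is reported as skipped.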

17 Tags and limiting runs

You do not always want to run an entire playbook. Ansible offers two ways to narrow what executes. Tags label tasks or roles; you then run only the parts you care about with --tags, or skip parts with --skip-tags. For example, tagging tasks config lets you re-push configuration with --tags config without reinstalling packages.

To narrow which hosts are affected, use --limit (often written -l). ansible-playbook site.yml --limit web1.example.com runs the whole playbook against just that one host — invaluable for testing a change on a single machine before rolling it out to the group.

Two more safety tools round this out: --check performs a dry run, reporting what would change without changing anything, and --diff shows the textual differences. Together, tags, limits and check mode let you apply exactly the right change to exactly the right hosts, cautiously.

# Run only tasks tagged 'config'
ansible-playbook site.yml --tags config

# Skip the slow 'packages' tasks
ansible-playbook site.yml --skip-tags packages

# Apply to a single host only, as a dry run showing diffs
ansible-playbook site.yml --limit web1.example.com --check --diff
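
For those commands to select anything, tasks need tags in the playbook. A minimal sketch of a site.yml with the two tags used above (the task contents are assumptions):

```yaml
---
- name: Web server with tagged tasks
  hosts: webservers
  become: true
  tasks:
    - name: Ensure nginx is installed
      ansible.builtin.package:
        name: nginx
        state: present
      tags: packages

    - name: Deploy nginx config
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      tags: config
```

With this layout, --tags config runs only the template task and --skip-tags packages leaves out the install task.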

18 Comparing Ansible, Puppet, Chef and SaltStack

Ansible is not the only configuration management tool, and the alternatives differ chiefly along two axes: push vs pull and how the desired state is expressed.

  • Ansible: push and agentless, over SSH; playbooks in YAML. Easy to start, no agents to manage.
  • Puppet: classically pull and agent-based; a Puppet agent on each node pulls from a Puppet master. State is described in Puppet’s own declarative DSL, in files called manifests.
  • Chef: also pull and agent-based; uses "recipes" and "cookbooks" written in a Ruby-based DSL, leaning more procedural.
  • SaltStack (Salt): primarily push, using a fast message bus between a Salt master and minion agents (so it usually has agents, unlike Ansible), with declarative state files (SLS) in YAML; it also supports an agentless SSH mode.

There is no universally "best" tool. Ansible’s agentless simplicity made it the popular default; Puppet and Chef’s pull model self-heals on a timer and scales naturally; Salt emphasises speed at large scale. The shared ideas — declarative desired state and idempotency — matter more than any single product.
