Aytict Academy

1 What is the software supply chain?

The software supply chain is everything that goes into building and shipping your software: third-party libraries, base container images, compilers and build tools, the build system itself, package registries, plugins, and the people and accounts with access. Your final artifact is the sum of all of these inputs — not just the code you wrote.

Modern applications are mostly assembled rather than written from scratch. A typical project pulls in hundreds or thousands of open-source components transitively, so the attack surface extends far beyond your own repository to every dependency and every step that touches your build.

2 Why the supply chain is a target

Attackers favour the supply chain because it offers leverage: compromising one widely used component or build system can reach thousands of downstream victims at once. Instead of breaking into each target individually, an attacker poisons a shared upstream and lets normal update mechanisms distribute the malicious code.

This is an asymmetric bargain. Defenders must secure every link, while an attacker only needs one weak dependency, one stolen maintainer credential, or one tampered build step. Trust is transitive: when you trust a package, you implicitly trust everyone who can publish to it.

3 Case study: the SolarWinds build compromise

In the SolarWinds incident (disclosed December 2020), attackers gained access to the company’s build infrastructure and injected malicious code (the SUNBURST backdoor) into the Orion product during the build. The compromise happened after the source was committed but before the final signed binary was produced.

Because the tampered artifact was then signed with SolarWinds’ legitimate certificate and shipped through normal updates, it looked authentic to customers. The key lesson: a clean source repository is not enough — the build system itself must be trusted and its output must be verifiable, because signing a backdoored binary still produces a validly signed backdoor.

4 Case study: Log4Shell and transitive blast radius

Log4Shell (CVE-2021-44228, disclosed December 2021) was a critical remote-code-execution vulnerability in Apache Log4j 2, an extremely common Java logging library. A crafted log message could trigger a JNDI lookup that loads and executes attacker-controlled code.

The crisis was not just the bug’s severity but its reach: Log4j is buried deep as a transitive dependency inside countless applications and appliances, so many teams could not even tell whether they were affected. It showed why you must know your full dependency inventory — you cannot patch what you cannot see.

5 Malicious and typosquatted packages

Public registries such as npm and PyPI let anyone publish. Attackers exploit this by uploading malicious packages that exfiltrate secrets, install backdoors, or mine cryptocurrency on install. A common trick is typosquatting: registering a name that looks like a popular one (for example reqeusts instead of requests) so a typo installs the malicious version.

Related tactics include brandjacking (impersonating a known project) and abusing install-time scripts (such as npm postinstall hooks) that run arbitrary code the moment a package is added. Treat every new dependency as code you are about to execute with your privileges.
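
The typosquatting check described above can be approximated with a string-similarity test. The following is a minimal sketch, assuming a hypothetical allow-list (`POPULAR`) and an arbitrary similarity threshold; real tooling would use much larger name lists and more careful heuristics.

```python
from difflib import SequenceMatcher

# Hypothetical list of well-known package names, for illustration only.
POPULAR = {"requests", "numpy", "pandas", "urllib3", "cryptography"}

def lookalikes(name, known=POPULAR, threshold=0.85):
    """Return known names that `name` closely resembles without matching exactly."""
    candidate = name.lower().replace("_", "-")
    if candidate in known:
        return []  # exact match: this is the genuine package name
    return sorted(
        k for k in known
        if SequenceMatcher(None, candidate, k).ratio() >= threshold
    )

for requested in ("reqeusts", "requests", "leftpad"):
    suspects = lookalikes(requested)
    if suspects:
        print(f"{requested}: suspiciously close to {suspects}")
    else:
        print(f"{requested}: no near-miss found")
```

A check like this is only a prompt for human review: a name can be malicious without resembling anything, and a legitimate package can resemble a popular one.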

6 Dependency confusion attacks

Dependency confusion (popularised by researcher Alex Birsan in 2021) abuses how package managers resolve names across multiple sources. If a company uses an internal package called acme-utils but the build also consults the public registry, an attacker can publish a public package with the same name and a higher version number.

Many tools then prefer the highest version regardless of source, so the malicious public package is pulled in instead of the trusted internal one. Defences include scoping or namespacing internal packages, configuring registries so private names never fall through to the public index, and pinning sources explicitly.
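
To make the resolution flaw concrete, here is a small simulation of two resolvers. The index contents (`PRIVATE`, `PUBLIC`) and both functions are hypothetical and do not model any particular package manager; they only contrast "highest version wins" with "private names never fall through".

```python
# Hypothetical index contents: package name -> list of version tuples.
PRIVATE = {"acme-utils": [(1, 4, 0)]}
PUBLIC = {"acme-utils": [(99, 0, 0)], "requests": [(2, 31, 0)]}

def resolve_naive(name):
    """Vulnerable behaviour: merge all sources and take the highest version."""
    candidates = [(v, "private") for v in PRIVATE.get(name, [])]
    candidates += [(v, "public") for v in PUBLIC.get(name, [])]
    return max(candidates)  # highest version wins, whatever its source

def resolve_scoped(name):
    """Safer behaviour: a name that exists privately is never looked up publicly."""
    if name in PRIVATE:
        return max((v, "private") for v in PRIVATE[name])
    return max((v, "public") for v in PUBLIC[name])

print(resolve_naive("acme-utils"))   # picks the attacker's public 99.0.0
print(resolve_scoped("acme-utils"))  # stays on the internal 1.4.0
```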

7 Transitive dependency risk

A direct dependency is one you explicitly declare. A transitive (indirect) dependency is one pulled in by your dependencies, often several layers deep. Most of the code in a typical project is transitive — you may have a handful of direct dependencies but hundreds of indirect ones.

This matters because a vulnerability or malicious change anywhere in that tree affects you, even though you never chose that package directly. To manage the risk you need full visibility into the entire dependency graph, not just the top level, and tooling that can flag issues anywhere in the tree.
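
Walking the whole graph is a plain breadth-first traversal. The sketch below uses a made-up dependency graph (`GRAPH`) to show how two direct dependencies expand into a larger transitive set, and at what depth each package enters the tree.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it declares directly.
GRAPH = {
    "app": ["web-framework", "http-client"],
    "web-framework": ["templating", "logging-lib"],
    "http-client": ["tls-lib", "logging-lib"],
    "templating": [],
    "logging-lib": ["string-utils"],
    "tls-lib": [],
    "string-utils": [],
}

def transitive_deps(root, graph):
    """Return {package: depth} for everything reachable from root (breadth-first)."""
    depth = {}
    queue = deque((dep, 1) for dep in graph[root])
    while queue:
        pkg, level = queue.popleft()
        if pkg in depth:
            continue  # already reached by a shorter or equal path
        depth[pkg] = level
        queue.extend((dep, level + 1) for dep in graph.get(pkg, []))
    return depth

deps = transitive_deps("app", GRAPH)
direct = sorted(p for p, d in deps.items() if d == 1)
indirect = sorted(p for p, d in deps.items() if d > 1)
print("direct:", direct)
print("indirect:", indirect)
```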

8 Lockfiles and pinning versions

A version range (for example ^1.2.0) tells the package manager to accept newer compatible releases. That is convenient but non-deterministic: two installs at different times can resolve to different code. A lockfile (such as package-lock.json, poetry.lock or Cargo.lock) records the exact resolved version of every package, including transitive ones.

Committing the lockfile and installing strictly from it gives reproducible, deterministic installs: everyone and every build gets byte-for-byte the same dependency set. Pinning exact versions also means a freshly published malicious release does not silently slip in until you deliberately update.

# Install strictly from the lockfile (deterministic, fails if it would change)
npm ci            # Node.js, uses package-lock.json
pip install -r requirements.txt   # with pinned == versions / hashes
poetry install --sync             # honours poetry.lock
cargo build --locked              # refuses to update Cargo.lock
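
Why a range is non-deterministic can be shown with a toy resolver. This sketch assumes a simplified reading of the caret rule (same major version, not older than the base, for versions at or above 1.0.0) and invented release lists; real resolvers handle pre-releases and 0.x versions differently.

```python
def parse(version):
    return tuple(int(part) for part in version.split("."))

def satisfies_caret(version, base):
    """Simplified ^base for base >= 1.0.0: same major, and not older than base."""
    v, b = parse(version), parse(base)
    return v[0] == b[0] and v >= b

def resolve(range_base, published):
    """Pick the highest published version allowed by ^range_base."""
    allowed = [v for v in published if satisfies_caret(v, range_base)]
    return max(allowed, key=parse)

# Hypothetical registry state at two points in time.
monday = ["1.2.0", "1.2.1"]
friday = monday + ["1.3.0", "2.0.0"]  # new releases appeared during the week

print(resolve("1.2.0", monday))  # what a lockfile would record on Monday
print(resolve("1.2.0", friday))  # what an unlocked install picks on Friday
```

The same manifest yields different code on different days; installing from the lockfile keeps Friday's build on Monday's resolution until someone updates it on purpose.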

9 Verifying integrity with hashes and checksums

Pinning a version by name is not enough on its own — an attacker who can tamper with a registry or mirror could replace the file behind that version. Cryptographic hashes (checksums) close this gap. A hash such as SHA-256 is a fixed-size fingerprint of the file’s contents; change a single byte and the hash changes completely.

Package managers support hash pinning: the lockfile stores the expected hash (often an integrity field), and the installer recomputes and compares it on download. If they differ, the install fails. This verifies you received exactly the bytes that were recorded, defeating tampering in transit or at the mirror.

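# Compute a SHA-256 checksum
sha256sum app-1.4.2.tar.gz
# 9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08  app-1.4.2.tar.gz

# Compare against the expected value; exits non-zero on a mismatch
echo "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08  app-1.4.2.tar.gz" | sha256sum --check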

# pip can require hashes for every package; mismatch aborts the install
pip install --require-hashes -r requirements.txt
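
The recompute-and-compare step an installer performs can be sketched with the Python standard library. The file name and contents below are placeholders written to a temporary directory; the point is that a one-byte change makes verification fail.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path

def sha256_file(path, chunk_size=65536):
    """Stream the file so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_hex):
    """True only if the file's SHA-256 equals the pinned value."""
    return hmac.compare_digest(sha256_file(path), expected_hex.lower())

with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "app-1.4.2.tar.gz"   # placeholder artifact
    artifact.write_bytes(b"release contents")
    pinned = sha256_file(artifact)              # what a lockfile would record
    ok_before = verify(artifact, pinned)
    artifact.write_bytes(b"release contentz")   # simulate tampering: one byte changed
    ok_after = verify(artifact, pinned)

print(ok_before, ok_after)
```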

10 What is an SBOM?

A Software Bill of Materials (SBOM) is a formal, machine-readable inventory of the components that make up a piece of software: each library, its version, supplier, and often its license and cryptographic hash. Think of it as the ingredients label for your application.

An SBOM does not by itself make software more secure, but it is the foundation for everything else: you cannot assess vulnerability exposure, respond to a new CVE, or check licenses without first knowing exactly what is inside. When the next Log4Shell appears, an accurate SBOM lets you answer “are we affected?” in minutes instead of days.

11 SBOM formats: CycloneDX and SPDX

Two open standards dominate. CycloneDX, stewarded by OWASP, is lightweight and security-focused; it represents components, dependency relationships, vulnerabilities and more, and is popular for application security use cases. SPDX (Software Package Data Exchange), which originated at the Linux Foundation, is an international standard (ISO/IEC 5962:2021) with strong roots in license compliance and broad enterprise adoption.

Both are machine-readable (commonly JSON) and can describe the same software; the choice often depends on your tooling and whether your emphasis is security or licensing. Many tools can emit either format, and converters exist between them, so SBOMs remain portable across ecosystems.
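
For a sense of what such a document looks like, here is a hand-written, heavily abridged CycloneDX-style SBOM built as a Python dictionary. The two components are illustrative, and the output has not been validated against the official schema; generated SBOMs carry many more fields (metadata, hashes, licenses, dependency relationships).

```python
import json

# Hand-written, abridged example; real SBOMs are produced by tooling.
sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {
            "type": "library",
            "group": "org.apache.logging.log4j",
            "name": "log4j-core",
            "version": "2.14.1",
            "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1",
        },
        {
            "type": "library",
            "name": "requests",
            "version": "2.31.0",
            "purl": "pkg:pypi/requests@2.31.0",
        },
    ],
}

text = json.dumps(sbom, indent=2)
print(text)
```

The `purl` (package URL) field gives each component an ecosystem-qualified identifier, which is what lets downstream tools match it against advisories.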

12 Generating and consuming SBOMs

You generate an SBOM with tools that inspect your project: scanners read lockfiles, manifests and container image layers to enumerate components. Examples include Syft, Trivy, and many language and CI plugins. Generation should happen as part of the build, so the SBOM reflects exactly what shipped, and the SBOM should be stored as a release artifact.

You consume SBOMs by feeding them into other tools: vulnerability scanners cross-reference components against advisory databases, license tools check compliance, and policy engines gate releases. An SBOM that is generated once and never read provides little value — the payoff comes from continuously matching it against new advisories.

# Generate an SBOM from a container image (CycloneDX JSON) with Syft
syft registry.example.com/app:1.4.2 -o cyclonedx-json > sbom.json

# Consume it: scan the SBOM for known vulnerabilities with Grype
grype sbom:sbom.json
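
What a scanner does with an SBOM can be reduced to a join between components and advisory ranges. The sketch below assumes a tiny inline SBOM and a simplified advisory record (the `introduced` and `fixed` fields and their values are a simplification for illustration, not the full advisory data).

```python
def vtuple(version):
    return tuple(int(part) for part in version.split("."))

# Abridged SBOM, as a scanner might load it from sbom.json.
sbom = {
    "components": [
        {"name": "log4j-core", "version": "2.14.1"},
        {"name": "requests", "version": "2.31.0"},
    ]
}

# Simplified advisory range for illustration.
ADVISORIES = [
    {"id": "CVE-2021-44228", "name": "log4j-core",
     "introduced": "2.0.0", "fixed": "2.15.0"},
]

def affected(sbom, advisories):
    """Return (advisory id, component, version) for every component in a vulnerable range."""
    hits = []
    for comp in sbom.get("components", []):
        for adv in advisories:
            in_range = (vtuple(adv["introduced"])
                        <= vtuple(comp["version"])
                        < vtuple(adv["fixed"]))
            if comp["name"] == adv["name"] and in_range:
                hits.append((adv["id"], comp["name"], comp["version"]))
    return hits

print(affected(sbom, ADVISORIES))
```

Re-running this join whenever the advisory list changes, without rebuilding anything, is what continuous matching means in practice.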

13 Vulnerability management with SCA and advisories

Software Composition Analysis (SCA) tools identify your components and cross-reference them against vulnerability databases to flag known issues. The most-cited sources are NVD (the U.S. National Vulnerability Database, built on CVE identifiers) and OSV (Open Source Vulnerabilities), an open, distributed database designed for precise, per-ecosystem, per-version matching.

Effective vulnerability management is continuous, not a one-off scan: new advisories appear daily for components you already shipped. Findings must be triaged — not every reported CVE is reachable or exploitable in your context — and prioritised by severity, exploitability and exposure rather than treated as an undifferentiated list.

14 Automated dependency updates and the auto-merge risk

Tools like Dependabot and Renovate watch your dependencies and open pull requests to bump versions, especially for security fixes. This keeps you current and shrinks the window in which a known vulnerability is exploitable, which is genuinely valuable.

But blind auto-merge is dangerous: if a PR that pulls in a new upstream release is merged and deployed without review or testing, a compromised or buggy release flows straight into production. The safe pattern is to automate the noisy work (opening PRs, running tests) while keeping a human gate, or auto-merging only after CI passes and only for low-risk, well-trusted updates.
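
One way to encode that safe pattern is a small gate that decides whether an update pull request may merge without a human. The `UpdatePR` type, the `TRUSTED` allow-list and the "patch bumps only" rule are all assumptions for this sketch; each team would choose its own criteria.

```python
from dataclasses import dataclass

TRUSTED = {"requests", "urllib3"}  # hypothetical allow-list of well-vetted packages

@dataclass
class UpdatePR:
    package: str
    old: str
    new: str
    ci_passed: bool

def bump_kind(old, new):
    """Classify a version change as major, minor or patch."""
    o, n = (tuple(int(p) for p in v.split(".")) for v in (old, new))
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"

def may_auto_merge(pr):
    """Auto-merge only green, patch-level updates of allow-listed packages."""
    return (pr.ci_passed
            and pr.package in TRUSTED
            and bump_kind(pr.old, pr.new) == "patch")

print(may_auto_merge(UpdatePR("requests", "2.31.0", "2.31.1", ci_passed=True)))
print(may_auto_merge(UpdatePR("requests", "2.31.0", "3.0.0", ci_passed=True)))
```

Everything that fails the gate still gets a pull request and a test run; it simply waits for a reviewer.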

15 Code signing and artifact signing

Signing uses cryptography to provide two guarantees about an artifact: authenticity (it really came from the claimed publisher) and integrity (it has not been altered since signing). The signer uses a private key; anyone can verify with the corresponding public key. Signing does not make code safe — SolarWinds was validly signed — it only proves who produced it and that it is unchanged.

Traditional signing requires guarding long-lived private keys, which is operationally hard. Sigstore and its cosign tool simplify this with keyless signing: you authenticate with an existing identity (such as an OIDC provider), receive a short-lived certificate, sign, and the event is recorded in a public transparency log. There is no long-lived key to steal or rotate.

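# Keyless signing of a container image with cosign (Sigstore).
# cosign 2.x signs keylessly by default; only 1.x needed COSIGN_EXPERIMENTAL=1.
cosign sign registry.example.com/app:1.4.2

# Verify the signature and the signer identity
# (for GitHub Actions the identity is the workflow URL; this one is an example)
cosign verify \
  --certificate-identity=https://github.com/example/app/.github/workflows/release.yml@refs/heads/main \
  --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
  registry.example.com/app:1.4.2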

16 Provenance and attestations

Provenance is verifiable metadata describing how, where and from what an artifact was built: the source commit, the builder, the build parameters, and the inputs. An attestation is a signed statement binding such a claim to a specific artifact (identified by its hash), so anyone can check that the claim was made by a trusted party about exactly that artifact.

The in-toto framework standardises this: it defines a layout of expected build steps and collects signed link/attestation metadata at each step, letting a verifier confirm that the artifact really followed the declared process. Provenance answers questions signing alone cannot, such as “was this built from the commit it claims, by the expected pipeline?”
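
The binding between a claim and an artifact hash can be illustrated with an unsigned, abridged in-toto Statement carrying a SLSA provenance predicate. The repository URL, build type, builder id and commit value are placeholders, and the signing envelope that real attestations are wrapped in is deliberately left out of this sketch.

```python
import hashlib

def make_statement(artifact_name, artifact_bytes, commit, builder_id):
    """Build an abridged, UNSIGNED in-toto Statement about one artifact."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{
            "name": artifact_name,
            "digest": {"sha256": hashlib.sha256(artifact_bytes).hexdigest()},
        }],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                "buildType": "https://example.com/build-types/demo",  # placeholder
                "externalParameters": {"target": "release"},          # placeholder
                "resolvedDependencies": [{
                    "uri": "git+https://example.com/acme/app",        # placeholder
                    "digest": {"gitCommit": commit},
                }],
            },
            "runDetails": {"builder": {"id": builder_id}},
        },
    }

def subject_matches(statement, artifact_bytes):
    """Check that the statement is about exactly these bytes."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    return any(s["digest"].get("sha256") == actual for s in statement["subject"])

artifact = b"compiled release bytes"
stmt = make_statement("app-1.4.2.tar.gz", artifact,
                      commit="0" * 40,  # placeholder commit id
                      builder_id="https://example.com/builders/ci")
print(subject_matches(stmt, artifact))
print(subject_matches(stmt, artifact + b"!"))
```

A verifier would first check the signature over the statement and the signer's identity, and only then trust the subject digest and predicate fields.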

17 The SLSA framework and its levels

SLSA (Supply-chain Levels for Software Artifacts, pronounced “salsa”) is a framework of incremental security requirements focused on build integrity and provenance. It gives a common language for how trustworthy an artifact’s build process is, organised into ascending levels.

Roughly: the lowest level asks for build automation and provenance to exist; higher levels require that provenance be generated by the build platform (not forgeable by the build script) and that builds run in hardened, isolated environments resistant to tampering. SLSA does not assess your source code’s quality — it raises confidence that the artifact you have is the genuine, untampered output of the build it claims.

18 Reproducible builds

A reproducible build is one where the same source and the same recorded build environment always produce bit-for-bit identical outputs. This requires eliminating sources of non-determinism such as embedded timestamps, build paths, locale, and unordered file listings.

Reproducibility is powerful for the supply chain because it makes builds verifiable by independent parties: anyone can rebuild from source and confirm the published binary matches, with no need to trust a single build server. It directly counters SolarWinds-style tampering — an injected change would alter the output, so the rebuilt artifact would no longer match.
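
Removing non-determinism is often a matter of normalising metadata. As a small illustration, this sketch packs files into a tar archive with sorted entries and fixed timestamps, owners and modes, so the same inputs always hash the same. The file names and contents are invented.

```python
import hashlib
import io
import tarfile

def build_archive(files, mtime=0):
    """Pack {name: bytes} into a tar archive with normalised metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):          # fixed entry order
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime              # fixed timestamp instead of "now"
            info.mode = 0o644
            info.uid = info.gid = 0         # no builder-specific ownership
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

first = build_archive({"main.py": b"print('hi')\n", "README": b"demo\n"})
second = build_archive({"README": b"demo\n", "main.py": b"print('hi')\n"})
drifted = build_archive({"README": b"demo\n", "main.py": b"print('hi')\n"}, mtime=1)

print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(drifted).hexdigest())
```

The third archive differs only in a timestamp, yet its hash changes, which is why embedded build times are among the first things reproducible-build efforts pin down.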

19 Securing the build system and ephemeral environments

The build system is high-value: it has access to source, secrets and signing keys, and its output is implicitly trusted. Harden it by following least privilege (scoped, short-lived credentials), isolating build steps from each other, and avoiding running untrusted code with access to release secrets.

Ephemeral build environments — fresh, disposable runners created per build and destroyed afterwards — are a key defence. Because nothing persists between builds, an attacker cannot establish a foothold that survives, and each build starts from a known-clean state, reducing the chance that a previous compromised job contaminates the next.

20 Trusted registries, proxies and avoiding confusion

Rather than pulling directly from public registries on every build, mature teams use a trusted internal registry or proxy (such as an artifact repository) that caches and curates approved packages. This gives a single control point to scan, allow-list, and retain known-good versions even if an upstream package is later deleted or yanked.

A well-configured proxy also helps prevent dependency confusion: internal package names are served only from the private feed and never fall through to the public index, so an attacker cannot shadow them with a higher public version. Combine this with scoped namespaces and explicit source pinning for defence in depth.

21 Policy for third-party code review

Technology is not enough without policy. A mature organisation defines rules for adopting third-party code: which licenses are acceptable, minimum project health signals (maintenance activity, number of maintainers, responsiveness to security reports), and a vetting step before a new dependency is added.

Practical controls include an approval/allow-list process for new packages, scheduled review of existing dependencies, requiring SCA and SBOM checks to pass in CI, and a documented response plan for when a dependency is found to be vulnerable or malicious. The aim is to make secure choices the default path rather than relying on individual vigilance.

22 SLSA build levels in detail

The SLSA v1.0 Build track defines four levels (L0–L3) of increasing assurance. L0 means no guarantees at all. L1 requires that the build is scripted/automated and that provenance exists describing how the artifact was produced — though at L1 the provenance may be self-reported and is not protected from forgery.

L2 raises the bar: the build must run on a hosted build platform and the provenance must be signed by that platform, making casual tampering detectable. L3 adds hardened builds: the platform must prevent the build process from influencing its own provenance, keep signing secrets out of reach of user-defined build steps, and isolate runs so that one build cannot influence another. The levels are cumulative: you cannot claim L3 without also meeting L1 and L2.

23 in-toto layouts and link metadata

in-toto secures the entire software supply chain end to end by treating it as a sequence of steps. The project owner defines a signed layout (the root.layout file): it lists the expected steps, who is authorised to perform each (by their public key), what materials each step consumes and what products it produces, and optional inspections to run during verification.

As each step runs, the functionary records a signed link metadata file capturing the materials it read and the products it created. At the end, a verifier checks the layout’s signature, confirms every step was performed by an authorised party, and matches the products of one step to the materials of the next. If a file was swapped between steps, the hashes will not line up and verification fails.
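
The hash-matching part of that verification can be sketched in a few lines. The link records below are simplified, unsigned dictionaries with invented step and file names; real in-toto links are signed, and layouts express richer rules than "the next step's materials must equal the previous step's products".

```python
import hashlib

def digest(data):
    return hashlib.sha256(data).hexdigest()

def link(step, materials, products):
    """Simplified, unsigned link record: file name -> SHA-256 for inputs and outputs."""
    return {
        "step": step,
        "materials": {name: digest(data) for name, data in materials.items()},
        "products": {name: digest(data) for name, data in products.items()},
    }

def verify_chain(links):
    """Return None if each step consumed exactly what the previous step produced."""
    for prev, cur in zip(links, links[1:]):
        for name, file_hash in cur["materials"].items():
            if prev["products"].get(name) != file_hash:
                return f"{cur['step']}: {name!r} does not match the product of {prev['step']}"
    return None

source = b"int main() { return 0; }"
binary = b"\x7fELF...compiled"

good = [
    link("checkout", {}, {"main.c": source}),
    link("compile", {"main.c": source}, {"app": binary}),
    link("package", {"app": binary}, {"app.tar": b"tar(" + binary + b")"}),
]
# Same chain, but the binary was swapped between compile and package.
tampered = good[:2] + [link("package", {"app": b"backdoored"}, {"app.tar": b"tar"})]

print(verify_chain(good))
print(verify_chain(tampered))
```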

24 The Update Framework (TUF)

The Update Framework (TUF) secures software update systems so that even a compromised repository or mirror cannot push malicious updates to clients. Its central idea is a hierarchy of signing roles with separated responsibilities: root (establishes trust and the other keys), targets (signs the actual files and their hashes), snapshot (signs a consistent view of all metadata), and timestamp (frequently re-signed to prove freshness).

This design gives strong properties: compromise resilience (no single key compromise is catastrophic, because high-value root keys can be kept offline with thresholds), protection against rollback and freeze attacks (timestamp/snapshot metadata proves clients are seeing current data), and resistance to mix-and-match attacks. Sigstore’s own trust root is distributed using TUF.
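
The client-side checks behind those properties can be outlined as follows. This is a loose sketch, not the TUF specification: signature verification is assumed to have happened already (the hypothetical `verified_key_ids` field lists the keys whose signatures checked out), and the metadata is a plain dictionary.

```python
from datetime import datetime, timedelta, timezone

def check_update(new_meta, trusted_version, trusted_keys, threshold, now):
    """Decide whether freshly downloaded metadata may replace the trusted copy."""
    valid = set(new_meta["verified_key_ids"]) & trusted_keys
    if len(valid) < threshold:
        return "reject: signature threshold not met"
    if new_meta["version"] < trusted_version:
        return "reject: rollback to older metadata"
    if new_meta["expires"] <= now:
        return "reject: metadata expired (possible freeze attack)"
    return "accept"

now = datetime(2024, 1, 10, tzinfo=timezone.utc)
keys = {"key-a", "key-b", "key-c"}          # hypothetical trusted key ids

fresh = {"version": 8, "expires": now + timedelta(days=1),
         "verified_key_ids": ["key-a", "key-b"]}
rolled_back = dict(fresh, version=5)
frozen = dict(fresh, expires=now - timedelta(days=3))
one_key = dict(fresh, verified_key_ids=["key-a", "unknown-key"])

for meta in (fresh, rolled_back, frozen, one_key):
    print(check_update(meta, trusted_version=7, trusted_keys=keys, threshold=2, now=now))
```

The threshold check is what makes a single stolen key insufficient, and the version and expiry checks are what stop a mirror from replaying old but validly signed metadata.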

25 Sigstore internals: Fulcio and Rekor

Sigstore’s keyless flow rests on two services. Fulcio is a certificate authority that issues short-lived (roughly 10-minute) code-signing certificates. You prove your identity with an OIDC token; Fulcio binds that identity (such as an email or CI workload identity) into a certificate for a key pair generated just for that signing operation. The private key is discarded after signing and the certificate expires almost immediately, so there is no long-lived private key to steal.

Rekor is an immutable, append-only transparency log built on a Merkle tree. Every signing event is recorded there, so a verifier can confirm the signature was made while the certificate was valid and that the record cannot be silently altered or removed. The append-only log also lets the community detect misissued certificates after the fact, similar to Certificate Transparency for TLS.

# Fetch and inspect a Rekor transparency-log entry by its index
rekor-cli get --log-index 12345678

# Search the log for all entries made by an identity
rekor-cli search --email ci@example.com

26 VEX: Vulnerability Exploitability eXchange

An SBOM plus a scanner often produces a flood of CVE matches, many of which are not actually exploitable in your product — the vulnerable code path may never be reached, or a mitigation may already be in place. VEX (Vulnerability Exploitability eXchange) is a standard way for a supplier to publish that judgement in machine-readable form.

A VEX statement attaches a status to a (product, vulnerability) pair: typically not affected, affected, fixed, or under investigation. A not-affected status usually carries a justification such as “vulnerable code not in execute path”. Consumers use VEX to suppress false-positive noise and focus remediation on vulnerabilities that genuinely matter, turning raw scanner output into actionable signal.
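
Applying VEX to scanner output amounts to a lookup and a filter. In this sketch the findings and VEX entries are invented (the `CVE-0000-…` identifiers are placeholders), and the VEX data is flattened into a dictionary instead of a full VEX document.

```python
# Scanner output: (vulnerability id, component). Placeholder ids are hypothetical.
FINDINGS = [
    ("CVE-2021-44228", "log4j-core"),
    ("CVE-0000-0001", "string-utils"),
    ("CVE-0000-0002", "tls-lib"),
]

# Supplier's VEX judgements for this product, flattened for illustration.
VEX = {
    "CVE-0000-0001": {"status": "not_affected",
                      "justification": "vulnerable_code_not_in_execute_path"},
    "CVE-0000-0002": {"status": "under_investigation"},
}

SUPPRESSED = {"not_affected", "fixed"}

def actionable(findings, vex):
    """Keep findings unless VEX says the product is not affected or already fixed."""
    return [f for f in findings
            if vex.get(f[0], {}).get("status") not in SUPPRESSED]

for vuln_id, component in actionable(FINDINGS, VEX):
    status = VEX.get(vuln_id, {}).get("status", "no VEX statement")
    print(vuln_id, component, status)
```

Findings with no statement, or still under investigation, stay on the list; only an explicit supplier judgement removes one.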

27 npm provenance and PyPI trusted publishing

Registries are adding native provenance so consumers can verify where a package was built. npm provenance lets a package published from a supported CI (such as GitHub Actions) carry a signed attestation, recorded in Sigstore’s transparency log, that links the published tarball back to the exact source repository, commit and workflow. Users see a verified provenance badge and can confirm the package was not built on someone’s laptop.

PyPI trusted publishing tackles the credential side: instead of a long-lived API token stored as a CI secret, the CI job exchanges a short-lived OIDC token for an ephemeral upload token at publish time. There is no static token to leak, and only the configured repository/workflow can publish, sharply reducing the impact of a leaked secret.

# Publish to npm with build provenance from CI (e.g. GitHub Actions)
npm publish --provenance --access public

# Verify registry signatures and provenance attestations of installed dependencies
npm audit signatures

28 Hermetic builds

A hermetic build is one that runs in isolation from the network and the surrounding environment, consuming only explicitly declared inputs. All sources, dependencies and toolchains are fetched and pinned before the build starts, then the build runs with networking disabled so it cannot silently reach out and pull an unpinned, mutable, or attacker-controlled artifact.

Hermeticity is closely tied to reproducibility and to higher SLSA assurance: by removing hidden inputs, the build becomes self-contained, auditable, and far harder to tamper with at build time. It also makes provenance trustworthy — the recorded inputs really are the only things that influenced the output, because nothing else could have entered.

29 Least-privilege CI with OIDC and short-lived tokens

A classic supply-chain weakness is the long-lived secret stored in CI: a cloud key or registry token that, once leaked, grants broad standing access. The modern pattern replaces these with OIDC federation. The CI provider issues a signed, short-lived identity token describing the workflow (repository, branch, environment); the target system (cloud provider, registry) is configured to trust that issuer and mint a brief, narrowly scoped credential in exchange.

This delivers least privilege on two axes: time (credentials live minutes, not months) and scope (trust conditions can require a specific repo, branch or protected environment). There is no static secret to rotate or steal, and a leaked token is largely useless once it expires. Combine with environment protection rules so only reviewed deployments can assume powerful roles.
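
The trust conditions a target system applies can be sketched as a comparison between token claims and a policy. The claim names below follow those found in GitHub Actions OIDC tokens, the repository and branch values are placeholders, and cryptographic verification of the token signature against the issuer's keys is assumed to have been done before this function is called.

```python
import time

# Hypothetical trust policy configured on the target system.
TRUST_POLICY = {
    "iss": "https://token.actions.githubusercontent.com",
    "repository": "acme/app",
    "ref": "refs/heads/main",
}

def may_assume_role(claims, policy, now=None):
    """Grant a credential only to unexpired tokens whose claims match every condition."""
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return False  # short-lived token has already expired
    return all(claims.get(key) == value for key, value in policy.items())

NOW = 1_700_000_000  # fixed clock for the example
token = {"iss": "https://token.actions.githubusercontent.com",
         "repository": "acme/app", "ref": "refs/heads/main", "exp": NOW + 300}

print(may_assume_role(token, TRUST_POLICY, now=NOW))
print(may_assume_role(dict(token, ref="refs/heads/feature-x"), TRUST_POLICY, now=NOW))
print(may_assume_role(token, TRUST_POLICY, now=NOW + 3600))
```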

30 Hardening artifact repositories

An artifact repository (such as Artifactory, Nexus or a container registry) is a trust hub: everything it serves is consumed downstream, so compromising it poisons many builds. Hardening starts with access control — enforce SSO/MFA, give publish rights only to the few identities that need them, and use separate read and write credentials so a build runner cannot overwrite published artifacts.

Further controls include making release repositories immutable (no overwriting a version once published), enabling signature/provenance verification on push and pull, scanning incoming artifacts, restricting which upstreams a remote/proxy repository may mirror, and keeping thorough audit logs. Treat the repository like production infrastructure, not a passive file store.

31 OpenSSF Scorecard for dependency health

OpenSSF Scorecard is an automated tool that evaluates an open-source project against a set of security checks and produces a 0–10 score. Checks probe practices that correlate with supply-chain risk: whether branch protection is enabled, whether code review is required, whether CI tests run, whether dependencies are pinned, whether binary artifacts are checked into the repo, whether the project signs releases, and whether known vulnerabilities are present.

Scorecard turns vague questions like “is this dependency well maintained?” into measurable signals you can automate. It is most useful for triage and trend-spotting — flagging risky dependencies for closer review — rather than as an absolute verdict, since a low score may reflect a small project rather than a malicious one.

# Score a project's repository against the OpenSSF checks
scorecard --repo=github.com/example/widget

# Show only specific checks
scorecard --repo=github.com/example/widget --checks=Branch-Protection,Signed-Releases

32 Admission-time policy enforcement

Signing and provenance only help if something enforces them at the point of use. Admission control places a gate where artifacts are deployed — for example a Kubernetes admission controller — that admits a workload only if its image satisfies policy: a valid signature from an approved identity, an attached provenance attestation meeting a minimum SLSA level, and perhaps no critical unpatched CVEs.

Tools such as Sigstore policy-controller or Kyverno evaluate these rules and reject non-compliant images before they ever run. This shifts trust from “we hope only good images get deployed” to a hard, automated control. Run policies in a warn/audit mode first to find legitimate gaps, then switch to enforcing so violations are actually blocked.

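# Verify signature and provenance before deploy (gate in CI or admission).
# The workflow URL is an example identity; both commands need the identity flags.
cosign verify \
  --certificate-identity=https://github.com/example/app/.github/workflows/release.yml@refs/heads/main \
  --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
  registry.example.com/app:1.4.2
cosign verify-attestation --type slsaprovenance \
  --certificate-identity=https://github.com/example/app/.github/workflows/release.yml@refs/heads/main \
  --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
  registry.example.com/app:1.4.2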
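
The decision logic of such a gate, including the audit-then-enforce rollout, might look like the sketch below. The image metadata fields (`signer`, `slsa_level`, `critical_cves`) and the policy values are hypothetical; in a real deployment they would come from verified signatures, attestations and scan results, and the rules would be written in the policy engine's own language.

```python
# Hypothetical policy for this sketch.
POLICY = {
    "allowed_signers": {"release-pipeline"},
    "min_slsa_level": 2,
    "max_critical_cves": 0,
}

def violations(image, policy):
    """List every policy rule the image fails."""
    problems = []
    if image.get("signer") not in policy["allowed_signers"]:
        problems.append("not signed by an approved identity")
    if image.get("slsa_level", 0) < policy["min_slsa_level"]:
        problems.append("provenance below required SLSA level")
    if image.get("critical_cves", 0) > policy["max_critical_cves"]:
        problems.append("unpatched critical CVEs present")
    return problems

def admit(image, policy, mode="enforce"):
    """In audit mode violations are reported but the workload is still admitted."""
    problems = violations(image, policy)
    admitted = not problems or mode == "audit"
    return admitted, problems

good = {"signer": "release-pipeline", "slsa_level": 3, "critical_cves": 0}
bad = {"signer": None, "slsa_level": 1, "critical_cves": 2}

print(admit(good, POLICY))
print(admit(bad, POLICY, mode="audit"))
print(admit(bad, POLICY, mode="enforce"))
```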

33 Responding to a compromised dependency

When news breaks that a package you use has been compromised (a hijacked maintainer account, an injected payload, a malicious release), speed and inventory win. First, scope the blast radius: use your SBOMs and lockfiles to find exactly which products, versions and environments pulled the affected version — you cannot respond to what you cannot locate.

Then contain: pin or block the bad version in your proxy/registry, rotate any secrets that the malicious code (often running in install scripts or CI) could have exfiltrated, and rebuild from a known-good version. Finally, verify and learn: confirm clean rebuilds with provenance, hunt for indicators of compromise in logs, and feed lessons back into policy (faster pinning, stricter install-script controls). A rehearsed runbook turns panic into a checklist.

34 Signing-key management and rotation

Where long-lived signing keys are unavoidable, how you manage them is the security. Keep private keys in a hardware security module (HSM) or managed KMS so the raw key material never leaves protected hardware; require strong access controls and audit logging on every signing operation. For the highest-value keys (such as a trust root), keep them offline and protect their use with m-of-n thresholds so no single person can sign alone.

Plan rotation in advance: have a process to introduce a new key, distribute its public half, overlap old and new during a transition, and revoke the old key — ideally before a compromise forces an emergency rotation. This is exactly the operational burden that Sigstore’s keyless model sidesteps, but when you must hold keys, treat rotation and revocation as first-class, rehearsed procedures.

35 Regulations: EO 14028, NIST SSDF and the EU CRA

Supply-chain security is increasingly a legal requirement. The U.S. Executive Order 14028 (2021) directed federal agencies and their software suppliers toward secure development practices, including providing SBOMs and attestations of secure build processes. It catalysed much of the SBOM and provenance tooling now in common use.

NIST SSDF (Secure Software Development Framework, SP 800-218) is the practice catalogue that operationalises those goals: a set of high-level, outcome-based practices for protecting the software environment, the code, and the produced artifacts. In Europe, the Cyber Resilience Act (CRA) imposes obligations on manufacturers of products with digital elements (including handling vulnerabilities, providing security updates, and drawing up an SBOM as part of the product’s technical documentation), backed by market-access enforcement. The common thread: produce evidence (SBOMs, attestations) that you build securely.

36 SBOM diffing and consumption at scale

Generating one SBOM is easy; managing thousands across many services and releases is the real challenge. SBOM diffing compares the SBOM of a new build against the previous one to highlight exactly what changed: a newly added component, a version bump, a removed package, or a license change. This makes review tractable — you focus on the delta rather than re-reading the whole inventory.

At scale, organisations push SBOMs into a central store or graph database that indexes every component across every product. When a new CVE lands, a single query answers “which of our hundreds of services ship this exact vulnerable version?” in seconds. Continuous matching of stored SBOMs against fresh advisories — not one-off scans — is what turns an SBOM program into a real risk-reduction capability.
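
SBOM diffing itself is a set comparison keyed on component name. The sketch below uses two invented, abridged SBOMs and compares only name and version; a fuller diff would also key on the package URL and compare licenses and hashes.

```python
def index(sbom):
    """Map component name -> version for quick comparison."""
    return {c["name"]: c["version"] for c in sbom["components"]}

def diff_sboms(old, new):
    """Report components added, removed, or changed between two builds."""
    before, after = index(old), index(new)
    return {
        "added": {n: after[n] for n in after.keys() - before.keys()},
        "removed": {n: before[n] for n in before.keys() - after.keys()},
        "changed": {n: (before[n], after[n])
                    for n in before.keys() & after.keys()
                    if before[n] != after[n]},
    }

# Hypothetical SBOMs of two consecutive releases.
release_1 = {"components": [
    {"name": "log4j-core", "version": "2.14.1"},
    {"name": "string-utils", "version": "1.0.0"},
    {"name": "old-helper", "version": "0.9.0"},
]}
release_2 = {"components": [
    {"name": "log4j-core", "version": "2.17.1"},
    {"name": "string-utils", "version": "1.0.0"},
    {"name": "new-parser", "version": "3.2.0"},
]}

delta = diff_sboms(release_1, release_2)
print(delta)
```

Reviewers then look only at the three entries in the delta, and unchanged components such as `string-utils` drop out of view.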

Software Supply Chain Security Advanced
