High CVE-2024-21626 CVSS 8.6 runc Go

Leaky Vessels: runc Container Escape via Leaked File Descriptor and Working Directory

CVE-2024-21626 is a high-severity container escape in runc that allows attackers to break out of Docker, Kubernetes, and other container runtimes by exploiting an internal file descriptor leak to gain access to the host filesystem.

Offensive360 Research Team
Affects: < 1.1.12

Overview

CVE-2024-21626, one of four vulnerabilities collectively nicknamed Leaky Vessels, is a container escape in runc, the low-level container runtime underpinning Docker, Kubernetes, Podman, and most other container platforms. It carries a CVSS score of 8.6 and is fixed in runc v1.1.12, so anyone running containers in production should patch promptly.

The vulnerability stems from an internal file descriptor leak: runc kept a handle to a host directory (per the upstream advisory, the host's /sys/fs/cgroup) open while setting up the container process. By setting the image's working directory (WORKDIR) or the runtime working directory to a path through that descriptor, such as /proc/self/fd/7, an attacker can make runc chdir() into the host directory. The container process then starts with a working directory in the host's mount namespace, and relative paths such as ../../.. reach the rest of the host filesystem, escaping the container boundary.

The working directory can come from whoever launches the container, but it can also come from the image itself: a compromised or malicious image pulled from a registry can trigger the escape as soon as it is run, with no further action by the attacker.

Technical Analysis

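runc sets up a container by performing several privileged operations in the host namespace before entering the container's namespaces. During this setup, runc opens file descriptors for its own use, including one that refers to the host's /sys/fs/cgroup directory.

The critical bug: that descriptor was still open in the runc init process when it changed into the container's configured working directory, and runc did not check where that chdir() ended up. Inside the container, /proc/self/fd/<n> for the leaked descriptor is a link to a directory on the host, so a working directory routed through it lands outside the container. A process's working directory survives exec(), so the container's entrypoint starts on the host filesystem even if the descriptor itself is closed at exec time.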

// VULNERABLE pattern (simplified): runc libcontainer setup
// A file descriptor referring to a host directory (the host's /sys/fs/cgroup)
// was inadvertently left open while the container process was being set up.

// When WORKDIR in the Dockerfile was set to a path like:
// WORKDIR /proc/self/fd/<leaked_fd_number>
//
// runc would chdir() to this path from inside the CONTAINER, but the fd
// refers to a HOST directory, so the working directory breaks the namespace boundary.

An attacker exploiting this via a malicious container image sets WORKDIR (or runtime --workdir) to a path that traverses through the leaked file descriptor:

# Malicious Dockerfile
FROM ubuntu:latest
# The fd number (e.g., 7) is that of the leaked host descriptor; it varies between setups
WORKDIR /proc/self/fd/7/../../../../
# The working directory is now the HOST root; only relative paths resolve against it
CMD ["/bin/bash", "-c", "cat etc/shadow"]  # relative path: reads the host's /etc/shadow

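When runc starts the container's entrypoint with this working directory, the path is resolved through the leaked descriptor into the host filesystem, so the container process begins with a current working directory of / on the host. Absolute paths still resolve inside the container's root, but any relative path is resolved against the host, which amounts to a complete container escape.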
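
To see why the trailing ../ components matter, the walk can be modeled with plain string operations. The sketch below assumes, for illustration, that descriptor 7 refers to /sys/fs/cgroup on the host (the directory named in the upstream runc advisory; the number varies in practice). The function name resolveThroughFd is made up, and the model ignores every symlink other than the descriptor link itself.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// resolveThroughFd models, in pure string terms, how a path that passes
// through a /proc/self/fd/N link is walked: the link is replaced by the
// directory it refers to, and only then are the ".." components applied.
// Paths that do not pass through fdLink are just cleaned lexically.
func resolveThroughFd(workdir, fdLink, target string) string {
	rest := strings.TrimPrefix(workdir, fdLink)
	if rest == workdir || (rest != "" && rest[0] != '/') {
		// workdir does not pass through the link (e.g. /proc/self/fd/70).
		return path.Clean(workdir)
	}
	return path.Clean(path.Join(target, rest))
}

func main() {
	const link = "/proc/self/fd/7"   // assumed descriptor number
	const target = "/sys/fs/cgroup" // assumed host directory behind it

	for _, wd := range []string{
		"/proc/self/fd/7",
		"/proc/self/fd/7/..",
		"/proc/self/fd/7/../../../../",
		"/app",
	} {
		fmt.Printf("%-32s -> %s\n", wd, resolveThroughFd(wd, link, target))
	}
}
```

Under these assumptions the third path ends at the host's root directory, while a purely lexical cleanup of the same string would stay inside the container's /proc. The gap between those two views is what the exploit relies on.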

Impact

This vulnerability enables:

  • Container-to-host escape: A process inside the container can read, write, or execute files on the host filesystem, subject to the privileges the container process runs with
  • Host filesystem exfiltration: Read /etc/shadow, SSH keys, service account tokens, secrets mounted at known paths
  • Persistence: Write SSH keys to ~root/.ssh/authorized_keys on the host
  • Kubernetes cluster compromise: From a compromised pod, access node credentials and escalate to cluster-admin

The attack can be triggered by:

  1. A malicious container image (supply chain attack — no attacker interaction required)
  2. Any user with permission to specify WORKDIR or --workdir in a container runtime
  3. A Dockerfile build process that processes untrusted Dockerfile inputs

Critically, the attacker does not need to first gain code execution in a running container: setting the working directory is enough for the container's own process to start on the host filesystem.

How to Fix It

Upgrade runc immediately:

# Check current version
runc --version

# Update Docker's packages (on Docker's apt repository, runc ships in containerd.io)
sudo apt update && sudo apt install docker-ce docker-ce-cli containerd.io

# For standalone runc
# Download v1.1.12 from https://github.com/opencontainers/runc/releases/tag/v1.1.12

# Verify
runc --version  # Should show runc version 1.1.12 or later
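
For fleet-wide checks, the comparison against 1.1.12 can be automated. The following is a minimal sketch: the names parseVersion and isPatched are made up for illustration, and the input is assumed to be the bare version string (for example the token after "runc version" in the output above). Distribution packages may backport the fix under an older version number, so treat a "not patched" result as a prompt to check the vendor advisory, not as proof.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fixedVersion is the first patched runc release named in this article.
var fixedVersion = [3]int{1, 1, 12}

// parseVersion extracts major, minor and patch from strings such as
// "1.1.11", "v1.1.12" or "1.2.0-rc.1". Pre-release and build suffixes are
// dropped, which is a simplification.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	v = strings.TrimPrefix(strings.TrimSpace(v), "v")
	if i := strings.IndexAny(v, "-+~"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unexpected version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unexpected version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// isPatched reports whether v is at least fixedVersion.
func isPatched(v string) (bool, error) {
	got, err := parseVersion(v)
	if err != nil {
		return false, err
	}
	for i := range got {
		if got[i] != fixedVersion[i] {
			return got[i] > fixedVersion[i], nil
		}
	}
	return true, nil
}

func main() {
	for _, v := range []string{"1.1.11", "1.1.12", "1.2.0-rc.1", "1.0.0-rc93"} {
		ok, err := isPatched(v)
		fmt.Printf("%-12s patched=%v err=%v\n", v, ok, err)
	}
}
```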

The fix in runc v1.1.12 closes runc's internal file descriptors immediately before handing control to the container process, and verifies after changing directory that the working directory is actually inside the container; if it is not, runc refuses to start the process:

// FIXED (simplified sketch of the approach, not runc's literal source)
// 1. Internal file descriptors are closed in the final stage of runc init,
//    immediately before the container process is executed.
// 2. After chdir(), the working directory is verified. On Linux, getcwd(2)
//    reports a directory outside the current root as "(unreachable)...",
//    which unix.Getwd surfaces as ENOENT.
import (
    "errors"
    "fmt"

    "golang.org/x/sys/unix"
)

func verifyCwd() error {
    if _, err := unix.Getwd(); errors.Is(err, unix.ENOENT) {
        return errors.New("working directory is outside the container root: possible breakout")
    } else if err != nil {
        return fmt.Errorf("verifying working directory: %w", err)
    }
    return nil
}

Defense in depth, for hosts that cannot be patched immediately (these measures limit what an escaped process can do; they do not close the hole):

  • Use read-only root filesystems (--read-only) for containers where possible
  • Apply seccomp profiles to restrict syscalls
  • Run containers as non-root users
  • Implement image scanning in your CI/CD pipeline to detect malicious WORKDIR instructions

Our Take

Leaky Vessels demonstrates a class of vulnerability that is particularly insidious: the bug is in the container runtime infrastructure itself, not in user code. Developers building and deploying containerized applications have no visibility into runc internals — they trust the runtime to enforce the boundary.

This is why infrastructure dependency scanning is as important as application-level SCA. runc, containerd, and similar low-level components are exactly the kind of deeply-trusted-but-rarely-updated dependencies that stay vulnerable for extended periods.

From an enterprise security perspective, any organization running Kubernetes or Docker in production should treat unpatched container runtimes as a critical finding. The ease of exploitation via malicious images makes this a supply chain security concern as much as a patching one.

The practical lesson for development teams: treat container runtime versions with the same urgency as operating system patches. runc --version should be part of your security baseline checks.

Detection with SAST

While this specific vulnerability is in the runtime rather than application code, SAST tools contribute to detection in two ways:

  1. Dockerfile analysis: Scan WORKDIR instructions for paths under /proc/ or /dev/, file descriptor links such as /proc/self/fd/N, and symbolic-link chains, and flag them for review
  2. IaC scanning: Detect container configurations specifying unusual working directories in Kubernetes Pod specs, Docker Compose files, and Helm charts

Offensive360’s IaC scanner checks for anomalous WORKDIR and workingDir values in Dockerfiles and Kubernetes manifests as part of its container security ruleset.

