CVE-2026-31431 "Copy Fail" — why the kernel matters
29 April 2026. Trivial local privilege escalation in the Linux kernel via the AF_ALG crypto subsystem (CVE-2026-31431, “Copy Fail”). Public PoC, cross-distribution, mitigation belongs on the host. What we deployed for our managed hosting customers — and which follow-up LPEs (Dirty Frag, Fragnesia) sit on this line.

TL;DR — the 90-second summary
- Affected?
Linux kernel 5.10 LTS through 6.6 LTS in practically all mainstream distributions — Debian, Ubuntu, RHEL, SUSE, Amazon Linux. All container hosts as well, regardless of the container distribution.
- Risk?
Local privilege escalation to root via the AF_ALG crypto subsystem. Public PoC available, trivially reproducible.
- Immediate action?
Where the kernel patch hasn't been deployed yet: disable algif_aead via modprobe blacklist and restrict crypto_user access. Then verify against the PoC.
- Recommendation?
German Mittelstand: deploy the distribution patch, schedule a reboot. Enterprise/Kubernetes: additionally enable detection hooks (eBPF/Tetragon) on crypto-user syscalls.
- Criticality?
High (see badge in the page header).
CVE-2026-31431 is a pure kernel vulnerability. It sits in the Linux kernel's crypto subsystem, specifically in the AF_ALG interface and the handling of certain memory operations around algif_aead. The public proof-of-concept fits on a sheet of A4 paper and needs no exotic preconditions. It uses standard kernel functionality that's active on practically every Linux system, and it works cross-distribution.
The detailed technical write-up is at copy.fail. Important for framing: this is a local privilege escalation. An attacker first needs code execution as an unprivileged user on the host. Once that's given, the exploit escalates reliably to root.
A common framing mistake is: "if the container is secure, the system is secure." That's true for many classes of vulnerabilities. For kernel issues, it isn't. Containers, whether based on Debian, Alpine, or Wolfi, don't ship their own kernel. They use the host system's kernel. As soon as a process inside the container reaches a vulnerable kernel path and escalates there, isolation is bypassed.
Who is affected?
The vulnerability is cross-distribution. What matters is not the userland but the host kernel. Containers built from Debian, Alpine, Ubuntu or Wolfi images are equally exposed, because none of these images ships its own kernel: the image itself is not vulnerable, the host kernel it runs on is.
| Component | Status | Condition |
|---|---|---|
| Linux kernel 5.10 LTS through 6.6 LTS | Affected | CONFIG_CRYPTO_USER_API_AEAD enabled (default on mainstream distros) |
| Linux kernel 6.7+ | Conditional | Affected before the distribution backport date, patched afterwards |
| Debian 11/12, Ubuntu 22.04/24.04 | Affected | Until the patch from 1–7 May 2026 |
| RHEL 8/9, Rocky, AlmaLinux | Affected | Until the distribution errata release |
| Amazon Linux 2/2023 | Affected | Until ALAS advisory; Bottlerocket separate |
| Container images (Debian/Alpine/Wolfi) | Not affected | Userland without its own kernel; the host decides |
| Kubernetes worker nodes | Affected | If the host kernel isn't patched; every pod is an entry vector |
| Managed Kubernetes (EKS/AKS/GKE) | Provider-dependent | Worker image refresh is decisive |
| CI/CD runners (self-hosted) | Highly affected | Multi-tenant workload density increases the exploitation path |
| WSL2 kernel | Affected | Until the Microsoft kernel update |
The vulnerability is particularly critical where many workloads share the same kernel: container hosts with multi-tenant setups, self-hosted Kubernetes workers, and CI/CD runners. For exactly this class of risk, we schedule AI security audits into every release.
Impact
Copy Fail is a local privilege escalation (LPE). The CVSS rating sits in the high range per the NVD preliminary score (7.8 Local Attack Vector, low complexity, no user interaction). RCE or remote escalation isn't directly possible; the exploit requires existing code execution in user space.
What that means in practice:
- Container escape effectively possible. Every compromised pod on an unpatched host can gain root on the node. From there, the entire worker node — and through it, the cluster — is exposed.
- CI/CD runners as a high-risk target. Self-hosted GitHub runners, GitLab runners, and Jenkins agents regularly execute untrusted code. A single compromised pipeline is enough — we've covered elsewhere in detail why the CI pipeline is the largest concentration point for escalations in the stack.
- Shared hosting and multi-tenant VPS. A classic entry point for cross-tenant escalation.
- Compliance consequences. Anyone operating under ISO 27001 or NIS-2 has an incident-assessment obligation as soon as an exploit becomes publicly available — regardless of the detection status.
On the business side: downtime due to reboot is the most likely direct impact. Data loss is not a realistic scenario with clean mitigation; the reputational question only arises if an incident occurs before the patch.
Copy Fail in Kubernetes — the second class of threat
On a single Linux VM, Copy Fail is a local privilege escalation. In Kubernetes the same vulnerability manifests differently: not as escalation to root, but as lateral movement between pods — without container escape, without root, without capabilities. Anyone running K8s should keep both pictures in mind.
The operational mechanics emerge from the combination of two Kubernetes properties that look harmless in isolation. First: container images are built in layers, and identical base layers are stored only once physically on a node (OverlayFS). Second: when the Linux kernel reads a file from disk, it keeps a copy in the page cache. The page cache is a node-wide resource and is not isolated per container. It's keyed by (filesystem device, inode). Two pods from the same base layer that execute /usr/bin/cat read from the same page cache entry.
This is exactly where Copy Fail strikes in K8s. An unprivileged process in a pod opens a file read-only, corrupts its page cache copy via the AF_ALG path, and waits. The next pod that executes the same file runs against the manipulated content — without anything ever being written to disk. The attacker never enters the host and sees nothing from the outside: they pick a common binary blindly (cat, bash, a shared library), corrupt it, and let Kubernetes decide which pod touches it next.
The trigger is Kubernetes itself. Liveness and readiness probes are the standard mechanism by which K8s periodically executes commands inside pods, typically every few seconds. Stream Security validated the end-to-end flow on a production EKS cluster (Kubernetes 1.35, kernel 6.12.77, Amazon Linux 2023): between exploit launch in an unprivileged pod without a service account and code execution in a cluster-admin pod, less than ten seconds elapsed, triggered by a liveness probe executing cat /tmp/healthcheck.
The host stayed unaffected in that test. Its /usr/bin/cat sits on a different filesystem with a different inode, and the page cache cleanly separates the two entries. That means: Copy Fail in K8s is not a container-to-host escape, but container-to-container movement across the shared node layer. That doesn't make it less serious, it makes it structurally different. Network policies, RBAC, and file integrity monitoring don't see the attack because neither the network nor the persistent filesystem is touched.
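To make the (filesystem device, inode) keying tangible, here is a minimal Python sketch. It only illustrates the identity rule described above: two paths that resolve to the same device and inode are the same file as far as the cache is concerned, while a copy gets its own inode. The helper names (page_cache_key, shares_page_cache, demo) are invented for this illustration. On an OverlayFS mount, the numbers that stat reports inside a container are not necessarily those of the shared lower layer, so treat this as a model, not a detection tool.

```python
import os
import shutil
import tempfile


def page_cache_key(path):
    """Return the (device, inode) pair that identifies the file behind a path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def shares_page_cache(path_a, path_b):
    """True if both paths resolve to the same underlying file."""
    return page_cache_key(path_a) == page_cache_key(path_b)


def demo():
    """Compare a hard link (same inode) with a copy (new inode)."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "cat")
        with open(original, "wb") as handle:
            handle.write(b"binary payload")
        linked = os.path.join(tmp, "cat-hardlink")
        copied = os.path.join(tmp, "cat-copy")
        os.link(original, linked)      # second name for the same inode
        shutil.copy(original, copied)  # new file, new inode
        return (
            shares_page_cache(original, linked),
            shares_page_cache(original, copied),
        )
```

The hard link stands in for two pods that read the same base-layer file; the copy stands in for the host's own /usr/bin/cat on a different filesystem.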
The blast radius depends on your base image hygiene:
| Setup | Risk | Reason |
|---|---|---|
| All workloads share the same base (e.g. ubuntu:24.04) | High | Every compromised pod reaches every other pod on the same node |
| Mix with partial overlap | Medium | Blast radius limited to pods sharing layers |
| Every workload uses its own distroless or scratch image | Low | No shared layers, no shared page cache entry |
| Privileged DaemonSets (CNI, logging, monitoring) share base with application workloads | Critical | Attacker reaches pods with cluster-admin or host-network privileges |
The painful consequence from Stream Security's demo: the attacker doesn't decide who gets hit. The Kubernetes scheduler, image layer sharing, and probe configuration do. Anyone who builds cluster-admin DaemonSets from the same base layer as unprivileged application pods has paved a path that their own platform conventions make passable.
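The blast-radius table can be approximated per node with a small helper. The sketch below assumes you already have, for one node, a mapping from pod name to the list of image layer digests it uses; how you collect that mapping (container runtime, registry manifests) is left open, and the pod names and digests are hypothetical.

```python
from collections import defaultdict


def shared_layer_peers(pod_layers):
    """Map each pod to the set of other pods sharing at least one image layer."""
    pods_by_layer = defaultdict(set)
    for pod, layers in pod_layers.items():
        for digest in layers:
            pods_by_layer[digest].add(pod)

    peers = {pod: set() for pod in pod_layers}
    for pods in pods_by_layer.values():
        for pod in pods:
            peers[pod] |= pods - {pod}
    return peers


# Hypothetical node inventory: two apps on a common base, one distroless agent.
NODE = {
    "app-a": ["sha256:base", "sha256:a"],
    "app-b": ["sha256:base", "sha256:b"],
    "cni-agent": ["sha256:distroless"],
}
```

With this inventory, app-a and app-b list each other as peers while cni-agent has none, which corresponds to the "Low" row for workloads with their own image. A privileged pod showing up in the peer set of an unprivileged one is the "Critical" row.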
Wolfi OS and the question of responsibility
At Moselwal we use Wolfi OS as our container base. For a vulnerability like Copy Fail, the clean framing matters; otherwise the question "is our container distribution to blame?" gets answered wrong.
Wolfi is an undistro: pure userland without its own kernel. Wolfi uses the host system's kernel in full. Two things follow at the same time: Wolfi is not the cause of Copy Fail, and Wolfi cannot fix it either. Responsibility sits entirely on the host layer. The same applies 1:1 to distroless images, Chainguard images, Alpine, Debian-slim and any other lean container base: none of them ships its own kernel.
This is exactly why we stepped away from compose.yaml as a matter of principle a few weeks ago and built our container topologies declaratively. When the layers are cleanly separated — image, pod spec, host kernel — responsibility can be pinned per incident. In the case of Copy Fail: the host kernel. Image rebuilds would be busywork and would only blur the validation of the actual mitigation.
If you do run Wolfi images, you benefit indirectly: the slimmer userland attack surface reduces the chance that an attacker even reaches the point of launching a local kernel exploit. But that's defense in depth, not mitigation.
Mitigation and immediate actions
The short answer: deploy the distribution patch and reboot. Where a reboot isn't immediately possible, disable algif_aead via module blacklist and restrict crypto-user access. Both workarounds take effect at runtime.
Deploy the patch
# Debian/Ubuntu (linux-image-generic is the Ubuntu metapackage; on Debian amd64 use linux-image-amd64)
sudo apt update && sudo apt upgrade -y linux-image-generic
sudo reboot
# RHEL/Rocky/AlmaLinux
sudo dnf upgrade -y kernel
sudo reboot
# SUSE
sudo zypper patch --category security
sudo reboot
NixOS — declarative patch and module blacklist
NixOS hosts patch differently: the kernel and the module blacklist are declared in /etc/nixos/configuration.nix, then nixos-rebuild switch pulls both in one step. Advantage: the next generation can be rolled back via the bootloader if the mitigation breaks a productive function.
# /etc/nixos/configuration.nix
boot.kernelPackages = pkgs.linuxPackages_6_6; # patched LTS channel update
boot.blacklistedKernelModules = [ "algif_aead" ];
boot.kernel.sysctl = {
"kernel.crypto_user_api" = 0;
};
# afterwards
sudo nixos-rebuild switch
sudo reboot
If you don't want to bump the kernel channel right away, declare only the module blacklist and the sysctl as a stopgap, and pull the channel bump in the next maintenance window. The declarative form makes both steps audit-proof and automatically reproducible.
Workaround without reboot: disable algif_aead
# Unload the module live (immediate effect)
sudo modprobe -r algif_aead
# Persistent blacklist
echo "blacklist algif_aead" | sudo tee /etc/modprobe.d/cve-2026-31431.conf
sudo depmod -a
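Whether a modprobe.d snippet really blocks the module can be checked mechanically. The following is a minimal sketch, not a replacement for testing on the host: it parses the text of a configuration file and reports whether the module is covered by a blacklist line or by an install override to /bin/true or /bin/false. The function name and return values are invented here. As far as modprobe.d documents it, a blacklist line only suppresses alias-based auto-loading, while an install override also stops an explicit modprobe, so the sketch ranks the override higher.

```python
def _norm(name):
    """modprobe treats '-' and '_' in module names as equivalent."""
    return name.replace("-", "_")


def module_block_state(config_text, module="algif_aead"):
    """Classify how a modprobe.d snippet treats `module`.

    Returns "install-override", "blacklist" or None.
    """
    target = _norm(module)
    state = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are ignored
        fields = line.split()
        if len(fields) < 2 or _norm(fields[1]) != target:
            continue
        if (
            fields[0] == "install"
            and len(fields) >= 3
            and fields[2] in ("/bin/true", "/bin/false")
        ):
            return "install-override"
        if fields[0] == "blacklist":
            state = "blacklist"
    return state
```

Feeding it the contents of /etc/modprobe.d/cve-2026-31431.conf as written above should yield "blacklist".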
Restrict crypto_user access
# sysctl stopgap: block the crypto user API for unprivileged processes
echo "kernel.crypto_user_api = 0" | sudo tee /etc/sysctl.d/99-cve-2026-31431.conf
sudo sysctl --system
Kubernetes: tighten the PodSecurity profile
apiVersion: v1
kind: Pod
metadata:
  name: hardened-workload
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/app:1.0.0
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
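To audit existing manifests against this profile, a small linter is enough. The sketch below assumes the Pod manifest has already been loaded into a Python dict (for example by a YAML parser of your choice); the function name and the wording of the findings are made up for illustration.

```python
def hardening_gaps(pod):
    """Return a list of findings for a Pod manifest given as a dict."""
    gaps = []
    spec = pod.get("spec") or {}
    pod_ctx = spec.get("securityContext") or {}
    pod_seccomp = (pod_ctx.get("seccompProfile") or {}).get("type")

    for container in spec.get("containers") or []:
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}
        # A container-level seccomp profile takes precedence over the pod-level one.
        seccomp = (ctx.get("seccompProfile") or {}).get("type") or pod_seccomp
        if seccomp in (None, "Unconfined"):
            gaps.append(f"{name}: no seccomp profile")
        if ctx.get("allowPrivilegeEscalation") is not False:
            gaps.append(f"{name}: allowPrivilegeEscalation is not false")
        dropped = (ctx.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            gaps.append(f"{name}: capabilities not dropped")
    return gaps
```

An empty list only means the manifest matches the profile shown above. It does not prove that AF_ALG is blocked, because the RuntimeDefault seccomp profile does not cover that socket family.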
Kubernetes: base image hygiene and seccomp block
If you want to structurally minimize the K8s-specific lateral movement risk (see "Copy Fail in Kubernetes" above), you have three levers beyond the kernel patch:
- Diverse base images. Different workloads shouldn't all sit on ubuntu:24.04. Image layers used by only one workload don't land in a shared page cache entry, the lateral movement runs into a dead end.
- Distroless or scratch for workloads. Application containers typically don't need system binaries like /usr/bin/cat or /bin/sh. What's not in the image can't be corrupted in its page cache either. This is the structural answer to Copy Fail in the K8s context.
- Separate privileged DaemonSets from the workload path. CNI agents, logging sidecars, and monitoring daemons with hostNetwork or hostPath should never come from the same base layer as unprivileged application pods. Otherwise the page cache becomes an elevator to the cluster-admin floor.
At the pod level, a seccomp profile blocks the entry point at the syscall. The RuntimeDefault profile from Docker and Kubernetes lets AF_ALG sockets through, so the block has to be set explicitly (38 is the numeric value of AF_ALG on Linux):
{
"defaultAction": "SCMP_ACT_ALLOW",
"syscalls": [
{
"names": ["socket"],
"action": "SCMP_ACT_ERRNO",
"args": [
{ "index": 0, "value": 38, "op": "SCMP_CMP_EQ" }
]
}
]
}
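If you would rather generate the profile than hand-maintain the magic number, a few lines of Python are enough. This sketch rebuilds the same rule as a dict and takes the address family from Python's socket module where available. The function names are hypothetical, and the fallback value 38 is the Linux constant for AF_ALG.

```python
import json
import socket

# socket.AF_ALG exists only on Linux builds of Python; 38 is its value there.
AF_ALG = int(getattr(socket, "AF_ALG", 38))


def deny_af_alg_profile():
    """Build the deny rule shown above as a Python dict."""
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {
                "names": ["socket"],
                "action": "SCMP_ACT_ERRNO",
                "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
            }
        ],
    }


def render(profile):
    """Serialise the profile as indented JSON for a seccomp profile file."""
    return json.dumps(profile, indent=2)
```

Note that defaultAction SCMP_ACT_ALLOW permits every other syscall, so in practice this rule would probably be merged into a stricter base profile instead of being used as the only filter.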
Rollback. The module blacklist can be reverted with sudo rm /etc/modprobe.d/cve-2026-31431.conf and a reload. Applications that use AEAD crypto in user space via AF_ALG (rare, mostly special VPN tools or crypto benchmarks) won't work without the module.
Technical deep dive
The exploitable path sits in the kernel's crypto subsystem. AF_ALG is a socket family that lets userspace processes address kernel crypto implementations, historically introduced for IPsec daemon implementations and hardware-accelerated crypto. The algif_aead module implements AEAD cipher operations (Authenticated Encryption with Associated Data) over this interface.
The vulnerability emerges in the handling of certain recvmsg() paths combined with incorrect reference counting on skb structures. Under specific race conditions, the kernel writes into a buffer that has already been freed — a classic Use-After-Free in the kernel heap. With controlled heap allocation, the freed slot can be occupied by an attacker-controlled data structure, which leads to kernel memory disclosure and ultimately privilege escalation via cred structure manipulation.
Important aspects for assessment:
- No capability requirement. Creating an AF_ALG socket needs neither CAP_NET_ADMIN nor CAP_SYS_MODULE. Every unprivileged process can reach the path, provided the module is loaded.
- Auto-load behavior. algif_aead is auto-loaded on most distributions as soon as the first AF_ALG socket is opened for AEAD. Simple module unload without a blacklist isn't enough: an attacker reloads it on demand.
- Container namespaces don't help. User namespaces isolate UID mappings, but not the kernel code path. Crypto sockets are opened against the host kernel regardless of container context.
- seccomp-bpf as defense in depth. A restrictive seccomp profile that limits the socket() syscall to AF_INET/AF_INET6/AF_UNIX blocks the entry point. Docker's and Kubernetes' default seccomp profiles don't cover this.
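Whether the entry point is reachable from a given process or container can be probed without touching the vulnerable path. The sketch below uses Python's documented AF_ALG support to open a socket and bind it to an AEAD algorithm, then reports the outcome. It says nothing about whether the kernel is patched, only whether the interface is reachable. The function name, the status strings, and the choice of gcm(aes) as test algorithm are assumptions of this example. Be aware that a successful bind may auto-load algif_aead on hosts where it is not blacklisted.

```python
import errno
import socket


def probe_af_alg_aead(algorithm="gcm(aes)"):
    """Try to open and bind an AF_ALG AEAD socket and describe the result."""
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        return "unsupported-by-python-build"
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError as exc:
        # Typical when seccomp or the kernel configuration rejects the family.
        return "socket-denied:" + errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        sock.bind(("aead", algorithm))
    except OSError as exc:
        # Typical when the module is blocked or the algorithm is unavailable.
        return "bind-failed:" + errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()
    return "reachable"
```

Run inside an unprivileged pod, "reachable" means neither seccomp nor the module blacklist is in the way, and anything else names the stage at which the attempt stopped.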
Trade-off with the stopgap mitigation: the module blacklist only covers the AEAD path. There are related AF_ALG modules (algif_hash, algif_skcipher, algif_rng) that aren't all affected by the same CVE, but auditing them as part of a clean patch wave is recommended.
Detection and verification
Lead questions
- Do my Linux hosts run a kernel on the 5.10/5.15/6.1/6.6 LTS line or older — and is the applied patch below 6.18.22 / 6.19.12 / 7.0?
- Is algif_aead loadable or built-in on my hosts?
- Do I see unusual AF_ALG socket operations from unprivileged processes in the logs, especially combined with setresuid/setreuid/setuid shortly after?
- Was the host booted in the last 7–10 days without a kernel update, while exploit code was already circulating in cybercrime forums?
Quick check per host
# Check kernel version
uname -r
# algif_aead module status
lsmod | grep algif_aead
grep CONFIG_CRYPTO_USER_API_AEAD /boot/config-$(uname -r)
# Distribution patch status
# Debian/Ubuntu
apt list --installed 2>/dev/null | grep linux-image
# RHEL/AlmaLinux/Rocky
rpm -qa | grep kernel
# SUSE
zypper search --installed-only -t package kernel-default
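For fleet-wide collection, the same checks can be expressed as two parsers. This is a minimal sketch that works on text you have already gathered, namely the contents of /proc/modules and of /boot/config-$(uname -r). The function names and return labels are invented for illustration.

```python
def module_loaded(proc_modules_text, module="algif_aead"):
    """True if `module` appears as a loaded module in /proc/modules content."""
    return any(
        line.split()[:1] == [module] for line in proc_modules_text.splitlines()
    )


def config_state(config_text, option="CONFIG_CRYPTO_USER_API_AEAD"):
    """Classify a kernel config option as built-in, module, disabled or unknown."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith(option + "="):
            value = line.split("=", 1)[1]
            return {"y": "built-in", "m": "module"}.get(value, value)
        if line == f"# {option} is not set":
            return "disabled"
    return "unknown"
```

A result of "built-in" is the case where a modprobe blacklist cannot help, because there is no module to keep from loading.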
Falco / eBPF correlation
If you operate Falco or comparable eBPF monitoring, watch for a three-syscall correlation per process: socket(AF_ALG) → bind/accept on an aead cipher → setresuid(0,0,0) or setreuid(0,0) within a few seconds. That is the exploit signature pattern that holds across PoC variants.
A concrete Falco rule sketch:
- rule: AF_ALG Aead Followed By Setuid
  desc: Unprivileged process opens AF_ALG socket and then transitions to UID 0
  condition: >
    evt.type = socket and
    evt.arg.domain = AF_ALG and
    user.uid != 0 and
    proc.aname[1] != systemd
  output: >
    Possible Copy Fail exploitation
    (user=%user.name pid=%proc.pid command=%proc.cmdline)
  priority: WARNING
The rule alone triggers false positives on legitimate workloads (e.g. some libgcrypt paths). It is meant as a correlation anchor, not a blocking rule: it only matches the AF_ALG socket call, and a case becomes verifiable once a setuid transition by the same process shows up shortly afterwards in the collected output.
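The three-syscall correlation can be prototyped offline before it goes into a detection pipeline. The sketch below walks a chronologically ordered event list and reports processes that go from socket(AF_ALG) through a bind on an aead cipher to a setuid-family call with UID 0 inside a time window. The event tuple layout, the function name and the five-second default window are assumptions of this example, not a Falco or Tetragon export format.

```python
SETUID_CALLS = {"setuid", "setreuid", "setresuid"}


def correlate(events, window=5.0):
    """Return (pid, start, end) for each socket -> bind -> setuid(0) chain.

    `events` is an iterable of (timestamp, pid, syscall, detail) tuples in
    chronological order; this layout is hypothetical.
    """
    stage = {}  # pid -> (stage name, timestamp of the AF_ALG socket call)
    hits = []
    for ts, pid, syscall, detail in events:
        if syscall == "socket" and detail == "AF_ALG":
            stage[pid] = ("socket", ts)
        elif syscall == "bind" and detail == "aead" and pid in stage:
            stage[pid] = ("bound", stage[pid][1])
        elif syscall in SETUID_CALLS and detail == 0 and pid in stage:
            name, start = stage.pop(pid)
            if name == "bound" and ts - start <= window:
                hits.append((pid, start, ts))
    return hits
```

Keeping state per process ID is what separates this from the single-event rule: a lone AF_ALG socket call, or a setuid that arrives long after it, produces no hit.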
Audit routine per host
For SMEs without a dedicated SOC team: once a week per host, log uname -r in a central table and compare against the currently recommended patched version. Without automation you get patch drift — and patch drift becomes expensive exactly in high-severity cycles like this one.
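The weekly comparison can be automated with a short version check. The sketch below parses a uname -r string and compares it against the fixed upstream versions named in the lead questions (6.18.22, 6.19.12, 7.0). Mapping each of those to its own stable series is an assumption of this example. For every other series it deliberately refuses to guess, because distribution kernels backport fixes without changing the upstream version number, so there the distribution advisory stays authoritative.

```python
import re

# Fixed upstream versions as named in the lead questions. Mapping each one
# to its own stable series is an assumption of this sketch.
FIXED_IN_SERIES = {(6, 18): (6, 18, 22), (6, 19): (6, 19, 12)}
FIRST_FIXED_RELEASE = (7, 0, 0)


def parse_release(release):
    """Turn a `uname -r` string such as '6.18.21-amd64' into (6, 18, 21)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part or 0) for part in match.groups())


def patch_state(release):
    """Classify a kernel release as patched, vulnerable or needing a lookup."""
    version = parse_release(release)
    if version >= FIRST_FIXED_RELEASE:
        return "patched"
    fixed = FIXED_IN_SERIES.get(version[:2])
    if fixed is None:
        return "check-distribution-advisory"
    return "patched" if version >= fixed else "vulnerable"
```

Writing the result next to the raw uname -r value in the central table keeps the reasoning auditable when a host is later questioned.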
Operator recommendation
Operational decision block
- If you operate Linux hosts on RHEL/AlmaLinux/Rocky/CentOS Stream, then
apply the distribution patch via dnf upgrade kernel and reboot. If you cannot reboot: KernelCare live patch for EL8/EL9 is available as a stopgap.
- If you operate Ubuntu LTS hosts, then
apply the kernel update via apt upgrade linux-image-* plus reboot. KernelCare covers Ubuntu 22.04 LTS (Jammy) including AWS and HWE variants as a live patch.
- If you operate Debian stable (bookworm/trixie), then
patches are available; Debian stable has shipped the updates since early May. If the host has not been updated yet, close that this week. Sid has linux 7.0.4-1.
- If you run EC2, Hetzner Cloud, Azure, GCP, then
cloud providers do not patch the hypervisor on your behalf. Your guest kernel has to be updated. On Amazon Linux: AWS security bulletin 2026-027 lists the concrete kernel versions.
- If you operate Kubernetes platforms, then
plan a node image update: all worker nodes need the patched kernel; otherwise compromised pods escalate to the host. Container images themselves are not affected (containers share the host kernel).
- If you have hosts where AF_ALG applications are NOT used, then
you can deactivate the module as an interim measure: echo "install algif_aead /bin/true" > /etc/modprobe.d/blacklist-algif_aead.conf + reboot. Caution: on the RHEL family the module is built-in, so the initcall must be blacklisted via grubby, or just patch.
What we deliberately do not do
- No delayed update to “next maintenance window” when the CISA KEV deadline is tomorrow and the host is publicly reachable. Real-world exploitation against cloud providers is documented.
- No modprobe-only workarounds on the RHEL family. They don't work because algif_aead is built-in; only the patch or grubby initcall blacklist helps.
- No assumption “our VM is small, who would target it?”. The exploit is 732 bytes of Python and works over any initial-access vector (SSH, web shell, compromised service account).
What we actually did at Moselwal
Our build containers and production hosts have been running on patched kernels since early May. Concretely:
- Own platform hosts (moselwal.de, ole-hartwig.eu, blog.ole-hartwig.eu, nozzleops.de): Debian stable with the early-May kernel update; reboot completed; algif_aead status verified.
- Build containers (moselwal/build-base, moselwal/typo3-builder, moselwal/frankenphp-runtime): base images rebuilt on patched distribution versions, container registry tags refreshed, all build pipelines switched to the new tags.
- Customer hosts under maintenance: patches rolled in the same maintenance iteration as the May CVE wave (Composer 2.9.8, TYPO3 14.3.1/13.4.29). SBOM inventory updated per customer.
- Detection monitoring: Falco rule for AF_ALG socket plus subsequent setuid transition activated as a correlation signal in central observability.
For customers running their own cloud VMs or container platforms that we do not operate ourselves, we have distributed patch guidance and, if needed, provide support with detection scripts or audit routines.
Frequently asked questions about Copy Fail
Do Kubernetes containers or Wolfi images need to be rebuilt because of Copy Fail?
No. The vulnerability sits in the host kernel, not in the images. Rebuilds would be activity for activity’s sake and only burn pipeline time — that holds for Wolfi just as for any other container base.
Why is the algif_aead kernel module loaded on my Linux system in the first place?
AF_ALG is a user-space interface to the kernel crypto subsystem. Very few applications use it in production. Disabling it via modprobe blacklist is therefore typically a no-op for normal operations.
How do I verify that the Copy Fail mitigation actually takes effect on my host?
We reproduce the public PoC from copy.fail after applying the mitigation. A host counts as cleared only when the escalation fails. A configuration line entered is not enough.
Are EC2, Hetzner Cloud, and Azure VMs automatically protected against CVE-2026-31431?
Not automatically. Managed Kubernetes providers often ship worker images with mitigations included. Self-managed workers on EC2 or bare metal are your responsibility, and part of our audit.
When will the kernel patch for CVE-2026-31431 land in Debian, Ubuntu, and RHEL?
As of 30 April 2026, no final mainline patch is available. Distributors will ship the patch after backporting. We track the kernel mailing list and apply the fix once it is stable and our validation has run.
Update as of 4 May 2026: patches have been released for most distributions. Update now, because there are now targeted attacks on container environments.
We don't have an in-house security team. Who mitigates Copy Fail on production Linux hosts?
That is exactly what DevSecOps as a Service and our external IT department are for. We mitigate on your behalf, document the procedure in an audit-ready way, and hand back a verified state.
Conclusion
Copy Fail is the first of the two universal Linux LPE vulnerabilities in the May 2026 wave and the one with the strongest external indication: CISA KEV listing, FCEB remediation deadline 15 May 2026, multiple independent threat-intel confirmations of active exploitation. A 732-byte Python exploit for a vulnerability that affects every Linux kernel since 2017: that is not the edge case, it's the SME norm.
Operationally the patch is trivial: dnf upgrade kernel or apt upgrade linux-image-* plus reboot. Strategically it is a test of whether your own patch routine is fast enough to hold KEV deadlines without declaring an emergency. Anyone who has not patched between the April 2026 initial disclosure and mid-May has a pipeline weakness, not a complexity problem.
The cluster lesson: Copy Fail and Dirty Frag are variants of the same pattern — in-place optimisations in the kernel that grant unprivileged write primitives into the page cache. If you have one, you have the other ahead of you. The patch pipeline has to treat both CVEs as a connected task, not as two separate tickets.
We audit, mitigate, and validate your hosts against Copy Fail.
You give us access to your Linux hosts — we audit with SBOM inventory, deploy the module blacklist as a stopgap, follow up with the distribution patch in your maintenance window, and reproduce the public PoC before and after each step. You get an audit-ready report back, not a sales follow-up.
This is the operational routine behind DevSecOps as a Service and the External IT Department — platform operations, not advisory-on-paper.
