29.06.2026

NUMA Drift Debugging for SREs

A fresh Edera deep dive on NUMA is getting attention because it explains a production problem many operators have seen: two identical workloads on the same class of machine, one consistently slower because its memory sits farther from the CPUs running it. On modern multi-socket and chiplet servers, that distance matters.

What Is NUMA Drift?

NUMA means Non-Uniform Memory Access. A server is split into memory nodes, each close to a group of CPUs. Local memory access is fast. Remote access crosses an interconnect, which adds latency and can reduce bandwidth when the system is busy.
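The kernel exposes its own view of that distance in sysfs, and a short sketch can print it per node. The function name and the optional root argument are illustrative choices, not part of any tool; the argument exists only so the function can be pointed at a saved copy of the directory.

```shell
# Print the kernel's relative distance table, one line per NUMA node.
# By convention a value of 10 means "local"; larger values are farther away.
# The optional first argument (an illustrative override) replaces the sysfs root.
numa_distances() {
  local root="${1:-/sys/devices/system/node}"
  local dir
  for dir in "$root"/node[0-9]*; do
    # Skip anything that is not a readable node directory.
    [ -r "$dir/distance" ] || continue
    printf '%s: %s\n' "${dir##*/}" "$(cat "$dir/distance")"
  done
}
```

Calling numa_distances with no arguments on a two-socket host should print one row per node. The numbers are relative weights reported by firmware, not measured latencies.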

NUMA drift happens when a workload's CPU placement and memory placement stop lining up. The Linux kernel documents several memory policy scopes, including system default, task policy, VMA policy, and shared policy. Tools such as numactl expose those policies for operators who need to inspect or control placement.
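One way to see which policy is actually in effect is the second field of each line in a process's numa_maps file, which names the effective policy for that mapping. The sketch below tallies those values; the function name is made up for this example, and it assumes the usual numa_maps layout of address, policy, then key=value fields.

```shell
# Count how many mappings in a numa_maps file fall under each effective policy.
# Reads the file named as the first argument, or stdin when none is given.
numa_policy_counts() {
  awk '{ count[$2]++ } END { for (p in count) print count[p], p }' "$@" |
    sort -rn
}
```

For example, numa_policy_counts /proc/12345/numa_maps (12345 being a placeholder PID) shows whether most mappings are on the default policy or on something explicit such as bind or interleave.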

Why SRE Teams Should Care

  • Tail latency gets worse before average CPU looks suspicious.
  • Databases, JVMs, caches, and AI inference servers can lose throughput without an obvious error.
  • Kubernetes CPU pinning helps less if memory placement is ignored.
  • Virtual machines can inherit bad topology from the host or hypervisor.
  • BIOS node interleaving can make the machine look like a single memory node to Linux, which leaves NUMA-aware tuning with nothing to work with.

For incident response, NUMA issues are frustrating because the symptom looks like generic slowness. The useful clue is locality: which CPUs ran the process, where its pages landed, and whether remote memory traffic increased during the bad window.

Installation

Most Linux distributions package numactl and related utilities:

sudo apt-get install numactl
numactl --hardware

On RPM-based systems:

sudo dnf install numactl
numactl --show

For deeper inspection, also check /proc/<pid>/numa_maps, lscpu, and node counters under /sys/devices/system/node/.
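The per-node counters live in a numastat file inside each node directory. As a rough sketch, the two functions below take a snapshot and then diff two snapshots, keeping only numa_miss and other_node. Both function names are invented for this example. These counters track page allocations that landed off the preferred or local node, not individual memory accesses, so treat a rising delta as a hint rather than proof.

```shell
# Emit "node counter value" triples for every node's numastat file.
# The optional first argument (illustrative) replaces the sysfs root.
numastat_snapshot() {
  local root="${1:-/sys/devices/system/node}"
  local dir
  for dir in "$root"/node[0-9]*; do
    [ -r "$dir/numastat" ] || continue
    awk -v node="${dir##*/}" '{ print node, $1, $2 }' "$dir/numastat"
  done
}

# Subtract an earlier snapshot file from a later one and keep the counters
# that point at non-local allocations.
numastat_delta() {
  awk 'NR == FNR { before[$1 " " $2] = $3; next }
       $2 == "numa_miss" || $2 == "other_node" {
         print $1, $2, $3 - before[$1 " " $2]
       }' "$1" "$2"
}
```

A typical use is to write one snapshot to a file before the bad window, another after it, and pass both files to numastat_delta.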

Practical Debugging Workflow

Start with topology:

lscpu | grep -E 'NUMA|Socket|CPU\(s\)'
numactl --hardware

Then inspect the target process:

# postgres runs several processes; -s returns a single PID to inspect
pid=$(pidof -s postgres)
grep -E 'N[0-9]+=' "/proc/$pid/numa_maps" | head
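Raw numa_maps output is hard to read at a glance, so it can help to total the N<node>=<pages> fields per node. This is a minimal sketch with an invented function name. It sums page counts as they appear, so mappings backed by huge pages are counted in their own page size and the totals are only a rough guide to bytes.

```shell
# Sum the N<node>=<pages> fields of a numa_maps file into per-node totals.
# Reads the file named as the first argument, or stdin when none is given.
numa_pages_by_node() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^N[0-9]+=[0-9]+$/) {
        split($i, kv, "=")
        pages[kv[1]] += kv[2]
      }
  }
  END { for (n in pages) print n, pages[n] }' "$@" |
    sort -k1.2,1n   # order by node number, skipping the leading "N"
}
```

Running numa_pages_by_node /proc/12345/numa_maps (12345 is a placeholder PID) gives one line per node. If the process runs mostly on node 0 but the largest total sits on N1, that is the mismatch this article calls drift.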

If a workload is mostly running on one NUMA node while allocating heavily from another, test a constrained launch in staging:

numactl --cpunodebind=0 --membind=0 ./service

Use --interleave=all carefully. It can smooth worst-case placement, but it also gives up the peak benefit of local memory. For latency-sensitive systems, prefer explicit CPU and memory alignment after measuring.

Operational Tips

Document NUMA topology for large nodes in your runbooks. Capture numactl --hardware in host inventory, especially for database servers, Kubernetes worker pools, and AI inference machines.

In Kubernetes, review CPU Manager, Topology Manager, huge pages, and device plugin behavior together. A pod can have pinned CPUs and still suffer if memory allocation or accelerator placement crosses the wrong boundary.
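A quick way to spot a pod with pinned CPUs but unconstrained memory is to compare the CPU list and the memory-node list the kernel allows for a container process. The sketch below reads both from a /proc status file. It assumes the kubelet enforces placement through cpusets, so that restrictions show up in Cpus_allowed_list and Mems_allowed_list; the function name is made up for this example.

```shell
# Print the CPU and memory-node lists a process is allowed to use,
# as recorded in its /proc status file (or a saved copy of it).
allowed_placement() {
  awk -F':[ \t]*' '
    $1 == "Cpus_allowed_list" { cpus = $2 }
    $1 == "Mems_allowed_list" { mems = $2 }
    END { printf "cpus=%s mems=%s\n", cpus, mems }' "$1"
}
```

On the worker node, allowed_placement /proc/12345/status (placeholder PID) might print a narrow CPU range next to a memory list that still spans every node. That combination is worth a closer look with numa_maps.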

Treat firmware settings as production configuration. Node interleaving, Sub-NUMA Clustering, and AMD Nodes Per Socket can change what the operating system sees. Record those settings before and after hardware refreshes.
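Firmware settings are not directly readable from Linux in a portable way, but a crude heuristic can flag hosts worth checking: more than one socket with only one NUMA node reported. The sketch below parses lscpu output from stdin. The function name is invented, the check assumes English lscpu labels, and a virtual machine can legitimately show this shape, so treat a hit as a prompt to look at the BIOS or hypervisor, not as a diagnosis.

```shell
# Read lscpu output on stdin and flag a multi-socket host that reports
# a single NUMA node, which may indicate node interleaving in firmware.
check_node_interleave() {
  awk -F':[ \t]*' '
    { sub(/^[ \t]+/, "", $1) }            # newer lscpu versions indent labels
    $1 == "Socket(s)"    { sockets = $2 }
    $1 == "NUMA node(s)" { nodes = $2 }
    END {
      if (sockets > 1 && nodes == 1)
        print "suspect: " sockets " sockets but 1 NUMA node"
      else
        print "ok: " sockets " sockets, " nodes " NUMA nodes"
    }'
}
```

Usage is LC_ALL=C lscpu | check_node_interleave, which forces the English labels the function looks for.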

Conclusion

NUMA is not an exotic tuning topic anymore. It is part of how modern servers expose distance between cores, memory, and devices. When a workload is slow only on certain hosts, NUMA locality belongs in the first round of SRE checks.

Akmatori helps SRE teams connect agents to real operations workflows with the right context, tools, and guardrails. If you are building AI-assisted incident response or platform automation, explore Akmatori and deploy it on reliable cloud infrastructure from Gcore.
