PostgreSQL OOM Safety With Strict Overcommit

PostgreSQL is not a normal process when Linux memory pressure gets ugly. A current Hacker News discussion around Ubicloud's write-up on PostgreSQL and the OOM killer is a useful reminder that database memory policy is an SRE reliability control, not just kernel trivia.
What Is Strict Overcommit?
Linux controls virtual memory overcommit with vm.overcommit_memory. The kernel documentation describes three modes:
- 0: heuristic overcommit, the default on many systems.
- 1: always overcommit, where allocations are allowed and failures may arrive later.
- 2: strict overcommit, where the kernel refuses allocations that would exceed CommitLimit.
In strict mode, the kernel compares committed virtual memory, visible as Committed_AS in /proc/meminfo, against CommitLimit. That limit is swap plus either vm.overcommit_kbytes, when it is set, or vm.overcommit_ratio percent of physical RAM excluding huge pages. If an allocation would exceed that budget, it fails early with ENOMEM.
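The limit described above can be sketched as a small shell function. It follows the formula in the kernel's overcommit accounting documentation: a nonzero vm.overcommit_kbytes takes precedence, otherwise the ratio applies to RAM excluding huge pages, and swap is added in both cases. The function name and argument order are illustrative assumptions, and all values are in kB.

```shell
# Sketch: approximate CommitLimit in kB from its inputs.
# Usage: commit_limit_kb MEM_TOTAL_KB HUGETLB_KB SWAP_KB RATIO KBYTES
commit_limit_kb() {
    mem_total_kb=$1
    hugetlb_kb=$2
    swap_kb=$3
    ratio=$4
    kbytes=$5
    if [ "$kbytes" -gt 0 ]; then
        # An absolute budget overrides the ratio.
        allowed=$kbytes
    else
        # The ratio applies to RAM that is not reserved for huge pages.
        allowed=$(( (mem_total_kb - hugetlb_kb) * ratio / 100 ))
    fi
    echo $(( allowed + swap_kb ))
}

# Example: 64 GiB of RAM, no huge pages, no swap, ratio 50.
commit_limit_kb 67108864 0 0 50 0
```

With the default ratio of 50 and no swap, the budget is only half of RAM, which is why strict mode usually needs an explicit ratio or kbytes value before it is practical for a database host.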
For PostgreSQL, early failure is often the safer failure.
Why PostgreSQL Cares
PostgreSQL uses a postmaster process and many backend processes. Those backends share important memory areas such as shared buffers, WAL buffers, locks, and other state. If the Linux OOM killer terminates one backend while it is touching shared memory, PostgreSQL cannot assume the remaining process group is safe.
So PostgreSQL takes the conservative path. The postmaster terminates the remaining backends, which drops active connections and aborts in-flight work, and the server performs crash recovery on restart. That protects data integrity, but it turns one memory victim into a database-wide disruption.
Strict overcommit changes the failure mode. Instead of letting the host promise too much memory and later killing a process, the kernel can reject the allocation. PostgreSQL can report an error to the client, cancel the transaction, and keep the rest of the server alive.
Quick Runbook
Start by measuring the host before changing policy:
grep -E 'MemTotal|CommitLimit|Committed_AS|SwapTotal' /proc/meminfo
sysctl vm.overcommit_memory vm.overcommit_ratio vm.overcommit_kbytes
ps -C postgres -o pid,vsz,rss,cmd --sort=-vsz | head
Then check whether the host is a good candidate. Strict overcommit works best on dedicated PostgreSQL machines with a small, known sidecar set: exporters, backup agents, log shippers, and monitoring daemons. It is riskier on shared nodes where unrelated services can consume the commit budget and cause PostgreSQL allocations to fail.
For a dedicated server, a practical starting point is to use vm.overcommit_memory=2 and an absolute vm.overcommit_kbytes budget. Ubicloud describes a fleet heuristic based on roughly 80 percent of memory plus a fixed sidecar allowance, adjusted for huge pages. Treat that as an input, not a universal constant.
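As a rough illustration of that kind of heuristic, the sketch below derives an absolute budget from MemTotal. The 80 percent figure comes from the description above. The way huge pages are subtracted, the 2 GiB sidecar allowance in the usage lines, and the function name are assumptions for illustration, not Ubicloud's actual formula.

```shell
# Sketch: derive a vm.overcommit_kbytes budget from a meminfo-format file.
# Assumption: huge pages are subtracted before the percentage is taken.
# Usage: overcommit_budget_kb SIDECAR_KB [PERCENT] [MEMINFO_FILE]
overcommit_budget_kb() {
    sidecar_kb=$1
    percent=${2:-80}
    meminfo=${3:-/proc/meminfo}
    total_kb=$(awk '$1 == "MemTotal:" { print $2 }' "$meminfo")
    # "Hugetlb:" is absent on older kernels; treat a missing value as 0.
    huge_kb=$(awk '$1 == "Hugetlb:" { print $2 }' "$meminfo")
    echo $(( (${total_kb:-0} - ${huge_kb:-0}) * percent / 100 + sidecar_kb ))
}

# Print a sysctl fragment for review instead of applying it directly.
# The 2097152 kB (2 GiB) sidecar allowance is a placeholder.
budget=$(overcommit_budget_kb 2097152)
printf 'vm.overcommit_memory = 2\nvm.overcommit_kbytes = %s\n' "$budget"
```

Printing the fragment rather than calling sysctl -w keeps the change reviewable. Note that vm.overcommit_kbytes and vm.overcommit_ratio are mutually exclusive: setting one makes the other read as 0.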
Operational Tips
- Track CommitLimit and Committed_AS with the same seriousness as free memory.
- Alert when Committed_AS / CommitLimit approaches your tested headroom.
- Keep sidecar committed memory visible, especially Go-based agents that reserve large virtual regions.
- Test backup, failover, maintenance jobs, and connection spikes before enabling strict mode on primaries.
- Check kernel versions if Committed_AS grows strangely while RSS and accountable mappings look normal.
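The ratio alert from the list above can be prototyped with a small check like this one. The function names, the default 90 percent threshold, and the exit-code convention are assumptions to adapt to your own monitoring. The optional file argument exists only so the check can run against a saved snapshot.

```shell
# Sketch: report Committed_AS as a whole-number percentage of CommitLimit.
# Usage: commit_usage_pct [MEMINFO_FILE]
commit_usage_pct() {
    awk '
        $1 == "CommitLimit:"  { limit = $2 }
        $1 == "Committed_AS:" { used = $2 }
        END {
            if (limit <= 0) exit 1
            printf "%d\n", used * 100 / limit
        }
    ' "${1:-/proc/meminfo}"
}

# Returns 0 while usage is below the threshold, 1 at or above it,
# and 2 if the values could not be read.
# Usage: check_commit_headroom [THRESHOLD_PCT] [MEMINFO_FILE]
check_commit_headroom() {
    threshold=${1:-90}
    pct=$(commit_usage_pct "${2:-/proc/meminfo}") || return 2
    echo "commit usage: ${pct}%"
    [ "$pct" -lt "$threshold" ]
}
```

On hosts still running in heuristic mode the kernel reports CommitLimit without enforcing it, so the percentage can exceed 100 there. That makes the same check useful for sizing a budget before switching modes.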
The kernel accounting point is not theoretical. Ubicloud found a Linux 6.5 committed-memory accounting bug that inflated Committed_AS over time and caused false allocation failures under strict overcommit. That kind of failure is exactly why rollout should include kernel baselines, canaries, and clear rollback steps.
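One way to fold a kernel baseline into a rollout is a simple release check on each canary. The sketch below only flags the 6.5 series mentioned above. Which exact releases are affected or fixed is not stated here, so treat a match as a prompt to investigate rather than a verdict. The function name is hypothetical.

```shell
# Sketch: flag kernel releases in the 6.5 series for closer inspection.
# Usage: kernel_needs_review [RELEASE]   (defaults to the running kernel)
kernel_needs_review() {
    release=${1:-$(uname -r)}
    case "$release" in
        6.5|6.5.*|6.5-*) return 0 ;;
        *) return 1 ;;
    esac
}

if kernel_needs_review; then
    echo "kernel $(uname -r): 6.5 series, watch Committed_AS drift on a canary"
fi
```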
Conclusion
Strict overcommit is not a magic PostgreSQL tuning knob. It is a way to choose a smaller, earlier failure over a late OOM kill that can restart the whole database. For SRE teams running dedicated PostgreSQL hosts, it belongs in the same review as huge pages, connection limits, sidecars, backups, and failover testing.
At Akmatori, we build AI agents for SRE teams that help investigate alerts, inspect infrastructure, and automate operational workflows. If you want a managed edge and cloud foundation for resilient database platforms, explore Gcore for infrastructure that pairs well with production PostgreSQL.
