pg_stat_ch: Stream Every PostgreSQL Query to ClickHouse

Quick Reference:

```bash
# Clone and build
git clone https://github.com/ClickHouse/pg_stat_ch
cd pg_stat_ch && make && make install
```

```ini
# Enable in postgresql.conf
shared_preload_libraries = 'pg_stat_ch'
pg_stat_ch.clickhouse_host = 'clickhouse.example.com'
pg_stat_ch.clickhouse_database = 'postgres_events'
```

```bash
# Restart PostgreSQL
systemctl restart postgresql
```
ClickHouse just open-sourced pg_stat_ch, a PostgreSQL extension that captures every query execution as a fixed-size event (~4.6KB) and streams it to ClickHouse. This gives you APM-level observability for PostgreSQL without the overhead of traditional monitoring.
Why This Matters
Traditional PostgreSQL monitoring with pg_stat_statements gives you aggregated stats. You see total execution time and call counts, but you lose the individual events. When something goes wrong at 2:47 PM, you can't drill into exactly which queries ran during that minute.
pg_stat_ch changes this. Every SELECT, INSERT, UPDATE, DDL statement, and even failed queries become queryable events in ClickHouse. You get:
- p50 to p99 latency over any time window
- Top queries by runtime with full context
- Error patterns by application, user, or database
- Historical comparison across days or months
How It Works
The architecture is elegantly simple:
```
PostgreSQL Backend → Ring Buffer → Background Worker → ClickHouse
     (memcpy)         (shared)        (1s flush)      (native protocol)
```
- Hot path is fast: each query completion triggers a `memcpy` into a shared-memory ring buffer. No network calls, no blocking.
- Background worker handles export: once per second, a worker drains up to 10,000 events, compresses them with LZ4, and sends them to ClickHouse via the native binary protocol.
- ClickHouse does the heavy lifting: raw events land in `events_raw`. Materialized views pre-aggregate the data into dashboards you can query immediately.
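The drop-on-full ring buffer at the core of this pipeline can be sketched in a few lines. This is an illustrative Python model, not the extension's actual shared-memory C implementation: producers never block, and a background consumer drains bounded batches.

```python
from collections import deque

class RingBuffer:
    """Illustrative model of a drop-on-full event queue (not the real C code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.dropped = 0  # the real extension exposes this via pg_stat_ch_stats()

    def put(self, event):
        """Hot path: never blocks. Drops the event if the buffer is full."""
        if len(self.events) >= self.capacity:
            self.dropped += 1
            return False
        self.events.append(event)
        return True

    def drain(self, batch_size):
        """Background worker: take up to batch_size events for export."""
        batch = []
        while self.events and len(batch) < batch_size:
            batch.append(self.events.popleft())
        return batch

buf = RingBuffer(capacity=4)
for i in range(6):               # 6 events into a 4-slot buffer
    buf.put(i)
print(buf.dropped)               # → 2: two events dropped, no producer ever stalled
print(buf.drain(batch_size=10))  # → [0, 1, 2, 3]: worker exports what fit
```

The key property this models is the no-back-pressure design discussed below: a full buffer costs the producer a counter increment, never a wait.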
Event Schema
Each event captures 45 fields:
| Category | Fields |
|---|---|
| Timing | execution time, planning time, lock wait time |
| I/O | shared/local buffer hits and reads, temp blocks |
| WAL | records, FPI, bytes |
| CPU | user time, system time (via getrusage) |
| JIT | functions, generation/optimization/emission time |
| Context | database, user, application_name, client_addr |
| Query | query_id fingerprint, truncated text (2KB default) |
Pre-Built Materialized Views
pg_stat_ch ships with four materialized views in ClickHouse:
```sql
-- Rolling 1-hour window for real-time monitoring
SELECT * FROM events_recent_1h
WHERE query_id = 12345678;

-- 5-minute buckets with percentiles
SELECT
    toStartOfFiveMinutes(event_time) AS bucket,
    quantilesTDigest(0.50, 0.95, 0.99)(total_exec_time) AS latency_pcts
FROM query_stats_5m
WHERE datname = 'production'
GROUP BY bucket;

-- Per-database/app/user load attribution
SELECT
    datname,
    application_name,
    sum(calls) AS total_queries,
    sum(total_exec_time) AS total_time_ms
FROM db_app_user_1m
WHERE event_time > now() - INTERVAL 1 HOUR
GROUP BY datname, application_name
ORDER BY total_time_ms DESC;

-- Error analysis with full context
SELECT
    error_message,
    count(*) AS occurrences,
    groupArray(10)(query_text) AS sample_queries
FROM errors_recent
WHERE event_time > now() - INTERVAL 1 DAY
GROUP BY error_message
ORDER BY occurrences DESC;
```
Installation
Requirements: PostgreSQL 16, 17, or 18.
```bash
# Build from source
git clone https://github.com/ClickHouse/pg_stat_ch
cd pg_stat_ch
make
sudo make install

# Or use Docker for quick testing
docker-compose up -d
```

Configure in postgresql.conf:

```ini
shared_preload_libraries = 'pg_stat_ch'

# ClickHouse connection
pg_stat_ch.clickhouse_host = 'clickhouse.example.com'
pg_stat_ch.clickhouse_port = 9000
pg_stat_ch.clickhouse_database = 'postgres_events'
pg_stat_ch.clickhouse_user = 'default'
pg_stat_ch.clickhouse_password = 'secret'

# Tuning
pg_stat_ch.queue_capacity = 65536   # events in ring buffer
pg_stat_ch.flush_interval = 1000    # ms between flushes
pg_stat_ch.batch_size = 10000       # max events per flush
```

Restart PostgreSQL:

```bash
systemctl restart postgresql
```
Memory Footprint
The ring buffer uses predictable memory:
```
queue_capacity × event_size = total memory
65,536 × 4.6 KB ≈ 294 MB (default)
```
You can tune this based on your query rate and acceptable memory overhead.
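A quick sizing helper makes that tuning concrete. The 4.6 KB event size comes from the formula above; the helper itself is ours, not part of the extension:

```python
def ring_buffer_mb(queue_capacity, event_size_kb=4.6):
    """Shared-memory ring buffer footprint in MB (capacity × per-event size)."""
    return queue_capacity * event_size_kb / 1024

print(round(ring_buffer_mb(65_536)))  # → 294 (the default configuration)
print(round(ring_buffer_mb(16_384)))  # → 74 (quarter-size buffer for small hosts)
```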
Design Decisions for SREs
No Back-Pressure
If ClickHouse slows down or goes offline, PostgreSQL keeps running at full speed. Events are dropped rather than queued indefinitely.
Why? For an OLTP system doing 50K QPS, back-pressure adding 10ms stalls would push p99 from 5ms to 15ms. That breaks SLOs. Observability should observe, not obstruct.
The dropped counter is visible via pg_stat_ch_stats() so you can monitor overflow.
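One way to turn that counter into an alertable signal is to sample it periodically and compute the drop rate between samples. A sketch of the arithmetic — the `dropped`/`captured` field names here are illustrative assumptions, not the documented output columns of pg_stat_ch_stats():

```python
def drop_rate(prev_dropped, curr_dropped, prev_captured, curr_captured):
    """Fraction of events dropped between two monitoring samples."""
    dropped = curr_dropped - prev_dropped
    total = dropped + (curr_captured - prev_captured)
    return dropped / total if total else 0.0

# e.g. 500 drops alongside 99,500 captured events since the last sample
rate = drop_rate(1_000, 1_500, 200_000, 299_500)
print(f"{rate:.2%}")  # → 0.50% — a sustained rise suggests raising queue_capacity
```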
Fixed-Size Events
Query texts longer than 2KB get truncated. You still get the query_id fingerprint and all 45 metrics. For full text, look up by query_id in pg_stat_statements.
The tradeoff is predictable memory usage. Variable-length events would require dynamic allocation in the hot path.
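The fixed-size layout can be illustrated with a tiny packing sketch: whatever the query length, the on-buffer record occupies the same number of bytes. The field sizes below are illustrative, not the extension's real 45-field layout; only the 2 KB text budget comes from the docs above:

```python
import struct

TEXT_LIMIT = 2048  # default query-text budget per event

def pack_event(query_id, query_text):
    """Pack into a fixed-size record: 8-byte id + fixed 2 KB text field."""
    text = query_text.encode()[:TEXT_LIMIT]  # truncate, never grow
    text = text.ljust(TEXT_LIMIT, b"\x00")   # pad short texts to full width
    return struct.pack(f"<q{TEXT_LIMIT}s", query_id, text)

short = pack_event(42, "SELECT 1")
long_ = pack_event(43, "SELECT " + ", ".join(f"col{i}" for i in range(1000)))
print(len(short), len(long_))  # → 2056 2056: record size never varies
```

Because every record is the same size, the producer can reserve a slot with pointer arithmetic instead of a heap allocation.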
Cache-Line Aligned Ring Buffer
The ring buffer separates producer-hot and consumer-hot fields onto different cache lines. This eliminates false sharing when many backends write concurrently.
Per-Transaction Batching
Under high concurrency, backends batch events locally and flush once per transaction. This reduces lock acquisitions from ~150K/s to ~30K/s on TPC-B workloads.
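The effect of batching on lock traffic is easy to model: taking the ring-buffer lock once per flush instead of once per event divides acquisitions by the average batch size. The batch size of 5 below is an illustrative value chosen to reproduce the ~150K/s → ~30K/s figure:

```python
def lock_acquisitions_per_sec(events_per_sec, avg_batch_size):
    """One lock acquisition per flush rather than one per event."""
    return events_per_sec // avg_batch_size

print(lock_acquisitions_per_sec(150_000, 1))  # → 150000 (no batching)
print(lock_acquisitions_per_sec(150_000, 5))  # → 30000 (~5 events per txn)
```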
Overhead
The extension adds overhead primarily from:
- `memcpy` into the ring buffer (fast)
- LWLock acquisition (minimized via per-transaction batching)
- Name resolution (`get_database_name`, `GetUserNameFromId`)
- `getrusage()` for CPU timing
On typical OLTP workloads, expect 1-3% overhead. The team is still hardening the extension for production, with deployment in ClickHouse's managed Postgres offering as the stated goal.
Comparison with Alternatives
| Feature | pg_stat_statements | pg_stat_monitor | pg_stat_ch |
|---|---|---|---|
| Individual events | No | No | Yes |
| External storage | No | No | Yes (ClickHouse) |
| Historical drill-down | Limited | Limited | Full |
| Query plans | No | Yes | Not yet |
| Overhead | Low | Medium | Low-Medium |
| Failed queries | No | Yes | Yes |
pg_stat_ch complements rather than replaces pg_stat_statements. Use both: statements for quick aggregates, pg_stat_ch for deep analysis.
Use Cases
Incident Investigation
"What changed between 2 PM and 3 PM?"
```sql
SELECT
    toStartOfMinute(event_time) AS minute,
    count(*) AS queries,
    avg(total_exec_time) AS avg_latency,
    quantile(0.99)(total_exec_time) AS p99
FROM events_raw
WHERE event_time BETWEEN '2026-03-18 14:00:00' AND '2026-03-18 15:00:00'
GROUP BY minute
ORDER BY minute;
```
Regression Detection
Compare query performance before and after a deployment:
```sql
WITH before AS (
    SELECT query_id, avg(total_exec_time) AS avg_time
    FROM events_raw
    WHERE event_time BETWEEN '2026-03-17 00:00:00' AND '2026-03-17 23:59:59'
    GROUP BY query_id
),
after AS (
    SELECT query_id, avg(total_exec_time) AS avg_time
    FROM events_raw
    WHERE event_time BETWEEN '2026-03-18 00:00:00' AND '2026-03-18 23:59:59'
    GROUP BY query_id
)
SELECT
    a.query_id,
    b.avg_time AS before_ms,
    a.avg_time AS after_ms,
    (a.avg_time - b.avg_time) / b.avg_time * 100 AS pct_change
FROM after a
JOIN before b ON a.query_id = b.query_id
WHERE a.avg_time > b.avg_time * 1.5  -- 50%+ regression
ORDER BY pct_change DESC
LIMIT 20;
```
Load Attribution
Who's hammering the database?
```sql
SELECT
    application_name,
    usename,
    client_addr,
    count(*) AS queries,
    sum(total_exec_time) AS total_time_ms
FROM events_raw
WHERE event_time > now() - INTERVAL 1 HOUR
GROUP BY application_name, usename, client_addr
ORDER BY total_time_ms DESC
LIMIT 20;
```
The Unified Stack Vision
pg_stat_ch is part of ClickHouse's broader vision: PostgreSQL for OLTP, ClickHouse for analytics and observability. Your transactional database stays focused on transactions. Your analytical database handles the heavy lifting of storing and querying billions of events.
This pattern works because:
- PostgreSQL is excellent at ACID transactions
- ClickHouse is excellent at columnar analytics
- Streaming events between them keeps both doing what they're best at
Conclusion
pg_stat_ch brings event-level observability to PostgreSQL without the overhead of traditional APM agents. If you run PostgreSQL at scale and want to understand exactly what your database is doing, this is worth trying.
The extension is Apache 2.0 licensed and actively developed. Query plans and longer query text capture are on the roadmap.
Running PostgreSQL in production? Akmatori AI agents can analyze your pg_stat_ch events to automatically detect performance regressions and suggest optimizations.
