18.03.2026

pg_stat_ch: Stream Every PostgreSQL Query to ClickHouse


Quick Reference:

# Clone and build
git clone https://github.com/ClickHouse/pg_stat_ch
cd pg_stat_ch && make && make install

# Enable in postgresql.conf
shared_preload_libraries = 'pg_stat_ch'
pg_stat_ch.clickhouse_host = 'clickhouse.example.com'
pg_stat_ch.clickhouse_database = 'postgres_events'

# Restart PostgreSQL
systemctl restart postgresql

ClickHouse just open-sourced pg_stat_ch, a PostgreSQL extension that captures every query execution as a fixed-size event (~4.6 KB) and streams it to ClickHouse. This gives you APM-level observability for PostgreSQL without the overhead of traditional monitoring.

Why This Matters

Traditional PostgreSQL monitoring with pg_stat_statements gives you aggregated stats. You see total execution time and call counts, but you lose the individual events. When something goes wrong at 2:47 PM, you can't drill into exactly which queries ran during that minute.

pg_stat_ch changes this. Every SELECT, INSERT, UPDATE, DDL statement, and even failed queries become queryable events in ClickHouse. You get:

  • p50 to p99 latency over any time window
  • Top queries by runtime with full context
  • Error patterns by application, user, or database
  • Historical comparison across days or months

How It Works

The architecture is elegantly simple:

PostgreSQL Backend → Ring Buffer → Background Worker → ClickHouse
     (memcpy)         (shared)        (1s flush)      (native protocol)

  1. Hot path is fast: Each query completion triggers a memcpy into a shared-memory ring buffer. No network calls, no blocking.

  2. Background worker handles export: Once per second, a worker drains up to 10,000 events, compresses them with LZ4, and sends to ClickHouse via native binary protocol.

  3. ClickHouse does the heavy lifting: Raw events land in events_raw. Materialized views pre-aggregate into dashboards you can query immediately.
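The pre-aggregation step follows a standard ClickHouse pattern: a materialized view that transforms each raw event as it lands. The sketch below is an assumption about the shipped DDL, not a copy of it; the table and column names (`events_raw`, `query_stats_5m`, `datname`, `total_exec_time`) are borrowed from the queries later in this post.

```sql
-- Hypothetical sketch of the pattern: route each raw event into a
-- 5-minute rollup table as it arrives. The real DDL ships with pg_stat_ch.
CREATE MATERIALIZED VIEW query_stats_5m_mv TO query_stats_5m AS
SELECT
    toStartOfFiveMinutes(event_time) AS event_time,
    datname,
    query_id,
    total_exec_time
FROM events_raw;
```

Because the view is populated on insert, dashboards querying `query_stats_5m` stay fast regardless of how large `events_raw` grows.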

Event Schema

Each event captures 45 fields:

Category  Fields
Timing    execution time, planning time, lock wait time
I/O       shared/local buffer hits and reads, temp blocks
WAL       records, FPI, bytes
CPU       user time, system time (via getrusage)
JIT       functions, generation/optimization/emission time
Context   database, user, application_name, client_addr
Query     query_id fingerprint, truncated text (2KB default)
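A quick way to see these categories side by side is to pull a single raw event. The column names below are the ones used elsewhere in this post; treat any others as assumptions about the schema.

```sql
-- Inspect one recent raw event across the field categories above.
SELECT
    event_time,
    datname,            -- Context
    application_name,   -- Context
    query_id,           -- Query fingerprint
    total_exec_time,    -- Timing
    query_text          -- Query text (truncated at 2KB by default)
FROM events_raw
ORDER BY event_time DESC
LIMIT 1;
```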

Pre-Built Materialized Views

pg_stat_ch ships with four materialized views in ClickHouse:

-- Rolling 1-hour window for real-time monitoring
SELECT * FROM events_recent_1h
WHERE query_id = 12345678;

-- 5-minute buckets with percentiles
SELECT
    toStartOfFiveMinutes(event_time) AS bucket,
    quantilesTDigest(0.50, 0.95, 0.99)(total_exec_time) AS latency_pcts
FROM query_stats_5m
WHERE datname = 'production'
GROUP BY bucket;

-- Per-database/app/user load attribution
SELECT
    datname,
    application_name,
    sum(calls) AS total_queries,
    sum(total_exec_time) AS total_time_ms
FROM db_app_user_1m
WHERE event_time > now() - INTERVAL 1 HOUR
GROUP BY datname, application_name
ORDER BY total_time_ms DESC;

-- Error analysis with full context
SELECT
    error_message,
    count(*) AS occurrences,
    groupArray(10)(query_text) AS sample_queries
FROM errors_recent
WHERE event_time > now() - INTERVAL 1 DAY
GROUP BY error_message
ORDER BY occurrences DESC;

Installation

Requirements: PostgreSQL 16, 17, or 18.

# Build from source
git clone https://github.com/ClickHouse/pg_stat_ch
cd pg_stat_ch
make
sudo make install

# Or use Docker for quick testing
docker-compose up -d

Configure in postgresql.conf:

shared_preload_libraries = 'pg_stat_ch'

# ClickHouse connection
pg_stat_ch.clickhouse_host = 'clickhouse.example.com'
pg_stat_ch.clickhouse_port = 9000
pg_stat_ch.clickhouse_database = 'postgres_events'
pg_stat_ch.clickhouse_user = 'default'
pg_stat_ch.clickhouse_password = 'secret'

# Tuning
pg_stat_ch.queue_capacity = 65536  # events in ring buffer
pg_stat_ch.flush_interval = 1000   # ms between flushes
pg_stat_ch.batch_size = 10000      # max events per flush
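After the restart, you can confirm the extension picked up your settings from psql. `pg_settings` is standard PostgreSQL, and custom GUCs from a loaded extension appear there:

```sql
-- Verify pg_stat_ch settings took effect after restart.
SELECT name, setting, unit
FROM pg_settings
WHERE name LIKE 'pg_stat_ch.%'
ORDER BY name;
```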

Restart PostgreSQL:

systemctl restart postgresql

Memory Footprint

The ring buffer uses predictable memory:

queue_capacity × event_size = total memory
65,536 × 4.6 KB = ~294 MB (default)

You can tune this based on your query rate and acceptable memory overhead.
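The sizing math is easy to script when planning a capacity change, assuming the ~4.6 KB event size quoted above:

```shell
# Estimate ring-buffer memory for a given queue_capacity,
# assuming ~4.6 KB per event as quoted in this post.
queue_capacity=65536
awk -v n="$queue_capacity" -v kb=4.6 \
    'BEGIN { printf "ring buffer: %.0f MB\n", n * kb / 1024 }'
# prints "ring buffer: 294 MB"
```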

Design Decisions for SREs

No Back-Pressure

If ClickHouse slows down or goes offline, PostgreSQL keeps running at full speed. Events are dropped rather than queued indefinitely.

Why? For an OLTP system doing 50K QPS, back-pressure that adds even 10ms stalls would push p99 from 5ms to 15ms. That breaks SLOs. Observability should observe, not obstruct.

The dropped counter is visible via pg_stat_ch_stats() so you can monitor overflow.
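A monitoring check can alert on that counter. The function name comes from the post; the specific column names in its result set are assumptions about its output.

```sql
-- See the extension's internal counters, including dropped events.
SELECT * FROM pg_stat_ch_stats();

-- A monitoring probe might track just the overflow counter
-- (column name "dropped" is an assumption):
-- SELECT dropped FROM pg_stat_ch_stats();
```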

Fixed-Size Events

Query texts longer than 2KB get truncated. You still get the query_id fingerprint and all 45 metrics. For full text, look up by query_id in pg_stat_statements.
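That lookup is a single query. `queryid` and `query` are standard pg_stat_statements columns; matching assumes both extensions use PostgreSQL's built-in query fingerprint.

```sql
-- Recover the full statement text for a fingerprint seen in ClickHouse.
SELECT query
FROM pg_stat_statements
WHERE queryid = 12345678;
```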

The tradeoff is predictable memory usage. Variable-length events would require dynamic allocation in the hot path.

Cache-Line Aligned Ring Buffer

The ring buffer separates producer-hot and consumer-hot fields onto different cache lines. This eliminates false sharing when many backends write concurrently.

Per-Transaction Batching

Under high concurrency, backends batch events locally and flush once per transaction. This reduces lock acquisitions from ~150K/s to ~30K/s on TPC-B workloads.

Overhead

The extension adds overhead primarily from:

  1. memcpy into ring buffer (fast)
  2. LWLock acquisition (minimized via batching)
  3. Name resolution (get_database_name, GetUserNameFromId)
  4. getrusage() for CPU timing

On typical OLTP workloads, expect 1-3% overhead. The team is still hardening the extension for production, but is actively driving toward deployment in ClickHouse's managed Postgres offering.

Comparison with Alternatives

Feature                pg_stat_statements  pg_stat_monitor  pg_stat_ch
Individual events      No                  No               Yes
External storage       No                  No               Yes (ClickHouse)
Historical drill-down  Limited             Limited          Full
Query plans            No                  Yes              Not yet
Overhead               Low                 Medium           Low-Medium
Failed queries         No                  Yes              Yes

pg_stat_ch complements rather than replaces pg_stat_statements. Use both: statements for quick aggregates, pg_stat_ch for deep analysis.

Use Cases

Incident Investigation

"What changed between 2 PM and 3 PM?"

SELECT
    toStartOfMinute(event_time) AS minute,
    count(*) AS queries,
    avg(total_exec_time) AS avg_latency,
    quantile(0.99)(total_exec_time) AS p99
FROM events_raw
WHERE event_time BETWEEN '2026-03-18 14:00:00' AND '2026-03-18 15:00:00'
GROUP BY minute
ORDER BY minute;

Regression Detection

Compare query performance before and after a deployment:

WITH before AS (
    SELECT query_id, avg(total_exec_time) AS avg_time
    FROM events_raw
    WHERE event_time BETWEEN '2026-03-17 00:00:00' AND '2026-03-17 23:59:59'
    GROUP BY query_id
),
after AS (
    SELECT query_id, avg(total_exec_time) AS avg_time
    FROM events_raw
    WHERE event_time BETWEEN '2026-03-18 00:00:00' AND '2026-03-18 23:59:59'
    GROUP BY query_id
)
SELECT
    a.query_id,
    b.avg_time AS before_ms,
    a.avg_time AS after_ms,
    (a.avg_time - b.avg_time) / b.avg_time * 100 AS pct_change
FROM after a
JOIN before b ON a.query_id = b.query_id
WHERE a.avg_time > b.avg_time * 1.5  -- 50%+ regression
ORDER BY pct_change DESC
LIMIT 20;

Load Attribution

Who's hammering the database?

SELECT
    application_name,
    usename,
    client_addr,
    count(*) AS queries,
    sum(total_exec_time) AS total_time_ms
FROM events_raw
WHERE event_time > now() - INTERVAL 1 HOUR
GROUP BY application_name, usename, client_addr
ORDER BY total_time_ms DESC
LIMIT 20;

The Unified Stack Vision

pg_stat_ch is part of ClickHouse's broader vision: PostgreSQL for OLTP, ClickHouse for analytics and observability. Your transactional database stays focused on transactions. Your analytical database handles the heavy lifting of storing and querying billions of events.

This pattern works because:

  1. PostgreSQL is excellent at ACID transactions
  2. ClickHouse is excellent at columnar analytics
  3. Streaming events between them keeps both doing what they're best at

Conclusion

pg_stat_ch brings event-level observability to PostgreSQL without the overhead of traditional APM agents. If you run PostgreSQL at scale and want to understand exactly what your database is doing, this is worth trying.

The extension is Apache 2.0 licensed and actively developed. Query plans and longer query text capture are on the roadmap.


Running PostgreSQL in production? Akmatori AI agents can analyze your pg_stat_ch events to automatically detect performance regressions and suggest optimizations.

Automate incident response and prevent on-call burnout with AI-driven agents!